AO scramnet

29aug2004

Description
25nov14: scramnet driver on observer cpu fails.


    Scramnet is a reflective memory system consisting of nodes (sbus, vmebus cards) connected in a ring configuration via fiber optics. Each node contains memory (128kb in our case). When any node writes into its memory, the information is propagated around the ring so it appears in the same memory location in all the other nodes on the ring. Ring transfer rates are 5 to 12mb/sec.
    The data acquisition system at AO uses scramnet to pass information between 3 vme crates (pnt,da,cor) and 2 sun workstations (observer,control). The memory is broken into different blocks. Each block holds status information about a particular device or function. The observer workstation takes the data from scramnet memory and places the data in a shared memory block.
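
The write propagation described above can be pictured with a small toy model. This is only a sketch in python, not the scramnet protocol or driver: the class name Ring is made up, "128kb" is read as 128*1024 bytes, and the order of the nodes on the ring is an assumption (only the node names come from the text).

```python
class Ring:
    """Toy model of a reflective memory ring (not the real scramnet protocol)."""

    def __init__(self, node_names, mem_size=128 * 1024):
        self.names = list(node_names)
        # Every node holds its own private copy of the same address space.
        self.mem = {name: bytearray(mem_size) for name in self.names}

    def write(self, origin, offset, data):
        """Write locally at origin, then pass the update node to node around the ring."""
        if offset < 0 or offset + len(data) > len(self.mem[origin]):
            raise ValueError("write falls outside node memory")
        self.mem[origin][offset:offset + len(data)] = data
        start = self.names.index(origin)
        # Visit each following node once; the update is dropped when it
        # would arrive back at the node that originated it.
        for step in range(1, len(self.names)):
            name = self.names[(start + step) % len(self.names)]
            self.mem[name][offset:offset + len(data)] = data

    def read(self, node, offset, length):
        return bytes(self.mem[node][offset:offset + length])


# Node names are from the text; their order on the ring is assumed.
ring = Ring(["pnt", "da", "cor", "observer", "control"])
ring.write("pnt", 0x100, b"az=123.4")     # made-up offset and payload
print(ring.read("observer", 0x100, 8))    # same bytes, same offset, different node
```

The point of the model is that a reader on any node looks at the same offset the writer used; there is no addressing of individual nodes.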

     The use of  "scramnet" was extended at a later time:

Blocks that originate from reflective memory:

   
Blocks that go through reflective memory
(structs are defined in /home/phil/vw/h)

block                              Struct          DataSrc    Notes
---------------------------------  --------------  ---------  -----
pointing                           pntProgState.h  vme pnt    pointing info, ra, dec, etc.
vertex (az,ch,gr)                  agcProgState.h  vme pntp1  az, dome, ch device info. actual encoders, state, etc.
turret                             ttProgState.h   vme pntp2  rotary floor info
tiedowns                           tieProgState.h  vme pntp3  tiedown info
Downstairs iflo (IF2)              if2ProgState.h  vme da     synths, power, switches for downstairs if/lo
Upstairs iflo (IF1)                if1ProgState.h  vme da     synths, power, switches for upstairs if/lo
Doppler correction (vxWorks only)  dopState.h                 not loaded
pattern procedure info             procState.h                not loaded
planetary radar info               plrdrState.h    pntp3      not yet loaded
sb723 transmitter info             sb723State.h    pntp3      not yet loaded?
tertiary                                                      not used

Blocks that originate from a program

block                               Struct def  DataSrc    Notes
----------------------------------  ----------  ---------  -----
Cima executive info                 execshm.h   observer2  Only runs when cima gui is active. Holds experiment setup info: projid, etc. See multiexec.
alfa monitoring and motor position  alfashm.h   aeroncpu   see alfa_mon
Wapp datataking info                wappshm.h   wapp       see multiwapp

Notes:

   


Old writeup on the structure of reflective memory scramnet used at ao (.ps)

25nov14: scramnet driver on observer cpu fails.

    On 09nov14 around 5am the UPS for the observer computer failed. This caused the observer cpu to reboot.

After reboot, the message log contained the error message:
Nov  9 05:15:02 observer mshmD: program started
Nov  9 05:15:02 observer automountd[135]: server observer1 not responding
Nov  9 05:15:02 observer mshmD: Err error 69992452 calling scrmAoInit. Must be on a scramnet cpu
Nov  9 05:15:02 observer mshmD: Err error 69992452 calling scrmAoInit. Must be on a scramnet cpu
Nov  9 05:15:02 observer mshmD: daemon exiting

About 6 minutes later someone logged in as root, and at 05:53 it looks like the scramnet shared memory daemon was manually restarted:

Nov  9 05:21:43 observer sshd[375]: log: ROOT LOGIN as 'root' from observer2
Nov  9 05:50:59 observer sshd[447]: log: ROOT LOGIN as 'root' from observer2
Nov  9 05:53:03 observer mshmD: program started

Looking back at things, it looks like mshmD did not start the first time because the scramnet board did not initialize properly (I'm not sure why it started the 2nd time).
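
A failed start like this could be spotted by scanning the message log for an mshmD start that is followed by the scrmAoInit error. The sketch below is only an illustration in python and is not part of the AO software: the log lines are the ones quoted above, while the function names mshmd_events and failed_starts are made up for the example.

```python
import re

# Lines copied from the message log quoted above.
LOG = """\
Nov  9 05:15:02 observer mshmD: program started
Nov  9 05:15:02 observer automountd[135]: server observer1 not responding
Nov  9 05:15:02 observer mshmD: Err error 69992452 calling scrmAoInit. Must be on a scramnet cpu
Nov  9 05:15:02 observer mshmD: Err error 69992452 calling scrmAoInit. Must be on a scramnet cpu
Nov  9 05:15:02 observer mshmD: daemon exiting
Nov  9 05:53:03 observer mshmD: program started
"""

# timestamp, host, then the mshmD message; other programs do not match.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) mshmD: (.*)$")


def mshmd_events(text):
    """Return (timestamp, kind) for every mshmD line in the log text."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp, _host, msg = m.groups()
        stamp = " ".join(stamp.split())  # collapse the padded day field
        if msg.startswith("program started"):
            kind = "start"
        elif "calling scrmAoInit" in msg:
            kind = "init_error"
        elif msg.startswith("daemon exiting"):
            kind = "exit"
        else:
            kind = "other"
        events.append((stamp, kind))
    return events


def failed_starts(events):
    """Timestamps of starts that hit an init error before the next start."""
    failed = []
    current = None
    for stamp, kind in events:
        if kind == "start":
            current = stamp
        elif kind == "init_error" and current is not None:
            failed.append(current)
            current = None  # count each start once, even if the error repeats
    return failed


print(failed_starts(mshmd_events(LOG)))
```

With the log above, only the 05:15:02 start is reported; the 05:53:03 start has no error after it.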


    The interim correlator data files have header information that is loaded from the da computer (the if/lo synth frequencies)
and the pointing crate (the telescope position, az, za, etc). This information is passed between the computers on the scramnet
memory/fiber link. The connection topology is:

Looking at the interim cor data files, I see that the if/lo headers (from the da crate) and the pointing hdr (from the pntCrate)
remained constant starting at 6:00am 09nov14. They remained constant until 25nov14 17:00 when I rebooted the observer
computer.
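
Header values that stop changing can be found by looking for the longest stretch of time over which a value stays the same. The following is a minimal sketch in python, assuming the header values have already been read out of the data files into (time, value) pairs. The function name longest_constant_run and the azimuth values are made up; only the 09nov14 06:00 and 25nov14 17:00 times come from the text.

```python
from datetime import datetime


def longest_constant_run(samples):
    """samples: list of (datetime, value) sorted by time.

    Returns (start, end, value) for the longest time span over which
    the value did not change, or None if samples is empty.
    """
    best = None
    i = 0
    while i < len(samples):
        j = i
        # Extend the run while the next sample repeats the same value.
        while j + 1 < len(samples) and samples[j + 1][1] == samples[i][1]:
            j += 1
        span = samples[j][0] - samples[i][0]
        if best is None or span > best[1] - best[0]:
            best = (samples[i][0], samples[j][0], samples[i][1])
        i = j + 1
    return best


# Made-up azimuth header values; the telescope moves, then the value freezes.
samples = [
    (datetime(2014, 11, 9, 4, 0), 181.2),
    (datetime(2014, 11, 9, 5, 0), 200.7),
    (datetime(2014, 11, 9, 6, 0), 215.0),
    (datetime(2014, 11, 12, 0, 0), 215.0),
    (datetime(2014, 11, 25, 17, 0), 215.0),
    (datetime(2014, 11, 25, 18, 0), 90.3),
]
start, end, value = longest_constant_run(samples)
print(start, end, value)
```

A pointing value that is identical for days while observations are running is a strong hint that the header is stale rather than that the telescope stood still.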

    It looks like the problem was that the observer computer was not passing the scramnet packets on to the cor crate. When I rebooted the
observer computer on 25nov14, it started working again.
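
The effect of one node not passing packets on can be illustrated with a toy ring. This is a sketch in python under loud assumptions: the ring order below is hypothetical (chosen only so that observer sits upstream of cor), the function name propagate is made up, and the model assumes the stuck node still receives packets itself, which may not match the real hardware.

```python
def propagate(ring_order, origin, forwarding):
    """Return the nodes that end up seeing a write made at origin.

    ring_order: node names in the order packets travel around the ring.
    forwarding: dict of name -> bool; False means the node receives a
                packet but does not pass it on. Missing names forward.
    """
    seen = [origin]
    start = ring_order.index(origin)
    for step in range(1, len(ring_order)):
        node = ring_order[(start + step) % len(ring_order)]
        seen.append(node)
        if not forwarding.get(node, True):
            break  # the packet stops here; nodes further along never see it
    return seen


# Hypothetical order; the real AO ring order is not given here.
ORDER = ["pnt", "da", "observer", "cor", "control"]

print(propagate(ORDER, "pnt", {}))                   # healthy ring
print(propagate(ORDER, "pnt", {"observer": False}))  # observer not forwarding
print(propagate(ORDER, "da", {"observer": False}))
```

In this model the pnt and da writes never reach cor while observer is not forwarding, so whatever cor reads from its own memory keeps the last values that got through.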

This means that:
It also means that none of the data taken with the wapps or the mock spectrometers were affected:

How the data were affected:

    The setup and running of the experiments is done via ethernet, so all of the experiments were set up and run correctly.
The iflo and pointing monitor data that were loaded into the headers were incorrect.

    Interim correlator spectra:

    sband radar:

Astronomy observations that were affected by the problem:
