Dewar monitoring
oct 2006
Contents:
- Block diagram
- Debugging the dewar monitoring
- Software
  - rcvMNProg - program to control and read dewar monitor
  - Dewar monitoring daily plots (for the web)
  - Monitoring the dewar temperatures in real time
  - The platform ethernet monitor program
- History/problems
Block diagram:
The receiver dewars are outfitted with a monitoring system.
The monitoring system consists of:
- Each dewar has a monitor system. The control/voltage lines from all dewars converge to an AO built multiplexor (on the rotary floor).
- The multiplexor has the lines from all of the dewars coming into it. It also receives ttl mux addresses from an hp34970 (in the rfi box in the right blue cabinet on the rotary floor). The mux decodes the addresses sent from the hp34970, selects one dewar, and one function to read. It then passes the voltage of this reading to an analog input device (a/d converter) on the hp34970.
- The hp34970 is in the rfi box in the right cabinet on the rotary floor. It has a digital i/o module and an a/d module inserted in its slots. The hp34970 receives monitor function requests from a computer (rfip1, downstairs) via gpib. It then selects the address in the mux and reads the analog voltage from the line sent back from the mux. The data is then passed to the rfip1 computer downstairs via gpib/ethernet.
- The program rcvMNProg (source code ~phil/vw/datatk/rcvMon/rcvMNProg.c) controls, reads, and stores the dewar monitoring data. It runs on the rfip1 computer in the rfi crate. The computer is running vxWorks. The program configures the hp34970 and then cycles through all receivers reading all of their outputs. This takes about 23 seconds for 1 pass through all receivers. The data is written to disc (/share/obs4/rcvm/rcvmN).
- Communications connection: The hp34970 is connected via gpib to a national instruments gpibenet device that is in the rfi box with the hp34970. The gpibenet takes gpib as input (from the hp34970) and sends it out via ethernet to the rfip1 computer downstairs. The pieces of equipment used are:
  - gpibenet device. This is a 10/100 mb device. We use it in the 10 mbit mode. It sits in the rfi box on the rotary floor.
  - 10baseT to 10baseFL transceiver. The gpibenet outputs twisted pair. This cable goes to the transceiver that converts this to fiber. This transceiver is also in the rfi box.
  - The fiber from the transceiver connects to the rfip1 single board computer via the platform ethernet.
  - rfip1 computer. This is a motorola single board computer (sbc). It is the 2nd computer in the rfi crate. The rfi crate is the bottom rfi crate in the 19 inch rack that also holds the pnt vme crate. This rack is in the clock room to the left of the door as you enter. The platform ethernet starts here on the ei interface. rfip1 can access ao net via the path: rfibackplaneNetwork -> rficpu -> aoNet.
  - From the transceiver in the rfi box to the rfip1 cpu via the platform ethernet. The network is 10 mbits (set by the repeaters and the rfip1 interface). The ethernet path has:
    - rfip1 connects to rep1 using its ei interface via a fiber patch cable (see rep1 port usage).
    - rep1 port 6 connects to rfip1. rep1 port 5 goes upstairs to rep2 (port11) via the main fiber cable C1 (see cable1 fiber usage).
    - rep2 port12 sends the signal down to the xcvr in the rfi box in the turret room (see rep2 port usage).
Debugging the dewar monitoring:
The dewar monitoring has had troubles in the past. These have mainly been caused by communications problems between the rfip1 computer and the hp34970 device.
The symptoms:
Some symptoms of dewar monitoring problems:
- The displays are not updating. Either rcvMNProg is not running, there are communications problems, the hp34970 has trouble (it may have powered off), or the ao mux is not working.
- The displays work but they are updating very slowly (longer than one minute between updates). This is usually caused by communications problems between rfip1 and the hp34970.
Debugging details:
Try the following when debugging the dewar monitoring. Note that all communications with the rfip1 computer need to be done from a computer that knows how to get there (eg observer2).
- See if rcvMNProg is running on the rfip1 computer: rsh rfip1 i
  This prints a list of the currently running programs on rfip1. You should see:
| NAME | ENTRY | tid | pri | status | pc | sp | errno | delay |
| rcvMNProg | rcvMNProg | ee9b58 | 140 | PEND | 2e748 | ee9460 | d0003 | 0 |
If you don't see rcvMNProg in the list, try starting it:
  - rlogin rfip1
  - rcvMNProgStart
- Print out the debug info from rcvMNProg (the output still needs to be documented). rsh rfip1 rcvMNProgDbg will print:
rsh rfip1 rcvMNProgDbg
progRunning:1 gdDev:0 lastSec:49537 lastErrno:0 lastErrSec:-1 adrDelay:0
StopRequest:0 CurPosProg:Call GetRcvr
out : 49537.0 outV: 1.0 curRcvr:11 curDewar:3 curMuxAdr:4
outfile:/share/obs4/rcvm/rcvmN prcvrI:0xf01790 needReset:0
rcvListLog:rcvrsToLog.dat numRcvsToLog:8
rcvNumsToLog: 2 5 7 8 9 10 11 12
dewAdrToLog : 1 4 6 7 8 5 3 9
tmSndDev:39136 5.529 39136 5.529 39136 103.328 ms (last,min,max)
tmRdDev :49537 87.961 47209 82.496 43841 174.944 ms (last,min,max)
tmIo : 0 0.000 0 999000.000 0 0.000 ms (last,min,max)
tmAdr :49537 7.445 0 0.000 47838 9.550 ms (last,min,max)
tmTot :49537 88.054 47209 82.589 43841 175.038 ms (last,min,max)
tm1Rcvr :49534 1874.628 40120 1862.127 39705 1968.039 ms (last,min,max)
voltsA voltsB curA curB temp
1.304 1.298 0.000 0.000 16K: 9.804 dwrP15: 0.000 ledHemtA: 0 rcv:11
1.023 0.992 0.000 0.000 70K: 0.000 dwrN15: 0.000 ledHemtB: 0 tm:49538
0.000 0.000 0.000 0.000 OMT:15.637 postP15: 0.000 lkShorDisp: 0
The tmxxx lines show when each operation happened (AST seconds from midnight) and how long it took (ms). Each line holds three (time, duration) pairs: the last, the minimum, and the maximum.
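Since the rcvMNProgDbg output is not yet documented, a small parser can help when comparing timings between runs. The following is a minimal Python sketch, assuming the tmxxx lines always have the layout shown in the listing above (name, colon, three time/duration pairs, then "ms"). The names Timing, parse_tm_line and sec_to_hms are hypothetical helpers, not part of rcvMNProg.

```python
import re
from typing import NamedTuple, Optional, Tuple

# One (AST second of day, duration in ms) pair.
Pair = Tuple[int, float]


class Timing(NamedTuple):
    name: str
    last: Pair
    shortest: Pair
    longest: Pair


# Assumed layout: "tmRdDev :49537 87.961 47209 82.496 43841 174.944 ms (last,min,max)"
TM_LINE = re.compile(r"^(tm\w+)\s*:\s*(.*?)\s*ms\b")


def parse_tm_line(line: str) -> Optional[Timing]:
    """Return the timing info from one tmxxx line, or None if it is not one."""
    m = TM_LINE.match(line.strip())
    if m is None:
        return None
    fields = m.group(2).split()
    if len(fields) != 6:
        return None
    pairs = [(int(fields[i]), float(fields[i + 1])) for i in (0, 2, 4)]
    return Timing(m.group(1), *pairs)


def sec_to_hms(sec: int) -> str:
    """Convert seconds from midnight to hh:mm:ss."""
    hours, rem = divmod(sec, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

For the tmRdDev line in the listing, this gives a longest read of 174.944 ms at second 43841, and the last read at second 49537 (13:45:37 AST).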
See if the communications are working:
- rlogin rfip1 (from observer2)
- ping "gpib0" .. you need the quotes
- This will start pinging the gpibenet device in the rfi box upstairs. It will continue running until you enter control-c. The output should look like:
ping "gpib0"
PING gpib0 (192.160.175.10): 56 data bytes
64 bytes from gpib0 (192.160.175.10): icmp_seq=0. time=0. ms
64 bytes from gpib0 (192.160.175.10): icmp_seq=1. time=0. ms
64 bytes from gpib0 (192.160.175.10): icmp_seq=2. time=0. ms
64 bytes from gpib0 (192.160.175.10): icmp_seq=3. time=0. ms
64 bytes from gpib0 (192.160.175.10): icmp_seq=4. time=0. ms
ctrl-c
.. walkback from ctrl-c printed...
----gpib0 PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss
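If the ping output is captured to a file or a string, the pass/fail decision can be automated. This is a minimal Python sketch, assuming the statistics line always has the wording shown in the listing above. The function name ping_ok and the max_loss_pct parameter are hypothetical.

```python
import re

# Matches the summary line printed after ctrl-c, e.g.
# "5 packets transmitted, 5 packets received, 0% packet loss"
STATS = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, (\d+)% packet loss"
)


def ping_ok(output: str, max_loss_pct: int = 0) -> bool:
    """Return True if the ping summary shows replies within the allowed loss."""
    m = STATS.search(output)
    if m is None:
        # No summary line at all: treat as a failed test.
        return False
    sent, received, loss = (int(g) for g in m.groups())
    return sent > 0 and received > 0 and loss <= max_loss_pct
```

A missing summary line is treated as a failure, since that usually means the ping never ran to completion.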
If the ping test fails, you can do a couple of things:
- Get the dell laptop used for the tiedowns and turret. It should have a name of platform1 with the correct ip address (see correct ip info).
- Plug the fiber of the laptop into various parts of the fiber chain and then:
  - rlogin rfip1
  - ping "platform1"
  You should get the same listing as the ping "gpib0" above. A good place to start is the output of rep1 (port 5). You can then move up or down the path until you find where it starts working or stops working.
- Take the gpibenet and the xcvr down from the rfi box in the dome and use that as your ping probe. You should use ping "gpib0" for the ping command.
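Moving the probe up or down the path can be organized as a bisection, which needs fewer trips than checking every point in order. The sketch below is in Python and is only an illustration: the probe point names are taken from the path described in the block diagram section, the probe callback is hypothetical (in practice it is a person plugging in the laptop and reporting whether the ping worked), and it assumes there is a single break so that every point beyond it also fails.

```python
from typing import Callable, Optional, Sequence

# Probe points ordered from rfip1 outward (names follow the path description).
PATH = [
    "rep1 port 5 (output)",
    "main fiber cable C1 (upstairs end)",
    "rep2 port 11 (input)",
    "rep2 port 12 (output)",
    "xcvr in the rfi box",
]


def first_failing_point(
    points: Sequence[str], probe: Callable[[str], bool]
) -> Optional[str]:
    """Bisect the ordered probe points and return the first one where ping fails.

    probe(point) must return True if ping works with the probe at that point.
    Returns None if ping works at every point.
    """
    lo, hi = 0, len(points)
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(points[mid]):
            lo = mid + 1  # good up to here, the fault is further out
        else:
            hi = mid  # the fault is here or closer to rfip1
    return points[lo] if lo < len(points) else None
```

If ping works at the input of rep2 but fails on its outputs, the search ends at "rep2 port 12 (output)", which points at the repeater itself.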
We have had some trouble with the hp34970 being powered off (or sitting in standby mode). It is plugged into a ups and should not lose power (unless someone turned off the ups accidentally). Look in the rfi box in the turret room and make sure that the hp34970 is on and the screen shows it is in remote. You could just cycle the power and see what happens. You also want to make sure that no one has inadvertently changed the gpib address of the hp34970. The gpib address should be: 11 decimal (0xb hex).
Run the platform ethernet statistics logger. This is a script that does an rsh rfip1 ifShow every N seconds and logs it to a disc file. It will tell you if there have been any ethernet input/output errors or collisions versus time (see how to run the platform enet monitor).
Software:
rcvMNProg - program to control and read dewar monitor:
- rcvMNProg runs on the rfip1 computer on vxWorks. It controls the hp34970, configures the aomux, reads the data, and writes the data to the disc file: /share/obs4/rcvm/rcvmN.
- The output datafile (rcvmN) grows to about 80 Mbytes per month. At the end of each month, rcvmN is moved to rcvmN.yymm and rcvmN is reset to 0 size (this is done by the end of month processing: /home/phil/admin/monthproc.sc). The program rcvMNProg should be stopped while these files are switched (or it will continue to write to the archived file).
- The source code is in /home/phil/vw/datatk/rcvMon/rcvMNProg.c.
- It can be compiled with make rcvMNProg in that directory.
- The object code to be loaded in vxWorks is compiled into the directory /home/online/vw/load. This file is loaded into the rfip1 computer at boot time.
- /share/obs4/rcvm/rcvrsToLog.dat: This file is read by rcvMNProg when it is started. It determines which dewars are monitored. The file contains all dewars. Putting a # in column 1 will cause a dewar to not be monitored (in case it has been removed).
- rcvMNProg is started automatically when the rfip1 computer is booted.
- You can stop and then restart rcvMNProg from a computer that can access the rfip1 computer. Be careful that you don't get more than one copy running:
  - rlogin rfip1
  - rcvMNProgStop .. this will stop the program
  - i .. this will list the programs running
  - rcvMNProgStart .. this will start the program (make sure the old version has exited).
  - logout .. to exit the rfip1 computer.
- If you try to stop rcvMNProg and it won't exit (maybe because the gpibenet is hung up), you can try the following from the rfip1 prompt:
  - rcvMNProgShutDown .. this will try to close the file descriptor used by the gpibenet.
  - If the above doesn't work, you can try to manually close the gpibenet file descriptor:
    gpibEDbg .. this prints out the status of the gpibEnet driver on vxWorks.
vw-> gpibEDbg
gpibEVerbose:0
num Use fd Role ibsta iberr
0 1 26 D 100 0
1 0 0 B 0 0
Look in the column Use and find the row that has a 1 in it. The adjacent column (fd) is the file descriptor for the hp34970. Closing this fd (close,26) should shut down rcvMNProg.
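Picking the fd out of the gpibEDbg table can also be scripted if the output has been captured (for example via rsh). This is a minimal Python sketch, assuming the table always has the header row shown in the listing above. The function name fds_in_use is hypothetical.

```python
from typing import List, Optional


def fds_in_use(dbg_output: str) -> List[int]:
    """Return the fd of every gpibEDbg table row whose Use column is 1."""
    header: Optional[List[str]] = None
    fds: List[int] = []
    for line in dbg_output.splitlines():
        fields = line.split()
        if header is None:
            # Skip everything until the "num Use fd Role ibsta iberr" header row.
            if "Use" in fields and "fd" in fields:
                header = fields
            continue
        if len(fields) != len(header):
            continue  # blank line or something that is not a table row
        row = dict(zip(header, fields))
        if row["Use"] == "1":
            fds.append(int(row["fd"]))
    return fds
```

For the listing above this returns a single entry, 26, which is the fd to close.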
Dewar monitoring daily plots (for the web):
- A cron script is run daily on megs (4:25 am) to create the dewar monitoring daily plots. The script is located at /share/megs/phil/x101/dwtemp/dwtempdaily.sc. The plots can be found at http://www.naic.edu/~phil . Scroll down to monitoring and click on dewar temperatures.
- dwtempdaily.sc starts idl and then runs dwtempdaily.pro to create the plots for the previous day. The start/stop times for the script are logged in dwtempdaily.log. The idl session output is stored in dwtempdailyidl.out. If the plots are not updating, you should take a look at the .out file.
Monitoring the dewar temperatures in real time:
The dewar temperatures can be monitored in real time with:
- monrcvtemp (/usr/local/bin/monrcvtemp). It brings up a window with the dewar temperatures for each receiver. It updates every 30 seconds when a new round of data has been input. It reads the file /share/obs4/rcvm/rcvmN. The routine needs to be run from a sun computer.
- monrcv (/usr/local/bin/monrcv). Displays the monitored voltages/currents of the amps as well as the monrcvtemp temperatures. Needs to be run from a sun computer.
- monrcvpl (/usr/local/bin/monrcvpl). Plots the 16k dewar temps for the last 60 minutes and the last 24 hours. It runs as a strip chart, updating the values when new data becomes available. An idl program is started when you enter monrcvpl. The command monrcvpl is currently only available on the sun computers (/usr/local/bin). There is no reason why it can't run on the linux machines.
The platform ethernet monitor program:
- The script /home/phil/vw/datatk/rcvMon/chkmon.sc monitors the platform ethernet (used by the dewar monitoring program). Every N seconds it sends rsh rfip1 ifShow to the rfip1 computer. This returns the current state of the network interfaces. The ei interface is used for the platform ethernet. The script then writes this data to a disc file. You should edit the script and set delaysec to the number of seconds to delay between queries (default was 3600 = 1 hour). The data is written to the file rcvMNProg.log in the directory where chkmon.sc is run. You might want to rename the old logfile to something else prior to running it.
- The idl script rcvmnprog.pro plots the i/o statistics from the data written to rcvMNProg.log. To run it:
  - go to the rcvMon/ directory
  - idl .. starts idl
  - @phil and @geninit .. for initialization
  - hard=0 .. no hardcopy, send to screen. hard=1 will output to a .ps file.
  - .run rcvmnprog .. this will read the file and plot the various parameters. The i/o rates are plotted with their median removed.
  - You should see a constant ramp in the input/output. Pay special attention to the i/o errors and collisions. They will all be plotted vs date.
- To kill chkmon.sc just do a ctrl-c in the window where you started it.
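For reference, the polling loop that chkmon.sc is described as performing can be sketched in Python as follows. This is not the script itself: the timestamp header written before each snapshot is an assumed log format, so rcvmnprog.pro would not necessarily read the result, and the cycles, fetch and sleep parameters are hypothetical additions that make the loop easy to try out without touching rfip1.

```python
import subprocess
import time
from datetime import datetime
from typing import Callable, Optional


def run_ifshow() -> str:
    """Run 'rsh rfip1 ifShow' and return whatever it printed."""
    result = subprocess.run(
        ["rsh", "rfip1", "ifShow"], capture_output=True, text=True, check=False
    )
    return result.stdout


def monitor(
    logfile: str = "rcvMNProg.log",
    delaysec: float = 3600,
    cycles: Optional[int] = None,
    fetch: Callable[[], str] = run_ifshow,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Append one timestamped ifShow snapshot to logfile every delaysec seconds.

    cycles=None loops until interrupted with ctrl-c.
    """
    done = 0
    while cycles is None or done < cycles:
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as log:
            # The "# date" header line is an assumed format, not chkmon.sc's.
            log.write(f"# {stamp}\n{fetch()}\n")
        done += 1
        if cycles is None or done < cycles:
            sleep(delaysec)
```

Calling monitor(delaysec=60) would log once a minute, which is more useful than the one hour default when chasing an intermittent link problem.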
History/Problems:
- 13jun11: The gpib to ethernet device and the multimeter had been losing power. We replaced the UPS that gave power to the rfi box in the rack. It was an APC, now a Tripp Lite.
- 11oct06: dewar monitoring slowed down, updating once every 5 minutes. It finally died on 12oct06.
  - ping "gpib0" from rfip1 failed.
  - We brought the xcvr and gpibenet down to the control room and hooked them into the fiber coming out of rep1 port5.
  - The xcvr was a 10/100 transceiver. We noticed that when the xcvr lost the fiber input, the gpibenet would no longer sync up with the xcvr. The 10/100 link light on the gpibenet stayed off (it should be yellow when 10 mb).
  - We replaced the xcvr with a 10mb xcvr and no longer had a linkup problem between the gpibenet and the xcvr (even when the fiber was removed).
  - We took this working system up to the rotary floor and installed it. It didn't work. When pinging gpib0 we would occasionally see the tx flash on the gpibenet but packets would never get back to rfip1.
  - We took the xcvr and gpibenet and connected them to the main fiber (c1.12,c1.13) in the sband klystron room. Ping worked fine.
  - We moved to the input of rep2 (port11). Ping worked fine.
  - We tried different output ports and ping failed on all of them (we didn't switch the port11 input).
  - Conclusion: rep2 is bad.
  - We took a fiber barrel and jumpered the fibers rep2.port11 (inp) to rep2.port12 (out) to bypass the repeater.