rsync -av -e ssh firstname.lastname@example.org:/share/a2010-2/data/05.12.08 /home/yourdisk/idlraw/05.12.08Note: Most likely, Martha will already have run the calibration and bandpass subtraction routines already and will give you specific instructions on how to retrieve the new files from a machine at Cornell. Follow her instructions, not these.
cal050305.list for CALlist drift050305.list for drift scan list
R. Giovanelli, February 28, 2005
B. Kent, August 16, 2005
Numerous others since then.... :)
Martha, Jan 19,2009
0. Compiling procedures in your IDL session
You will need to load a number of IDL procedures to process the data. The
locations of these files should be defined in your .idlstartup file.
for setting up IDL at Arecibo.)
Load via the call:
The ALFALFA specific procedures (at Cornell stored in
/home/dorado3/galaxy/idl/idl_alfa) are loaded via the call:
If the idl_alfa directory is not in your path, you can load using:
When references are made to procedures without further explanation, they refer
to programs stored in the
idl_alfa directory. Those
programs (well, most of them...) have explanatory descriptions at the beginning,
regarding input and output parameters, keywords, and how they function. See
those for further understanding.
1. Conversion of FITS files to IDL .sav files
This step is usually done by the designated observer.
Raw FITS files are converted to a WAS Hdr type format based on correlator routines written by Phil Perillat at NAIC. The drifts are grouped into SAVE files of dimensions [npols, nrecs, nboards] - 2 by 600 by 8. The "d" structure contains the WAS header, a copy of the FITS header, position and setup information, and a data array of dimension N channels (currently 4096).
A FITS file consists of a 600 drift scan for 8 boards and two polarizations, as well as a Cal ON scan. Four IDLE SAV files are created from each FITS file.. 1) A drift file nnnnnnnnn.sav. 2) A Cal OF begin file - the first record from the drift to be used as part of a cal off. 3) A Cal OF end file - the last record from the drift to be used as part of a cal off. 4) A Cal ON file.
A list of all FITS files to be converted is created with complete path names to the files. The file list is named in the format filesyyyymmdd.txt. The program filecreator is then invoked in IDL:
IDL> filecreator, 'filesyyyymmdd.txt'
The process is automatic, and should be run on one of the faster machines at AO as soon as a night's observing run is complete.
2. Getting a session calibration solution from cal scans: Calib1This step is usually done by Martha before you take over (unless you are the designated observer and reducing the data yourself at Arecibo).
504507670CALOFend.sav 504508272CALON.sav 504508275CALOFbegin.sav
which come by "triplets", provided there is no breach of continuity in the data taking process. Each of those files contains a scan with a single 1-sec record; the format is the same as for the other drift scans. The scan name with the "CALON" extension corresponds to a scan in which the cal was fired for one second, at the end of each regular 600 sec drift scan. The one with the "CALOFend" extension is the last record of the drift preceding the firing of the cal, and that with the "CALOFbegin" extension is the first record of the drift immediately following the firing of the cal. An average of the spectra of CALOFend and CALOFbegin will constitute the cal OFF record.
Calib1 will generate two arrays of structures, named respectively dcalON and dcalOFF, which will contain the spectra of each of the calibration events. The format of these structures is identical to that of the drift structures. A "calibration event" is a set of cal ON, cal OFF records, in our case a triplet as listed above.
Calib1 is run on the data of a whole data taking session, or on subsets thereof in case some major change took place during that session which may have reset the scaling of the data, altered the cal values, changed the LO chain, etc. Then you'll run separately Calib1 for the data taken before and after the change.
In order to run Calib1, you'll need a listing of the files containing the cal events. In the directory containing the data of a data taking session, there should be a file containing all the names of the idl .sav files with "CAL" embedded in the name: cal051209a.list . This file is created when filecreator is run at the end of an observing session. You need to copy this file into your working directory and edit the file so that it is organized by triplets. For example, in a normal data taking session in which 5 drift scans were taken, you'll have something like:
504507670CALOFbegin.sav 504507670CALOFend.sav 504508272CALON.sav 504508275CALOFbegin.sav 504508275CALOFend.sav 504508878CALON.sav 504508881CALOFbegin.sav 504508881CALOFend.sav 504509482CALON.sav 504509486CALOFbegin.sav 504509486CALOFend.sav 504510088CALON.sav
You'll want to delete the first line, 504507670CALOFbegin.sav, for the Cal OFF record of the first drift is not a part of a triplet (although the program will run even if you leave it on, and ignore it). Similarly, you will want to add an OFF cal after the last line, so that the last Cal ON is in a complete triplet. Do so by repeating the last cal OFF record. The final CALlist file will look like:
504507670CALOFend.sav 504508272CALON.sav 504508275CALOFbegin.sav 504508275CALOFend.sav 504508878CALON.sav 504508881CALOFbegin.sav 504508881CALOFend.sav 504509482CALON.sav 504509486CALOFbegin.sav 504509486CALOFend.sav 504510088CALON.sav 504509486CALOFend.sav
You will need to be careful in preparing this file, if there was some breach of continuity in the data taking, e.g. aborted scans, drifts not followed by Cal ON scans, power readjustment, etc. Even if you have to sacrifice some Cal data, always restrict the list to good triplet sets only. It is always best to make sure that the data you are including in the calibration is OK.
You can now run Calib1.
Define the directory in which the idl .sav files are located, e.g.:
IDL> dir='/home/arecibo2/galaxy/idlraw/05.02.14a' IDL> CALlist='this is the file name of your CALlist' IDL> calib1,CALlist,dcalON,dcalOFF,dir=dir
Alternatively, have the names of the CAL files specified in CALlist with the
full address, directory and subdirectory from
Then you can ignore specifying "dir=dir" in the call to Calib1.
The procedure will silently load all the calibration data and return the structures dcalON and dcalOFF at the IDL prompt level. You can check that it worked:
IDL> help,dcalON,/stYou may want to save them in an IDL sav file for safety:
3. Getting a session calibration solution from cal scans: Calib2
This step is usually done by Martha before you take over (unless you are
the designated observer and reducing the data yourself at Arecibo).
Once you have dcalON and dcalOFF available, you can obtain runs of crat and Tsys through your observing session. As a reminder, crat is the value in K of 1 instrumental (spectrometer) unit. Calib2 produces a structure named ncalib:
IDL> calib2,dcalON,dcalOFF,ncalibYou will be asked
Any bad boards in system?Answer "n" if none.
You will be engaged in interactive approval of the calibration process as the program steps through each beam. Most of the time, you will OK each beam by hitting enter. A typical example is shown here. Oddities which require expert opinions include big spikes outside the flagged windows or double spectra.
Next the calratios (Cal/TPcalON-TPcalOFF) and Tsys will be displayed for the drift. Click for examples of a calratio fit and a Tsys plot. Note the amplitude of the variation in calratios. The calibration performed here is to the first order (it will be improved later in the reduction process.) The difference between points should be on the order of 1K, with δT /T on the order of 1/25. If OK, hit 'enter' to fit by default and then enter the order of the fit as makes sense; lower order is better, but if curvature is evident and warranted, then set N = 2 or 3. (Think here about the statistical significance of the fit. Does the data warrant fitting a curve really, or is N=1 just as statistically valid?) The other two options are to exclude data points or to restart the fitting process.
Note: If something unusual takes place during data taking in the given session, it may be necessary to break the session data into segments, each characterized by a different set of dcalON, dcalOFF and ncalib. You will notice that while running Calib2, as the values for each calibration event are separately displayed for each beam/pol configuration. If, for example, a "step" appears in the values of crat, you may want to process separately the calibration data before and after the step; hence two different ncalib structures will be necessary to properly calibrate the session's drifts.
Check that the ncalib file was created by typing:
IDL> help,ncalib,/stAdd a comment to the log file, providing the number of cals in the set and comment that
The structure ncalib is used in the calibration of the drift data. Save the ncalib structure for future use, e.g.:
IDL> save,ncalib,file='ncalib_050214a.sav'If you want, you can plot some of the values:
IDL> restore,'ncalib_05.02.14a.sav' IDL> plot,ncalib.rahr0,ncalib.decdeg0,psym=4
4. Bandpass Subtraction and Scaling, Calibration Monitoring and Position Recording
This step is usually done by Martha before you take over (unless you are
the designated observer and reducing the data yourself at Arecibo).
The next step in processing involves calling a procedure called bpd, which is a grab-bag of many operations: it reads the raw "d" structure of each drift, computes a bandpass, applies a baseline, extracts continuum data, scales spectra and continuum in K, monitors galactic HI for possible changes in the calibration records positions sampled in the sky, and stores output in a .sav file, besides leaving it for access at the idl command line level.
Read the descriptions of both procedures in the files bpd.pro and bpc.pro, to learn
the nitty-gritty of the processing. Procedure bpd
calls a number of other procedures, most notably
make_caldrift. It needs as inputs:
The output of bpd consists of:
The procedure bpdgui runs via a GUI interface.
4.1 Running bpdgui
You invoke the program as:
A graphical user interface will pop up and you will enter several parameters using the various widgets in the gui. Refer to this example.
First, click on the "browse" button in the upper right part of the gui and a widget will appear prompting you to select the IDL sav file containing the appropriate ncalib structure for the data you are about to process. Click on that file name and you'll see values for Tcals, median Crat and median Tsys filling the slots near the center of the gui. The Tsys values should be on the order of 30K. The plot in the lower part of the gui illustrates the mask function used for the measurement of total power for calibration purposes.
Second, click on the "browse" button in the upper left of the gui, and a widget will appear with the list of drift files. Select the list that is appropriate. It will take about 15-20 seconds to return control (the program is loading the first file from disk). A widget will appear in the upper right asking you to "OK" the list of N drift scans to be processed. The number of drift files should be the same as the number of cal files in Section 3.
Third, click on the appropriate button to decide whether the Calsession and Runpos structures are to be appended to or new ones started. You will have to enter the names of the files being created or appended to. Also it is important to make sure that CRAT is set to 'fit' NOT 'median'. Sample names include calsession_060318.sav and pos_060318_+083100s.sav. In the latter, the +083100 is the night's drift declination and the following "s" indicates this is a spring drift. If a night is spread across more than one drift, make multiple pos files with the different declinations indicated.
Fourth, you will have to set the "interpolation regions" in the bandpass creation process. Bandpass interpolation will involve principally (and perhaps only) the galactic HI line. BPDGUI has a default choice for the interpolation region (channels 3480 to 3520), which may suit most drift scans. However, it may be desirable to tighten the region, so that the bandpass will be more precise in the galactic HI spectral domain. Also galactic HI will shift around both frequency and channel number, during an observing session, and what fits a drift scan may not fit drift scans taken one or more hours later. You should examine about 3 spectra (beginning, middle, end). Refer to this example. For each spectrum, click on the file in the "raw files" window; after it loads click on the button that says "Interp region" and a set of options will appear, as well as a spectrum of the bandpass that you can interactively inspect. You may then select your own set of interpolation channel values, or select the default ones. Hit "zoom" to change the scale. Identify the channels where the galactic HI begins to rise, click "done" and go on to the others. Find the maximum range needed for your spectra. Then click on any one of the entries in the "Interp regions" box. The Interp Regions Edit widget will appear. Enter these new values (usually only 2) and then click "apply to all". The values in the display should update. Record the interpolation values in the log file. Note that this is a delicate part of the process, and you should be careful with it. Here is an unusual example, in which the interpolation window was enlarged to exclude M33!
OK, you're ready to BPD. Click on "process BP" and let it rip.
It should take about 7-10 seconds to precess each pol/beam configuration, i.e. about 2.5 min for a full drift.Often, when BPD finishes, there are some warning messages like
IDL> help,posThis should tell you that "pos" is array with dimension equal to the number of drifts that were processed. It is recommended that you exit IDL at this point and restart it (to clear memory). You may also want to make a copy of the pos file, just in case...
5. Inspection of the data: flagbb and reviewbb
|This is where you pick up if Martha has already run the calibration and bandpass subtraction. Notice that when you pick up the files, the directory contains, in addition to the drift data and the pos, ncalib and calsession IDL "sav" files, several anciallary files including a log file (e.g. "log_processing_YYMMDD") and a starter bad box file ("bb"). You should keep records in the log file, adding on to where Martha has left off. Be sure to erase from that file the instructions (leaving only the relevant information). Also you need to understand the steps that you did not have to perform, so that you know what has happened to your data up to now. If you are not already familiar with the process, check out the practice questions , especially the introduction to ALFALFA drift scan data.|
Once the drift data have been bandpass calibrated, you will visually inspect the data - ONE BEAM/POL AT A A TIME - and flag bad parts of each 2-D map. This is also the first time in the data processing flow that you get to take a close look at the data. You will thus use this module to: (i) inspect the data (ii) flag "bad boxes" (iii) enter comments in your reduction logfile. The coordinates of the regions of bad data in the 2-D map ("badboxes") are logged by the program in the "position" structure, previously created by module BPD (In the previous section we referred to it as "runpos"). The badbox information is contained in the pos structure as POS.BADBOX. BPDGUI inserts a default badbox, which you will edit using flagbb.
5.1.1 Calling flagbb
To begin, open the position sav file in IDL:
Next restore a reduced data .sav file produced by BPDGUI, e.g.:
The preceding command will load the dred structure, as well as a number of ancillary structures and arrays associated with that drift scan.
The inspection program is called flagbb. It is invoked with this general format:
IDL> flagbb,dred,cont_pt,cmask,pos,GAUSAV=gausav,HAN=han,AGC=agc,$ N1=n1,N2=n2$, RFIMOD=rfimod,MASK=mask
Your standard way of calling will be:
cont_ptinput file - a continuum map of nrec x nstr, or npol x nrec x nstr [e.g.
cont_pt], or nerg x np x nrec x nstr [e.g. mc]
|It should be noted that the AGC is a private database, developed and maintained by Martha Haynes and Riccardo Giovanelli. It contains considerable contributions from them, their friends and students, much of which is preliminary and/or unpublished. It is made available to members of the ALFALFA collaboration for use in undertaking ALFALFA science, but it is not a public database and should not be used for other purposes, copied or passed onto others without permission. Improvements to it are always welcome; if you find errors, send corrections to Martha.|
The following keywords are almost never used:
If the keyword AGC is set, the program will overplot on each image AGC galaxies in the vicinity of the drift. "Vicinity" is decided by the program on the basis of the size of the galaxy (if size is listed in the AGC). To do this, the program reads an ancillary file with all AGC gals within the Arecibo dec range. Information about the galaxies, including any previous HI flux detections, is printed to the screen.
Once called, flagbb will query the user about the intensity scaling (usually hit enter for default) and then ask what BADBOX information to use. The first time you invoke flagbb, choose (d) for default. The next sections describe the format of the BADBOX tag and how to refine the bad boxes.
5.1.2 Description of POS.BADBOX
The POS.BADBOX tag is an array of integers (100,2,8,4). It contains the coordinates of the corners of up to 100 boxes containing data that are deemed "bad", an especially useful tool when the data will be used for gridding and producing clean cubes. The first dimension of .BADBOX is the box number; the second is the pol number, the third is the beam number and the fourth contains 4 numbers which are LLch,LLrec,URch,URrec, i.e. the channel number and record number of the lower left (LL) corner of the box and the channel number and record number of the upper right (UR) corner of the box. Beam number 7 in this structure is a special value used internally that represents a potential box not necessarily active in any beam, but that has known dimensions and is ready to be added with 'a' or 'A'. The box numbers are defined for specific classes of bad data:
5.1.3 Description of flagbb display
When flagbb is invoked, it uses Phil Perillat's imgdisp procedure to display, one strip and pol at a time, position-frequency maps where the x-axis is channel number and the y-axis is record (sample) number along the strip. Click here for an example. In this screen, you will see:
The user actually works only on the position-frequency map. The continuum mask and averaged spectrum are displayed purely as references. The continuum mask follows a conservative approach and finds many more "bad" channels than there really are in the data. It is however useful to have it as a visual sanity check. The averaged spectrum can be used to help judge the significance of a feature and better determine the box edges.
5.1.4 Refining POS.BADBOX using flagbb
Default .BADBOXes (0-20) are indicated by red numbers along the top of the map in the flagbb image display. "Active" boxes are plotted as red rectangles and labeled under each rectangle along the bottom of the map. These default .BADBOXes extend over the full duration of a drift (i.e. they are BAD CHANNELS).
Since RFI features came and go from one day to the next, but persisting generally for the duration of an observing session, your first task is to refine the list of default .BADBOXes so that it will be close to correct for each of the scans in your observing session. In essence, you build a "memory" of .BADBOXES for your session, which should avoid having to reset them for each drift. In addition to the "default" .BADBOXES, the program allows for writing a badbox file, dimensioned [100,2,8,4], containing all the settings used for a given drift (see Section 5.1.2). At the end of processing a drift, the user has the option of saving the new set of .BADBOXes. At the beginning of the inspection of each drift, the user can (a) read in that file in the internal .BADBOX, then modify it at will or (b) read in the default .BADBOX file, specified inside the program itself.
After each pol/beam map is smoothed and shown, you can
You can now : - delete a box for this map only (d boxnr) - delete a box for this + all subsequent maps (D boxnr) - activate a box for this map only (a boxnr) - activate a box for this + all subsq maps (A boxnr) - modify a box for this map only (m boxnr) - modify a box for this + all subsequent maps (M boxnr) - insert a new box for this map only (i) - insert a box for this + all subsequent maps (I) - modify status flag for board or part thereof (s) - obtain a row Total and identify peaks (t) - undo all peaks found by "t" (u) - get an image of specified size (g) - HVC catalog maps (h) - go back to previous map (b) - exit & go to next map (e)
Description of Options:
?s You may reset the pos.status flag of this board Reset all records (a) or record range (r)? (def=a) : r Left click LOWER record nr... Left click UPPER record nr... You Chose: 70, 181, OK? : You can set the pos.status index as follows: 0 or 1 bm/pol board (or chosen section is used at full weight N board is used at 1/N weight [if N.ge.9 board is used at zero weight Enter a number : 2In this case, records 70-181 of this drift would be averaged with a weight of 1/2 during the gridding process.
|Simple flagging rules|
|0. Understand what you are doing. If this process were trivial, you wouldn't be doing it; we'd get our cats to do it.|
|1. Pay close attention! It is IMPERATIVE that you do a good job. A bad job is worse than no job and will earn you snickers (and we don't mean the candy bar!).|
|2. Be sure to use the "t" option.|
|3. Know how big to make your boxes; use the "m" key when you need to!|
|4. Know what is a continuum source; do not flag continuum sources.|
|5. Know what is galactic hydrogen; do not flag galactic hydrogen (or HVC's!)|
|6. Know when to set the status flag via the "s" key.|
|7. Save the pos file regularly; occasionally make a copy of it.|
5.1.5 Logging comments on data quality and detected galaxies
As you process each drift, you should log comments about the data quality and detected galaxies. The default log file contains example comments for a drift. Be aware of and log the following:
reviewbb is very similar to flagbb, except that it modifies nothing. It is just used to review the modifications made by flagbb. You call it via
6. Producing and examining a smooth version of the data: strip_pv and inspect
It is best to smooth the data before inspecting further, so you can see deeper into the noise. In order to produce a smooth display of a dred position-velocity (time-frequency) drift file, you will use strip_pv, which produces and displays a smoothed version of dred. You can then use inspect to display and inspect the data.
Before the smoothing step, you should check that you did not miss
any drifts in the session by insuring that box 0 is assigned in each one:
IDL> print,where(pos.badbox[0,0,0,2] eq 0)
If any number is returned, then that drift (remember to start from "0" when you count down the list) has not been
flagbb'd correctly. If you get "-1", then you are ready to smooth!
To run strip_pv:
Once you have msmooth, you can display the data and inspect it with inspect, which uses the display tool atv. Call inspect like this:
IDL> inspect, msmooth,showpol=3,nstrip=2, agc=1
where nstrip is the beam number and agc can be used to turn on the AGC catalog display. In this example, beam 3 will be displayed with polarization averaged.
IDL> print,where(pos.badbox[0,0,0,2] eq 0)If "-1" is returned, then you are ready to run the last check. If some other number appears, then you have not flagged the boxes in the drift corresponding to that entry in the pos file (remember that IDL starts counting from 0). Go back and fix it. Otherwise, proceed to check that all of your boxes are valid:
IDL> check_pos,posIf any problems are found, then go back and fix them. If, on the other hand, checkpos returns no errors, you are done! Notify Martha at this point. She will be very happy to hear from you.
'Features' at Arecibo (Dec '05/Jan '06):