PALFA virtual teleconference agenda

Meeting summary from AAS (Ingrid)

Attending (probably missing a few people): Jim Cordes, Paulo Freire, Julia Deneva, Laura Kasian, Ingrid Stairs (scribe), Froney Crawford, Tim Hankins, Joe Lazio, Rick Jenet, Joeri van Leeuwen, Yosi Gelfand, Bryan Gaensler (briefly)

Issues for AUSAC

Lack of daytime observing is potentially a problem. Operator staffing is OK for a year at 24/7, but Jose and Norberto can potentially retire in a year. Paulo: would it be possible to hire another guard for operator safety backup during trips to the receiver cabin? Tim: this is probably only a problem during long S-band radar campaigns, such as the one that recently finished.

The 80/20 split, and what is a survey? Is the 80/20 split largely an artificial construct, and can it be relaxed? Jim advocates changing "survey" to "large projects", and/or getting the observatory to allow any project as long as it doesn't drain the staff resources. Tim: there's a perception that PI projects drain resources and we need to fix that. It is especially NOT true for pulsar projects as opposed to many spectral-line, IDL-reduction-based projects. Overall, we need a better way to define compliance with the Senior Review. All of this, plus the follow-up vs non-follow-up issue (e.g., is 1906 still PALFA follow-up, or is it now a PI project?), will also be raised at AUSAC.

Painting

Tim: As of Jan. 8, the contract was not signed, and the contractor seemed unable to start before Feb 10. This might force the painting to run into the rainy season, which can't be allowed. The contractor has another major job ongoing which they are trying to finish, but it is not looking good right now. The situation could change from day to day, though.

Data Acquisition

The first installment of the spectrometer should be arriving soon. Jeff Mock is arriving at AO circa Jan 26. Dana Whitlow has arrived to work on LO downconversion for the spectrometer. Data rate will increase by a factor of 2 when we start using it ~midyear.

Tim: the PALFA consortium could help relieve Arun from moving data around for shipping. This should be something that Angel or Tony could do. Arun also seems to be in the FedEx business, which is not desirable either. Action: bring this up at AUSAC, then Tim can talk to Arun about it.

Processing

Cornell code has been running on Windows for > 1 year. They've bought a Linux cluster and are porting the code back, and will also run the PRESTO code. PRESTO is running at McGill and UBC; no one knows about Columbia. We need to coordinate candidate classifications. We also need to get the candidate outputs into the database. Jim: this is the last-mile problem. CTC is more security-conscious than Arun, so this (communicating with the Cornell database from outside) is non-trivial. It needs to be easy and robust but secure. Tools will be needed to generate candidate lists, etc. We need a meeting at Cornell to do this face-to-face. Action item: Jim to schedule this, maybe in early March.

RFI excision code

Laura has found that Parkes-MB-style RFI excision doesn't help in detecting strong pulsars like 0525+21, but she still needs to compare outcomes on, e.g., 1906 data. Julia is working on masking a decimated dynamic spectrum in the time-frequency plane, similar to PRESTO, and they will make a 7-beam version. That mask gets used in the dedisp routine.

Feb. 1 deadline

Paulo: 1903 (the MSP) is strong at the GBT; maybe we should simply propose to time it there? For that and other proposals, we really need to know the results of AUSAC before deciding what to submit.

Anticentre

Jim wants to keep it. Paulo points out that the disk handling is about to become the bottleneck, and the pulsar/Tb ratio is so much worse in the AC that it is reasonable to drop it if necessary. Jim agrees with this. Likely the issue can be delayed until we have the spectrometer up and running.

Surveys, coverage, scheduling, etc.:

Present scheduling: Jason

I've made a script to print out detailed statistics about the survey coverage thus far. This will allow us to quickly come up with these numbers again in the future. The attached plot summarizes the following results. As for future observations, our two choices are basically 1) fill in the remaining pointings needed to achieve the full, "dense" coverage at low latitudes or 2) achieve "sparse" coverage at increasingly higher latitudes. Keep in mind that |b| < 1 was also targeted in the precursor survey, which is not included in this summary.

Inner Galaxy

  • 3155 of 7066 total .N pointings (66.6 of 149.1 sq deg)
  • 2106 of 7066 total .C pointings (44.4 of 149.1 sq deg)
  • 138 of 7066 total .S pointings (2.9 of 149.1 sq deg)
  • 5399 of 21198 total .NCS pointings (113.9 of 447.3 sq deg)

As a function of Galactic latitude

in the range |b| < 1

  • 1299 of 1413 total .N pointings (27.4 of 29.8 sq deg)
  • 933 of 1414 total .C pointings (19.7 of 29.8 sq deg)
  • 81 of 1413 total .S pointings (1.7 of 29.8 sq deg)
  • 2313 of 4240 total .NCS pointings (48.8 of 89.5 sq deg)

    in the range 1 < |b| < 2

  • 919 of 1414 total .N pointings (19.4 of 29.8 sq deg)
  • 604 of 1417 total .C pointings (12.7 of 29.9 sq deg)
  • 55 of 1408 total .S pointings (1.2 of 29.7 sq deg)
  • 1578 of 4239 total .NCS pointings (33.3 of 89.4 sq deg)

    in the range 2 < |b| < 3

  • 631 of 1414 total .N pointings (13.3 of 29.8 sq deg)
  • 562 of 1409 total .C pointings (11.9 of 29.7 sq deg)
  • 2 of 1413 total .S pointings (0.0 of 29.8 sq deg)
  • 1195 of 4236 total .NCS pointings (25.2 of 89.4 sq deg)

    in the range 3 < |b| < 4

  • 186 of 1418 total .N pointings (3.9 of 29.9 sq deg)
  • 7 of 1409 total .C pointings (0.1 of 29.7 sq deg)
  • 0 of 1414 total .S pointings (0.0 of 29.8 sq deg)
  • 193 of 4241 total .NCS pointings (4.1 of 89.5 sq deg)

    in the range 4 < |b| < 5

  • 119 of 1390 total .N pointings (2.5 of 29.3 sq deg)
  • 0 of 1417 total .C pointings (0.0 of 29.9 sq deg)
  • 0 of 1400 total .S pointings (0.0 of 29.5 sq deg)
  • 119 of 4207 total .NCS pointings (2.5 of 88.8 sq deg)

    Anticenter

  • 1646 of 7037 total .N pointings (34.7 of 148.5 sq deg)
  • 1957 of 7037 total .C pointings (41.3 of 148.5 sq deg)
  • 115 of 7037 total .S pointings (2.4 of 148.5 sq deg)
  • 3718 of 21111 total .NCS pointings (78.4 of 445.4 sq deg)

    ...as a function of Galactic latitude

    in the range |b| < 1

  • 818 of 1402 total .N pointings (17.3 of 29.6 sq deg)
  • 903 of 1411 total .C pointings (19.1 of 29.8 sq deg)
  • 47 of 1400 total .S pointings (1.0 of 29.5 sq deg)
  • 1768 of 4213 total .NCS pointings (37.3 of 88.9 sq deg)

    in the range 1 < |b| < 2

  • 463 of 1407 total .N pointings (9.8 of 29.7 sq deg)
  • 593 of 1403 total .C pointings (12.5 of 29.6 sq deg)
  • 56 of 1408 total .S pointings (1.2 of 29.7 sq deg)
  • 1112 of 4218 total .NCS pointings (23.5 of 89.0 sq deg)

    in the range 2 < |b| < 3

  • 261 of 1416 total .N pointings (5.5 of 29.9 sq deg)
  • 399 of 1400 total .C pointings (8.4 of 29.5 sq deg)
  • 2 of 1410 total .S pointings (0.0 of 29.8 sq deg)
  • 662 of 4226 total .NCS pointings (14.0 of 89.2 sq deg)

    in the range 3 < |b| < 4

  • 15 of 1402 total .N pointings (0.3 of 29.6 sq deg)
  • 51 of 1410 total .C pointings (1.1 of 29.8 sq deg)
  • 0 of 1410 total .S pointings (0.0 of 29.8 sq deg)
  • 66 of 4222 total .NCS pointings (1.4 of 89.1 sq deg)

    in the range 4 < |b| < 5

  • 88 of 1393 total .N pointings (1.9 of 29.4 sq deg)
  • 11 of 1413 total .C pointings (0.2 of 29.8 sq deg)
  • 10 of 1381 total .S pointings (0.2 of 29.1 sq deg)
  • 109 of 4187 total .NCS pointings (2.3 of 88.3 sq deg)
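The per-band numbers above can be turned into completion fractions with a few lines of Python. This is only a minimal sketch, not Jason's script: the inner-Galaxy .N counts are copied from the summary, and the area per pointing is assumed to be the ratio of the quoted totals (149.1 sq deg over 7066 pointings). The names `INNER_GALAXY_N` and `coverage_rows` are made up for illustration.

```python
# Completed and total .N pointings per |b| band, copied from the inner-Galaxy
# summary above: (band label, completed, total).
INNER_GALAXY_N = [
    ("|b| < 1", 1299, 1413),
    ("1 < |b| < 2", 919, 1414),
    ("2 < |b| < 3", 631, 1414),
    ("3 < |b| < 4", 186, 1418),
    ("4 < |b| < 5", 119, 1390),
]

# Assumption: area per pointing is the ratio of the quoted totals
# (149.1 sq deg for 7066 pointings).
SQ_DEG_PER_POINTING = 149.1 / 7066


def coverage_rows(bands):
    """Return (label, fraction complete, remaining area in sq deg) per band."""
    rows = []
    for label, done, total in bands:
        fraction = done / total
        remaining = (total - done) * SQ_DEG_PER_POINTING
        rows.append((label, fraction, remaining))
    return rows


if __name__ == "__main__":
    for label, fraction, remaining in coverage_rows(INNER_GALAXY_N):
        print(f"{label:12s} {100 * fraction:5.1f}% done, {remaining:5.1f} sq deg left")
```

Feeding in the .C or .S counts, or the anticenter bands, gives the same view of where the "dense" coverage is still incomplete.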

    Quick processing status: David C

    The quick processing has been running well using the new version of nodestatus. The added functionality seems to be helping people correct problems when they occur. However, there is an ongoing issue of some instabilities with the ASP. The cause is unclear, and the ASPers have been looking into it.

    Occasionally observers forget to look at the quick processing results. I have a script to check for this and I will occasionally generate a web page to list the relevant MJDs so we can fill in the gaps.

    Future coverage (what should we change?): Fernando

    I'd like to suggest that we suspend surveying in the anti-center (AC), at the very least until we're caught up with proper processing of all our existing data (inner-Galaxy [IG] and AC both), at which time we could revisit this decision (if we take it).

    It is relatively easy to be scheduled for time to search in the AC, but it is far from a cost-free endeavor: (a) we have to spend time collecting the data; (b) dealing with backing up/transporting the data adds considerably to the overall burden in this regard, which is already far from trivial (even before the uber-spectrometer comes into action). According to Jason's statistics sent around on January 17, the number of existing AC pointings is 40% of the total IG+AC (although in TB the fraction is smaller by a factor unknown to me due to the divergence of integration times in both regions some time back). Given that we're trying to find ways to reduce the impact of dealing with the huge data amounts (also in light of [c] below), for Arun and ourselves generally, suspension of AC searches would immediately have a significant positive impact. Finally, (c) most of these AC data that we've been collecting remain to be analyzed properly, along with those in the IG, and while discussing proper 'pipeline' data-reduction is not the point of this e-mail, let's not kid ourselves: it will be a huge task, and we don't need to make it harder by adding to the data backlog -- especially if the predictable return for our investments is relatively small.

    The above addresses some benefits of suspending AC searches (or costs of continuing). What are the benefits of continuing? Comparatively few in my opinion: (i) as we know, the density of pulsars is reduced in this area by comparison to the IG; (ii) our raw sensitivity increase compared to past 430MHz AO surveys is much reduced in this area vs the IG: Tsky at 430MHz is 40K at (l,b)=(180,0) and 123K at (50,0).

    Yes, there's always the chance that we could find a pulsar with a black-hole and 3 planets around it in the AC. But if that were a good-enough argument we should likely also be searching virtually the entire sky, at high latitudes. Surely as far as large-area surveys are concerned one plans and does the survey with a relatively good understanding that large numbers of objects will be found, while keeping in the back of our minds the hope that a small number of those will turn out to be special. To search (at enormous cost) a relatively barren area in the hope that the odd object out of the few will be special does not seem sensible to me.

    As an example, in the Parkes multibeam survey we did something akin to what I suggest: our starting longitude range was 220-360deg; after a few months we realized that we were finding very, very few pulsars in the 220-260 range compared to 260-360, and of course that we were spending nearly 30% of the effort in that barren area. So we dumped it, and moved on to more productive work -- it's not as if we weren't busy enough otherwise.

    So that's my suggestion. Specifically, we could simply ask Hector to not schedule any more AC runs beyond the ones already on the schedule. One day, when we're more ready, and if we decide that resuming those searches would be more cost effective, we could ask him to resume scheduling them (or we could resubmit a proposal for that, if relevant).

    Proposals for the Feb 1 deadline: Fernando/Jim

    Jim: I don't see a strong need for any as long as we're getting enough time for timing.

    Status of the new spectrometer: Paulo/Jim

    Jim: the new spectrometer will probably be shipped to AO early this year and will need break-in and integration of the new data format (and size: 4x more channels!) into the processing codes. I've been preparing the CTC folks about this for a year so it won't be a surprise to them vis a vis impact on archival and processing. We will need a few volunteers for implementation at Arecibo.

    Paulo: the hardware is now on-site (as of Jan 15, 2007). Jeff Mock will arrive on Jan 18; after that, the task of testing the spectrometers and connecting them to the IF/LO system and to the local computer network will start. This will go on during the painting job.

    Follow-up of new sources

    Timing binary MSP J1903+0327: David C

    We have continued to grab the occasional pointing on 1903 during inner Galaxy survey observations. Thanks to Scott, Jason, Ingrid and Fernando we have been able to get a few pointings with the GBT over recent weeks.

    We now have enough data to generate a PRELIMINARY orbital ephemeris: Pb 95.16 days, Ecc 0.44, A1 106 lt-s. As you can see from the image here, the orbit is poorly sampled at periastron. Until we get data from that part of the orbit (at MJD 54160-54165, in 49-54 days), the parameters will not be well defined.

    However, assuming these values and a pulsar mass of 1.4 Msun we get a minimum companion mass of 0.9 Msun. It would also be the most (only?) eccentric GP MSP. It does not appear to be associated with any clusters.
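The minimum companion mass quoted above follows from the binary mass function evaluated at sin(i) = 1. The sketch below reproduces that arithmetic from the preliminary Pb and A1 values and the assumed 1.4 Msun pulsar mass. The function names and the bisection solver are illustrative choices, not the code used for the timing solution.

```python
import math

T_SUN = 4.925490947e-6  # G * Msun / c^3 in seconds


def mass_function(pb_days, a1_lts):
    """Binary mass function in solar masses from Pb (days) and A1 (lt-s)."""
    pb_s = pb_days * 86400.0
    return 4.0 * math.pi ** 2 * a1_lts ** 3 / (T_SUN * pb_s ** 2)


def min_companion_mass(f_msun, m_pulsar=1.4):
    """Solve mc^3 / (mp + mc)^2 = f for mc with sin(i) = 1, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The left-hand side grows monotonically with mc, so bisection is safe.
        if mid ** 3 / (m_pulsar + mid) ** 2 < f_msun:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Preliminary ephemeris values from the notes above.
f = mass_function(95.16, 106.0)
mc = min_companion_mass(f)
print(f"mass function = {f:.3f} Msun, minimum companion mass = {mc:.2f} Msun")
```

Because the ephemeris is preliminary and poorly sampled at periastron, the result should be read as a consistency check on the quoted 0.9 Msun rather than a measurement.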

    Timing young binary J1906+0746: Ingrid/Laura K

    Young pulsar J1856+0245: Jason

    During the holidays, I received a referee report for the J1856+0245 XMM proposal, which was requesting ~70 ks in order to perform imaging and spectroscopy on the ASCA source that is coincident with the pulsar position. Apparently the referees we got are less convinced that there is such an association:

    Interesting pulsar. However, the evidence of an associated diffuse hard X-ray source is questionable. In their list, Sugizaki et al. (2001) give a detection significance of 4.3 in the 2-10 keV band, when cutoff limit of their list is 4.0 and they mention to have found a large number of fake sources (>20%) also above that limit. Also from Fig.1 the presence of an extended source is unclear. A more detailed discussion of this would have been appreciated. The panel decides to assign this target with C priority.

    The proposal received a grade of "C", which means it is possible that it will be scheduled, but unlikely. For obvious reasons, this is very disappointing. This also brings me back to the issue of a J1856+0245 paper. I'm concerned that the discovery and timing of a single Vela-like pulsar without significant X-ray follow-up won't make for much of a paper. I plan to resubmit this proposal if we don't get scheduled.

    Timing new slow pulsars at Arecibo: David N

    Slow pulsar timing (P2177)

    Thirty-three PALFA slow pulsars are being successfully timed at Arecibo (project code P2177). The data spans range from one month, for the most recent discoveries, to ten months, for the pulsars that have been known the longest. Each data set is fully phase connected.

    In principle, TOAs could be derived from the original discovery and confirmation data for each pulsar. Because of data logistics issues, I have not attempted to do this.

    If you like residual plots, check out:

    http://www.naic.edu/~nice/palfa/resplot1.png

    http://www.naic.edu/~nice/palfa/resplot2.png

    http://www.naic.edu/~nice/palfa/resplot3.png

    http://www.naic.edu/~nice/palfa/resplot4.png

    Twenty-four of the pulsars have been observed long enough to decouple their period derivative from position. A p-pdot diagram is at:

    http://www.naic.edu/~nice/palfa/ppdot.png

    Histograms of age, magnetic field, p, and p-dot are at:

    http://www.naic.edu/~nice/palfa/stat.png

    The PALFA pulsars as a group are a little younger than the population as a whole (smaller P, larger P-dot), but not by very much.

    Around two dozen of the pulsars (the older discoveries) have been folded at 2, 3, and 5 times their nominal periods in order to check that we are observing the fundamental period. Doing this for the newer discoveries is on the "real soon now" list.

    One of the thirty-three pulsars is the relativistic binary 1906+0746. I have not done an in-depth analysis of this pulsar, as there are better data sets from special-purpose observations made by Ingrid and company.

    Update to the list of new PALFA pulsars

    As of this morning, there were thirty-nine pulsars listed on the "new pulsars" page of the PALFA website. Thirty-three are being timed under project P2177. What are (were) the other six pulsars on the list?

  • 1903+03, the first PALFA MSP, is being timed by McGill (with data taken during P2177 and at other times).
  • 1928+15 and 1937+22 are RRATs. See section 3, below.
  • 1901+0621 was discovered by the Parkes multibeam survey (astro-ph/0607640), but was inadvertently listed as a new discovery in PALFA paper I. I have removed it from the "new pulsars" page.
  • 1949+23 was listed twice as a newly discovered pulsar. I removed the second entry.
  • 1935+26 is an unconfirmed candidate with s/n=7.0, period of 5 seconds, and negligible dispersion. As far as I know, it has not been confirmed, and it does not seem to me to be a strong candidate. (I hope I am wrong.) I have removed it from the "new pulsars" list and put it into a new list called "candidates" at the bottom of the "new pulsars" webpage.

    In addition to making the above three deletions from the PALFA webpage, I also corrected three pulsars for which the listed periods were incorrect (high harmonics of the true periods).

    Note on RRATs

    PSRs 1928+15 and 1937+22 were discovered in single pulses and (as far as I know) have not been confirmed. (If they have been confirmed, could someone please add confirmation scans to the palfa webpage!) I have collected a lot of data on 1928+15 (/proj/p2177/*J1928+15*), but these data have not been examined in a systematic way. I don't even know if there are any visible pulses. I have not collected any data on 1937+22. Working through the 1928+15 data would be a great project for a volunteer single-pulse expert. I can supply detailed lists of scan numbers and python code for looping through all the data files. I'll try to get a couple pointings on 1937+22 in upcoming runs.

    Timing at Jodrell: Michael/Andrew

    Below is a list that gives you a feeling for the status of Lovell timing for all newly discovered ALFA pulsars. You will find a dash '-' for those pulsars that were discovered only very recently or that are too weak for us to see on a reasonable timescale.

    We list the number of days of observations, which is equal to the length of phase coherence for all but 1953+27. We are currently trying to solve it and should have sufficient data to do so shortly. If someone already has a solution that we are not aware of, please let us know!

    We also quote the derived age and magnetic field. If the timespan covered by the TOAs is still too small to trust the results, we put it in brackets. In total, we currently follow 18 pulsars, some of which are at the borderline of detection given the available integration time.
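For readers who want to cross-check the quoted ages and fields, the usual relations are the characteristic age tau = P / (2 Pdot) and the dipole estimate B = 3.2e19 sqrt(P Pdot) gauss. The sketch below is only an illustration: it assumes the quoted age is the characteristic age, takes the 1856+02 period from the polarimetry table later in these notes, and uses hypothetical function names.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400.0


def pdot_from_age(period_s, age_yr):
    """Invert the characteristic age tau = P / (2 * Pdot)."""
    return period_s / (2.0 * age_yr * SECONDS_PER_YEAR)


def surface_field_gauss(period_s, pdot):
    """Standard dipole estimate B = 3.2e19 * sqrt(P * Pdot) gauss."""
    return 3.2e19 * math.sqrt(period_s * pdot)


# 1856+02: period from the polarimetry table, age (21 ky) from the list below.
period = 0.08090854
pdot = pdot_from_age(period, 21e3)
field_tg = surface_field_gauss(period, pdot) / 1e12
print(f"Pdot ~ {pdot:.2e}, B ~ {field_tg:.2f} TG")
```

The implied field comes out close to the 2.3 TG listed for 1856+02, which suggests the two columns are mutually consistent under these assumptions.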

    What is not clear to us is which pulsars are exactly timed at Arecibo and at other places. It seems unreasonable to spend observing time at different telescopes on the same sources, so we suggest a distribution of the sources among the telescopes. That would also make it clear which telescope is responsible for which pulsars, avoiding unpleasant surprises that some pulsars may not be timed at all, and in particular, saving valuable observing time.

    The coordination should clearly be done by the timing/follow-up committees, which haven't been very active as such since.

    0540+32 560 19My 0.5TG
    0628+09 -
    1850+04 -
    1853+03 -
    1855+02 (480 61Myr 0.1TG) not all detections seem to be real
    1856+02 235days (21ky 2.3TG)
    1858+03 detected?
    1903+03 -
    1904+07 550 8My 0.3TG
    1905+08 -
    1906+0746 600 112ky 1.8TG Erratic last 100days. Profile changes.
    1909+07 -
    1916+12 some clear detections, many non-detections.
    1919+13 470 2.6Myr 1.5TG
    1921+08 -
    1924+16 -
    1928+1746 550 85ky 1.0TG
    1934+24 -
    1935+26 -
    1937+20 -
    1940+23 -
    1941+25 100days
    1946+25 480 1.5Myr 1.7TG
    1948+25 230days (344ky 1.3TG)
    1949+23 100 (36Myr 0.9TG)
    1953+27 yet to be solved!
    2005+35 -
    2006+31 480days 104ky 2.1TG
    2007+31 230days (619ky 3.1TG)
    2009+33 -
    2010+28 480 100My 0.2TG
    2010+32 -
    2011+33 -
    2013+31 -
    2018+34 540 3.3My 0.9TG

    Is Westerbork doing any followup?: Ben

    Polarimetry: Ramesh, Han

    Observing Strategy

    This has been discussed and summarised in multiple emails before. In short, we also record data on a good polarimetry calibrator such as B1929+10 (IG) or B0611+22 (AC) as part of p2177 runs. A short cal scan (switching at 25 Hz) is recorded after every pulsar scan. In addition, once every few days, we also record data on a well-known flux calibrator such as B1939+103 (IG) or B0640+233 (AC) in an on/off N-S-E-W sequence to enable flux calibration of the data. As David mentioned in earlier emails, these procedures are now nicely automated in CIMA, and the processing software sorts the various scans based on their filename tags and the position information extracted from the header.

    Processing Pipeline

    The polarimetry reduction code "wappcorpol" that was originally developed (by Jim et al) for the early WAPP data was tailored to the p2177 data format and heavily automated so that the entire reduction can be done in just a few steps. The processing scripts expect 2-3 command-line args. I will put together some "howto" documentation soon so that anyone can process the data on the AO machines. In brief, the raw correlations are first folded in the correlation domain and then converted to Stokes parameters, producing plots of the profiles and the PA swing. The code also fits for RM (on the PA sweep across the channels) as part of the standard reduction. At the moment, no correction is applied for the instrumental cross-couplings (important for absolute polarimetry), and no RFI editing features are available.
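To illustrate what fitting RM on the PA sweep across the channels involves, here is a minimal sketch of a straight-line fit of position angle against wavelength squared, PA = PA0 + RM * lambda^2. It is not the wappcorpol implementation. The function name `fit_rm`, the synthetic band (1350 to 1450 MHz, 64 channels) and the injected RM are all assumptions made for the example.

```python
import numpy as np

C_M_PER_S = 299792458.0


def fit_rm(freq_mhz, pa_rad):
    """Least-squares fit of PA = PA0 + RM * lambda^2; returns (RM, PA0).

    freq_mhz must be sorted so that neighbouring channels are adjacent; PA is
    only defined modulo pi, so it is unwrapped (via the doubled angle) first.
    """
    lam2 = (C_M_PER_S / (np.asarray(freq_mhz) * 1e6)) ** 2
    pa = np.unwrap(2.0 * np.asarray(pa_rad)) / 2.0
    rm, pa0 = np.polyfit(lam2, pa, 1)
    return rm, pa0


# Synthetic check with a made-up band and an injected RM of 305 rad/m^2,
# with PA wrapped into [0, pi) as a real measurement would be.
freqs = np.linspace(1350.0, 1450.0, 64)
lam2 = (C_M_PER_S / (freqs * 1e6)) ** 2
pa_obs = np.mod(0.3 + 305.0 * lam2, np.pi)
rm_fit, pa0_fit = fit_rm(freqs, pa_obs)
print(f"recovered RM = {rm_fit:.1f} rad/m^2")
```

The unwrapping step only works when the PA change between neighbouring channels is well below pi/2, so very large RMs or coarse channels would need a different approach (for example a search over trial RMs). The recovered PA0 is only meaningful modulo pi.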

    Status of Processing

    I have processed a fair chunk of data taken between March and Sept 06 and the attached table (and the example plots) gives a quick summary of the analysis so far. Thus one of the immediate action items would be to bring the processing up-to-date. The processing is currently done on the AO machines, and is much easier (and faster) while the data files are on the /share/wapp disks (so ideally it's desirable to process the data within a few days of obs). Once the data are moved to /proj/p2177 area, it becomes a bit tedious (and slower), but this can be remedied by suitably altering the processing scripts (todo).

    Summary of the Analysis

    The appended table lists the data sets processed/examined so far, with some quick notes on the level of polarisation and some first-order estimates of RMs and fluxes. I have also posted some example plots (profiles and RM fits) at http://www.naic.edu/~rbhat/palfa. Some pulsars (those with empty fields) were processed with incorrect (or old) par files, but this can now be remedied by reprocessing them using the latest and greatest par files that David has posted on the web (todo). The s/n wasn't good enough in some cases, and some pulsars appear virtually unpolarised. However, this assessment is largely based on single 5-minute scans, so the s/n can potentially be improved in several cases by combining the data from multiple scans or epochs. Thus some of the related action items are: a) bring the processing up-to-date, b) reprocess the pulsars that had par file problems/errors, c) implement some RFI editing, d) combine the data from multiple scans or epochs to improve the s/n, and e) correct for cross-coupling.

    Notes from the preliminary analysis (IG pulsars)

    PSR         P(s)        DM (pc/cc)  polarised?  RM    Spk  Sav
    J1855+02    0.41587388  504.0000
    J1856+02    0.08090854  622.3060    ?           164   6    0.5
    J1858+03    0.25686528  387.1450
    J1903+03
    J1904+07
    J1905+08
    J1906+0746  0.14408413  217.5000
    J1909+07    0.23716867  538.9070    weakly      ?206  5    0.1
    J1916+12    0.22740318  265.1480    highly      797   8    0.1
    J1919+13
    J1921+08
    J1928+15
    J1928+1746
    J1937+20    0.68712453  327.9740
    J1940+23    0.54686203  252.3430    highly      305
    J1941+25    2.30626875  314.4030    medium      ?70   11   0.3
    J1946+25    0.51519485  248.6030    medium      83    25   1
    J1948+25    0.19663673  289.1120    medium      29    12   0.7
    J1949+23    1.31944905  196.4680    weakly      n/a
    J1953+27    1.33403330  194.1670    hardly      ?73   4    0.2
    J2006+31    0.16370087  107.0480    highly      ?66   13   0.2
    J2007+31    0.60823083  191.2580                      4    0.1
    J2009+33    1.43842707  263.6700    weakly      n/a   4    0.1
    J2010+28    0.56539574  112.3970    medium      n/a   14   0.6
    J2018+34    0.38767853  222.1640    medium      n/a

    Other observations

    Han: We (I with Bryan Jacoby and Willem van Straten) have an accepted GBT proposal (Group A) for polarimetry and rotation measures of pulsars. We can probably spend some time on some bright ALFA pulsars for our purposes, if you do not object. Moreover, Ingrid and Don agreed to use their backend to take data at the same time. If we do observe ALFA pulsars, we will release all the data (I suppose they can be used for timing or other purposes).

    Other follow-up (e.g., X-ray): ???

    Joe Lazio: One of the other kinds of followup is astrometry. That's largely been Shami and me, I think. However, it is very much "need driven," and I haven't seen many comments about pulsars needing astrometric followup so far.

    Full-resolution data pipeline

    Status of code: Scott

    Executive Summary

    The "PRESTO" pipeline is mostly working and is (or has been) processing data at McGill (mostly by David Champion, Patrick Lazarus, and Jason Hessels) and a little bit at UBC (mostly by Joeri van Leeuwen).

    The "mostly" comes from the fact that currently the pipeline starts from raw data and gets to the point of generating a list of filtered candidates and "standard" PRESTO prepfold plots (and Maura/Jim-like single-pulse plots as well). What doesn't yet happen is getting those candidates and their plots into a consortium-accessible database. This means that we currently have no way for consortium members to easily and rapidly view candidates.

    This is a very big problem. We need to solve it quickly. I've written fairly advanced filtering algorithms to reduce the number of candidates from each pointing from several hundred to several tens, but we need real eyes to look at those several tens. I do not think that VNC+ghostview is a viable solution. We need to be able to look at ~10 candidates at a quick glance -- much like in the quicklook processing, except where the candidates are much better in quality (due to the quite extensive RFI excision that we do).

    Background on the pipeline

    I wrote the core pipeline in Python. It uses "numpy", "scipy" (see www.scipy.org for information on these _excellent_ libraries) and many of the Python routines in PRESTO itself. David, Jason, and Patrick wrote several scripts that handle "automated" starting of the processing using the cluster batch-system OpenPBS (or Torque as it is now called).

    Basically what happens is that the scripts push a raw data file (a SIGPROC-style .fil file for a single beam) and the pipeline code to a single processor and start the pipeline. The pipeline calls standard PRESTO routines for the processing. A single anti-center beam takes ~12 hours to process on a 2GHz Opteron and a single inner-Galaxy pointing takes ~24 hours to process. The steps and the approximate percentage of processing time are as follows:

    rfifind (time-freq mask of raw data): <1%
    subbanding (initial dedisp stage): 3%
    dedispersing: 1%
    single-pulse search: 6%
    FFTing: 1%
    no-acceleration search (1,2,4,8,16 harms): 14%
    mid-acceleration search (1,2,4,8 harms): 71%
    candidate sifting/filtering: <1%
    folding of subbands: 2%

    As you can see, we are completely dominated by the acceleration searching, and therefore, it doesn't make sense to optimize any of the other parts of the code for speed. (In reality, we are dominated by the pre- and post-processing -- i.e. getting people to feed data to the clusters and getting people to look at candidates).

    The "mid-acceleration" search will search for candidates where the _highest_ harmonic drifts by up to 50 Fourier bins. (See Ransom, Eikenberry, and Middleditch, 2002, for a description of the Fourier-domain acceleration search technique)
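To give a feel for what a 50-bin drift limit means physically, the Fourier-domain relation z = a f T^2 / c (drift in bins for line-of-sight acceleration a, signal frequency f and observation length T) can be inverted for the largest acceleration covered. The sketch below is a back-of-the-envelope aid, not part of the pipeline. The 5 ms spin period and, in particular, the 268 s integration time are assumptions, since the integration time is not stated in these notes.

```python
C_M_PER_S = 299792458.0


def max_acceleration(zmax, spin_freq_hz, obs_time_s, n_harm=1):
    """Largest line-of-sight acceleration (m/s^2) kept within zmax bins.

    Uses z = a * (n_harm * f) * T^2 / c for the highest harmonic summed.
    """
    return zmax * C_M_PER_S / (n_harm * spin_freq_hz * obs_time_s ** 2)


# Illustrative numbers only: a 5 ms pulsar, 8 harmonics, and an assumed
# 268 s integration (the integration time is not stated in these notes).
a_max = max_acceleration(zmax=50, spin_freq_hz=200.0, obs_time_s=268.0, n_harm=8)
print(f"a_max ~ {a_max:.0f} m/s^2")
```

Because the limit applies to the highest harmonic, slower pulsars and shorter integrations tolerate much larger accelerations; the limit scales as 1 / (f T^2).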

    I should note that the pipeline is essentially a specialized version of what we've been doing for the globular cluster searches with Arecibo and GBT (where we've found almost 70 MSPs). In general, things are quite well tested (thanks to extensive input from Jason, Ingrid, Fernando, and others).

    RFI Excision

    Since Jim briefly mentioned how the RFI excision works (or should work), I'll briefly go over how I have it working in the pipeline now.

  • rfifind generates a time/frequency mask from the raw data. This is done on a per channel basis every ~2.1 sec in time. In general, the mask suggests that <10% of the data be thrown away. This stage removes short transient bursts and strong narrow-band RFI.

  • During de-dispersion (actually, subbanding), the DM=0 timeseries is examined for large transient pulses. These values in each of the channels are zeroed out. This removes strong terrestrial impulsive signals.

  • After FFTing, a series of known birdies are zapped in the power spectra. The current list is:
    Freq (Hz) Width (Hz) Numharm Increase_Width
    0.07618684 0.003 90 0
    0.08317989 0.004 120 0
    1.2 0.002 6 0
    60 0.1 5 1
    (Increase_Width is a flag that means that each harmonic gets progressively wider) This stage removes broadband periodic signals.

  • During the initial candidate filtering stage, candidates that peak in S/N at DM=0.0 are removed.

  • Finally, once we get a proper candidate database up and running, comparing various beams from the same pointing and/or various times from the same beams will allow us to eliminate candidates from the viewing process. This is the only part of the excision process that is not yet working.
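The birdie-zapping stage in the list above can be sketched as follows. This is not PRESTO's own zapping code: the table rows are copied from the current list, but the rule that a flagged birdie's width grows in proportion to the harmonic number, the choice to zero the zapped bins, and the helper names `zap_intervals` and `zap` are assumptions made for illustration.

```python
# (freq_hz, width_hz, num_harm, increase_width) rows copied from the list above.
BIRDIES = [
    (0.07618684, 0.003, 90, 0),
    (0.08317989, 0.004, 120, 0),
    (1.2, 0.002, 6, 0),
    (60.0, 0.1, 5, 1),
]


def zap_intervals(birdies):
    """Expand each birdie into (low_hz, high_hz) intervals, one per harmonic."""
    intervals = []
    for freq, width, num_harm, increase_width in birdies:
        for harm in range(1, num_harm + 1):
            # Assumption: "progressively wider" means width scales with harmonic number.
            w = width * harm if increase_width else width
            centre = freq * harm
            intervals.append((centre - 0.5 * w, centre + 0.5 * w))
    return intervals


def zap(freqs_hz, powers, intervals, fill=0.0):
    """Return a copy of powers with bins inside any interval set to fill."""
    cleaned = list(powers)
    for i, f in enumerate(freqs_hz):
        if any(lo <= f <= hi for lo, hi in intervals):
            cleaned[i] = fill
    return cleaned


if __name__ == "__main__":
    intervals = zap_intervals(BIRDIES)
    print(len(intervals), "zap intervals")
    print(zap([59.0, 60.0, 120.05, 130.0], [1.0, 1.0, 1.0, 1.0], intervals))
```

A production version would more likely replace zapped bins with a local noise estimate than with zeros, and would work on bin indices instead of looping over every frequency.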

    Problems and To-Do list:

    Currently over a thousand (at least) beams have been processed by the pipeline. Joeri reported that many of the beams came back with no candidates (which means that there is a problem somewhere). And as for the McGill processing, there are candidates, but I have no idea how many there are or what they look like. Hopefully David or Patrick can fill us in on that.

    The immediate to-do list includes (in priority order):

    1. Checking the quality of the candidates that have already been processed. This will demand fixing the error that plagued the UBC processing, and likely tweaking the candidate filtering algorithms and/or the birdie list(s).

    2. Making a _simple_ and _fast_ way to view all (or at least the top 'N') candidates from the pipeline for a beam. I think a relatively easy way to do this would be to convert the prepfold files into low-res .png images (maybe 640x480 or so) and dynamically load them into a web-based tool. The prepfold files have most all of the diagnostics that you need to decide if you have a "good" candidate or not. This way we can view all of the filtered candidates from the pipeline, which I think is important in the early stages for "calibrating" our eyes. Later on, we can concentrate on "Reaper-style" viewing. It would be good if the web-tool would:

  • display the top 'N' candidates from a beam
  • note which candidates have been viewed in a database (and by whom)
  • allow you to sort by period, S/N, and DM
  • allow you to compare with the other beams from the same pointing (perhaps at first simply by showing a text-based candidate list sorted by period/freq for all the beams)
  • show the overall single-pulse plot

    So, in general, this would be very similar to the quicklook tool, except with the standard prepfold output files in a low-res form (clicking on one could bring up a higher res version).
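As a rough sketch of the selection logic such a web tool would need (top 'N' by S/N, re-sorting by period, S/N, or DM, and a record of who has viewed what), assuming candidates are plain dicts with hypothetical keys `id`, `period`, `snr`, and `dm`:

```python
def top_candidates(candidates, n=10, sort_key="snr", viewed=None):
    """Return the n highest-S/N candidates, reordered by sort_key.

    candidates: list of dicts with hypothetical keys "id", "period", "snr", "dm".
    viewed: optional dict mapping candidate id -> name of the person who
            looked at it (a stand-in for a real database table).
    """
    viewed = viewed or {}
    # Select the top n by S/N first, then apply the requested ordering.
    best = sorted(candidates, key=lambda c: c["snr"], reverse=True)[:n]
    # S/N is listed strongest first; period and DM are listed ascending.
    best.sort(key=lambda c: c[sort_key], reverse=(sort_key == "snr"))
    return [dict(c, viewed_by=viewed.get(c["id"])) for c in best]


# Made-up candidates for one beam.
cands = [
    {"id": 1, "period": 1.24, "snr": 15.0, "dm": 88.0},
    {"id": 2, "period": 0.405, "snr": 7.0, "dm": 240.0},
    {"id": 3, "period": 2.6, "snr": 9.0, "dm": 340.0},
]
for row in top_candidates(cands, n=2, sort_key="period", viewed={1: "scott"}):
    print(row)
```

The images themselves and the cross-beam comparison are not covered here; this only shows the ordering and bookkeeping.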

    3. Getting the pipeline up and running at other sites. Those that I currently know about with lots of CPUs are WVU and Cornell. NRAO and Columbia can help with limited processing as well (and maybe Arecibo also).

    I will be going to McGill in mid Feb for a couple days to work on some of the above issues. Volunteers to help in any of the above (and/or simply looking through candidates) would be greatly appreciated...

    Processing at McGill: David C

    Following a major upgrade to the PRESTO pipeline we updated the support scripts. The latest version can be found here

    http://www.physics.mcgill.ca/~champion/scripts070115.tar.gz

    These scripts now collect all the information required for the CTC database. Due to these major changes we have been reprocessing data we already have. Thus far 1085 beams have been processed using the latest version. Several cluster related issues have slowed our progress but processing is back in full swing.

    We have 14 new drivers to add to the data transport system.

    Processing at Cornell: Jim

    Full resolution processing has been going on at Cornell since the beginning of 2005, though with significant tweaking of the code. It runs both the periodicity search and, as Maura pointed out in her summary of single pulse searches, two kinds of single-pulse analysis (one a quasi-matched filter approach, the other a friends-of-friends approach). Data products go into a SQL Server database and are accessible through a GUI. We have set up a web site at the CTC for the PALFA project that has general information, links to the AO site, and proprietary information (e.g. the processing results). The proprietary results are password protected. Anyone in the consortium desiring access can email Manuel Calimlim to get access.

    One of our regular data products is a pair of seven-beam plots (similar to the one in the first PALFA paper). These range from the ridiculous (intense RFI) to the sublime (very beautiful cases of a pulsar appearing in one of the beams, with clean, noise-like events in the other beams, or fairly awful RFI in all the beams with a clear pulsar signal in one of them).

    We have processed data for 3257 different pointing directions, including many multiple pointings on known pulsars. I'll have to get numbers on how many pointings have been transferred to the CTC from AO to compare with what has been processed.

    Status of the Cornell Code:

    The Cornell code is a collection of programs run with python scripts. It uses sigproc/filterbank, decimate, reader, header at the frontend and a collection of C and Fortran programs going further downstream. The code was originally written in a linux environment, then was ported to the 64-bit Windows Unisys cluster at the CTC, with considerable changes made. We are now porting the code back to linux, which is mostly straightforward. We have not to date made this code available publicly, but after the re-port back to linux, we will be happy to make it available to other PALFA Consortium members.

    Future Processing at Cornell:

    We will continue to process on the Windows/Unisys cluster. With the re-port of the code back to linux, we will process also on a new linux cluster (dual-core dual processor opterons; we're dedicating 7 nodes to this). The plan is to import data into the linux cluster via transfer from the CTC's tape archive of the raw data.

    Further Evolution of the Code:

    It is clear that we need to do a better job at RFI excision. Our group here (Julia Deneva, Adam Brazier, Chunglee Kim, John Zollweg, Manuel Calimlim, and myself) looked at this in an end-to-end way and identified several places in the pipeline where improvements are needed (and acknowledge that others have thought of this or have developed code too):

  • at the post-filterbank, pre-dedispersion stage: masking strong RFI in the f-t plane using a mask constructed from all 7 beams of a pointing; currently we construct the mask using f-t data with full frequency resolution but time resolution degraded to 1 sec. We will probably change this to 0.1 sec.
  • at the post-dedispersion stage: identifying strong events for the DM=0 channel as likely RFI events.
  • removing candidates from the periodicity candidate list that are known birdies that we put into a zap list
  • removing events from the single-pulse analysis, such as those that appear in more than 3 beams or that are from the 11-12 sec period radar.
  • removing periodicity candidates that make it through previous filters but which appear in too many independent sky positions (this is done at the database level)
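Two of the steps in the list above, the multi-beam cut on single-pulse events and the birdie zap list, can be sketched as below. This is not the Cornell code: the event and candidate layouts, the coincidence window `time_tol`, and the fractional tolerance `frac_tol` are all assumptions made for illustration.

```python
from collections import defaultdict


def remove_multibeam_events(events, max_beams=3, time_tol=0.01):
    """Drop single-pulse events seen in more than max_beams beams at once.

    events: list of (time_s, beam, snr) tuples for one pointing.
    time_tol: width in seconds of the coincidence bins (an assumed value).
    """
    beams_in_bin = defaultdict(set)
    for time_s, beam, _ in events:
        beams_in_bin[round(time_s / time_tol)].add(beam)
    return [
        ev for ev in events
        if len(beams_in_bin[round(ev[0] / time_tol)]) <= max_beams
    ]


def remove_birdies(candidates, zap_list, frac_tol=1e-3):
    """Drop periodicity candidates whose frequency matches a known birdie.

    candidates: list of (freq_hz, snr) tuples; zap_list: birdie frequencies in Hz.
    """
    return [
        (freq, snr) for freq, snr in candidates
        if not any(abs(freq - b) <= frac_tol * b for b in zap_list)
    ]
```

Fixed time bins are the simplest possible coincidence test; two events that straddle a bin edge are not matched, so a real implementation would want a sliding window.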

    Implementing these steps in the Cornell code and through database tools is ongoing. We expect that RFI-excision algorithms will be a continual development effort and that code developed can be used in any of the codes used to process data.

    Database, uploads, candidate selection, and a face to face meeting:

    The db at the CTC includes all the data products from our processing, along with header information for the original data files.

    The db will eventually allow network access to raw data, but network bandwidth and transport costs have not been dealt with yet to allow this. Currently the CTC is on a 10GB campus backbone and the CTC can access the National Lambda Rail. But things are evolving slowly; we are "almost" connected in the Space Sciences Building (where Julia, Chunglee, Adam and I sit) to the 10GBE backbone, but not quite.

    In July 2006 we had a small meeting to discuss how data products from the presto code could be imported to the CTC and inserted into the SQL Server db. We have not settled on this yet, but it is a matter of getting agreement on method (taking into account security issues at the CTC, ease of use, etc.). This is not a major thing, but we certainly need a test case (McGill, UBC?) for uploading data products. From a security point of view at the CTC, it is preferable to upload to a public area (e.g. upload a MySQL db or upload flat ascii files) and then run code locally that inserts the results into the SQL Server db. I would like to call a meeting as soon as is feasible (March, April 2007) to settle this. At the same meeting, we can also discuss a point that Ingrid raised at the AAS meeting: that we need to discuss and choose protocols for candidate selection.

    Tools for filtering and visualizing database results:

    Manuel Calimlim has set up a GUI so that routine SQL commands can be run on the db.

    Adam Brazier, working with Chris Pelkie at the CTC, has helped implement a visualization tool for periodicity candidates that is based on OpenDX (www.opendx.org). Right now it can access data from the db in a dynamic way. Adam can describe this in better detail (though he is distracted right now getting our linux cluster up and running). The intention is to make the OpenDX code available. Also, we will develop a version for the single-pulse candidates.

    Processing elsewhere:

    Dunc: At WVU, we now have a 32-node cluster, a disk enclosure and some PALFA data. Dave Champion is installing the PRESTO-based scripts and we hope to get started with some processing very soon.

    Froney: At F&M a new cluster is being bought this spring and will be available to process PALFA data.

    Data transport

    Data transport from AO to CTC and beyond:

    Data are transported from AO to the CTC via IDE disks (serial ATA) in transport trays from Granite Digital. They are read in and archived onto the TSM tape archive system. The disk drives mostly have then been shipped back to AO, though recently, with the ramp up of processing at other sites, the disks have been sent to those sites. Naturally, the disks need to be sent back to AO as soon as possible.

    Data integrity has been monitored by looking at file lengths as a simplistic method and more rigorously by using MD5 checksums, which are calculated at several stages in the data transport.
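For sites setting up their own checks, a minimal sketch of an MD5 comparison using Python's standard `hashlib` module is shown below. The function names are illustrative and are not the scripts used at AO or the CTC.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so multi-GB raw data files never sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(path, expected_md5):
    """True if the file at path still matches the checksum recorded earlier."""
    return md5_of_file(path) == expected_md5
```

The checksum recorded at one stage of the transport is passed as `expected_md5` at the next stage; a mismatch means the copy should be redone before the source disk is released.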

    Bottlenecks: data shipping was slowed down by the requirement that we keep a valid copy at AO at least until we knew the data had been archived and validated at the CTC. We had a long series of problems with the Granite Digital jbod boxes that appear to have been due largely to differences in firewire interfaces. These problems are now resolved. Arun for a while had to use many of the portable disks as the AO copy of the data, which impacted shipping of data. That has now been resolved by Arun's acquisition of LTO3 tape drives that he will use to make the AO copy, thus freeing up many of the disks. Also, we purchased another set of 10x750GB drives. Additional contributions to the drive pool are needed; it is best to contact Arun directly to ask him as to how many, what size disk, and when to provide them.

    Adam Brazier has developed a tracking db that he has just emailed us about; it is not yet done but soon will be. Earlier, we created flat files that are directories of disks that have been shipped.

    Shipping of data to new processing sites:

    The easiest scheme is, as above, to ship to you the disks received at the CTC after they are archived. Cornell/NAIC will pay the shipping cost to you; you are responsible for shipping the disks back to AO. Please let us know.

    Eventually, I would like to see network transport to those sites that are on the NLR or equivalent. To accomplish this, significant experimentation is needed, as Joeri can attest.

    Single-pulse searching/followup: Maura

    Here is my contribution to the virtual telecon on single-pulse searching and followup. I have divided this writeup into three sections. The first is on the quicklook candidates and followups, the second on the code for full-processing single-pulse searching and the third on fast-folding algorithms (not really single-pulse searching but related). All the single-pulse plots can be found here

    Quicklook candidates and followups

    So far, there are 7 single-pulse search discoveries listed as either candidates (at www.naic.edu/~palfa/candidates ) or new pulsars (at www.naic.edu/~palfa/newpulsars ). Julia tells me that there are some other Class C candidates that have not made it into these lists, but that they are not very convincing. Here is the list of detections, with discovery observation name, Jname, number of times the candidate has been reobserved (to my knowledge!), number of total pulses detected, approximate DM and period (if it can be calculated from the single pulses). There is also a note for each candidate.

    Discovery obs               Name      # reobs  Confirmed?  # pulses  DM (pc cm-3)  P (s)  Notes
    G202.12-00.82_53254_0112    J0628+09  lots     yes         lots      88            1.24   1
    G37.97-00.05.S._53771_0010  J1900+04  1        yes         3         340           2.6    2
    G50.64-00.97.N_53678_0025   J1928+15  10       no          2         240           0.405  3
    G57.86+00.49.C_53846_0081   J1937+22  0        no          2         90            0.165  4
    G49.88-00.28.S_53771_0037   J1924+14  0        no          1         600           ?      5
    G56.01+01.54.N_53468_0089   J1929+21  1        no          13        60            0.723  6
    G57.81+00.23.N_53469_0119   J1938+22  0        no          1         90            ?      7

    1) Being timed successfully by David N et al. Pdot of 5.7e-16 implies normalish B of 9e11 G and age of 34 Myr. David reports that about 40% of 30-s integrations give good TOAs. Presumably there are no pulses in the other 60%.

    2) This is still listed in the "candidates" list on the website and was even reobserved and confirmed. However, these pulses (1 in first epoch and 2 in second) are from known 2.6-s pulsar J1901+0413. This must be taken off the candidates list and added to the redetections list (someone please tell me protocol for doing this). Note that, interestingly, this pulsar does not show up in the periodicity search down to S/Ns of 5.

    3) Two pulses separated by 405 ms detected. Pulsar has been reobserved a number of times but only one reobs has been looked at (by Julia) with no re-detection. There are more reobservations that Froney and I and Froney's student are going to inspect shortly to see if we can detect more pulses.

    4) Four pulses detected at same DM in original observation, but this pointing has never been reobserved. The period derived from these four pulses is 165 ms. This is unfortunately J1938+22 from the DMB survey and should be added to the redetections list and taken off of the candidates list. This pulsar does show up as a candidate in the FFT search, but at a signal-to-noise of only 6.

    5) Just one very high DM pulse - never reobserved. Should be reobserved!

    6) 13 pulses detected with period calculated from SP search of 723 ms. This is unfortunately J1929+2125 from the DMB survey and added to redetections. This doesn't show up in the FFT search down to S/Ns of 5.

    7) One very bright pulse detected in beams 3 and 4. This is also pulsar J1938+22 from the DMB survey, however, and should be removed from the candidates list and added to redetections. This pulsar is not detected in either beam in the FFT search down to S/Ns of 5.
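The field strength and age quoted in note 1 follow from the standard magnetic-dipole spin-down estimates, B ≈ 3.2e19 sqrt(P Pdot) G and tau = P / (2 Pdot). A quick check, using the period from the table and the Pdot from note 1 (the function names are illustrative only):

```python
import math

SECONDS_PER_YEAR = 3.156e7


def surface_b_field_gauss(period_s, pdot):
    """Dipole surface field estimate, B ~ 3.2e19 * sqrt(P * Pdot) gauss."""
    return 3.2e19 * math.sqrt(period_s * pdot)


def characteristic_age_yr(period_s, pdot):
    """Characteristic age, tau = P / (2 * Pdot), converted to years."""
    return period_s / (2.0 * pdot) / SECONDS_PER_YEAR


# J0628+09: P = 1.24 s (table), Pdot = 5.7e-16 (note 1).
b_field = surface_b_field_gauss(1.24, 5.7e-16)
age_yr = characteristic_age_yr(1.24, 5.7e-16)
print(f"B ~ {b_field:.1e} G, age ~ {age_yr / 1e6:.0f} Myr")
```

This gives roughly 8.5e11 G and 34 Myr, consistent with the rounded values in note 1.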

    SO, in short we have one definite new pulsar from the sp searching (J0628+09). We have two good candidates (J1928+15 and J1924+14), one of which (J1928+15) has been reobserved many times and one of which (J1924+14) awaits confirmation. We have four objects which are listed as single-pulse candidates but upon further inspection are actually known pulsars.

    It is still not clear to me exactly how candidates get from the PHP viewer stage into the candidates list. And how things get moved around from candidates to redetections etc...seems we should have a check-in/out system and/or a database. And how future observers are to know when there is a good single-pulse candidate to follow-up on (aside from word of mouth). It seems we really need to improve coordination of this. (Or maybe everyone but me knows these things and they just need to be better publicized!)

    In any case, it is clear that we need to mark which single-pulse candidates to follow-up more carefully, and also remember that just because a single-pulse candidate doesn't show up in the FFT search does NOT mean that it is not a known pulsar. Some known pulsars may be too far off beam to detect their regular emission, but a bright pulse may be detectable. And, with these short integrations, some known pulsars with longish periods may just show up with higher S/Ns in the single-pulse search.

    Full-processing single-pulse search code

    The quicklook processing program uses the single-pulse search implemented in sigproc. This code is derived from the Cornell code and implements a matched filtering approach. It works well at detecting single-pulse candidates but returns many pulses for each real pulse. This can be advantageous for recognizing weak sources, but means that a further stage of filtering needs to be done before calculating periods or pulse amplitude distributions etc. I have programs to pick out the "best" pulse for each group of pulses and am happy to share these. I also have programs for calculating periods that I am happy to share. However, I don't think these types of programs need to be run as part of quicklook, but only if we find a source of single pulses.
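To illustrate the second filtering stage, here is a minimal sketch of picking the "best" pulse from each group of detections. It is not the program referred to above: the event layout and the clustering window `max_gap_s` are assumptions, and "best" is taken to mean highest S/N.

```python
def best_pulse_per_group(events, max_gap_s=0.05):
    """Keep the highest-S/N event from each cluster of events close in time.

    events: list of (time_s, dm, snr) tuples from the matched-filter search.
    max_gap_s: assumed clustering window; events separated by more than this
               are treated as belonging to different real pulses.
    """
    best = []
    group = []
    for event in sorted(events):  # tuples sort by time first
        if group and event[0] - group[-1][0] > max_gap_s:
            best.append(max(group, key=lambda e: e[2]))
            group = []
        group.append(event)
    if group:
        best.append(max(group, key=lambda e: e[2]))
    return best


# Made-up detections: two real pulses, each reported several times.
events = [
    (1.00, 88.0, 6.0), (1.01, 90.0, 9.0), (1.02, 92.0, 7.0),
    (2.24, 88.0, 8.0), (2.25, 86.0, 5.0),
]
print(best_pulse_per_group(events))
```

The surviving arrival times are what a period estimate or a pulse amplitude distribution would then be built from.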

    The quicklook plots are generally pretty clean (generally! there are of course some very obvious exceptions). I do have a program (very simple program that anyone could write) to remove pulses that are strongest at 0 DM from the plot. This would help to clean up noisy plots and I am planning on incorporating this algorithm in the plotpulses package that is part of sigproc. Note that currently the quicklook processing plots results using old .sm macros instead of the newer plotpulses package that is now part of sigproc. The plotpulses package is more versatile and easier to use and makes nicer plots and should be incorporated into the scripts at the next quicklook script overhaul.

    The Cornell people (i.e. Jim and Julia) are using two types of single-pulse code in their long processing: the old matched filtering and a newer friends of friends approach. The friends-of-friends approach is more sophisticated and results in the report of only one detection for each real pulse. Thus it eliminates the need for a second stage of filtering. However, I found that it is sometimes more difficult to recognize candidates using this approach. We really need to do a careful comparison of the two techniques.

    PRESTO also has a single-pulse search algorithm. My experience with PRESTO indicates that its results and output are very similar to the output from the matched filtering sigproc routine. I do not know the "innards" of how PRESTO single-pulse searching works, however. We should compare its algorithm and results with the two Cornell approaches and hopefully use all three to come up with a single-pulse search routine that we are all happy with.

    One thing we should be careful about is to make sure, right from the beginning, that we are sensitive to very broad pulses. The code used for our PKMB reprocessing was only optimized to detect pulses up to 32 ms in width and, indeed, the widest pulses we detected were of about that width. We should try and go higher with the ALFA code.
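The width limit comes from the set of boxcar widths a matched-filter search tries. A simplified sketch of such a search (not the sigproc, Cornell, or PRESTO implementation) is given below; it assumes the dedispersed time series has already been normalized to zero mean and unit variance, so that a sum over w samples has variance w.

```python
from itertools import accumulate
import math


def boxcar_search(series, widths):
    """Return (best_snr, best_width, start_index) over the trial boxcar widths.

    series: dedispersed time series, assumed normalized to zero mean and
            unit variance.
    widths: trial boxcar widths in samples.
    """
    csum = [0.0] + list(accumulate(series))  # csum[i] = sum of series[:i]
    best = (float("-inf"), None, None)
    for w in widths:
        if w > len(series):
            continue
        scale = math.sqrt(w)
        for i in range(len(series) - w + 1):
            snr = (csum[i + w] - csum[i]) / scale
            if snr > best[0]:
                best = (snr, w, i)
    return best


# Noise-free toy series with one 64-sample-wide pulse of unit amplitude.
series = [0.0] * 1000
series[400:464] = [1.0] * 64
print(boxcar_search(series, [1, 2, 4, 8, 16, 32, 64, 128, 256]))
```

The pulse is recovered most strongly by the boxcar matching its width; if the list of trial widths stopped at 32 samples, the same pulse would come out at a lower S/N. Extending sensitivity to broader pulses amounts to extending `widths`, at the cost of more trials.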

    Fast-folding algorithm

    We should try and implement an FFA algorithm into the full processing. We have experimented with the algorithm of Peter Mueller, as implemented by Michael Kramer, and find that for long-period pulsars (period > 3 s or so) it does result in higher signal-to-noise detections. Our postdoc, Vlad Kondratiev, is working on an updated algorithm which is significantly faster. When this is complete, we should include this as a standard part of the full processing, using it to search periods from 3-12 s or so. Vlad's algorithm takes roughly the same amount of time as the FFT. Not a large investment of time for what could be some exciting discoveries.

    Report on the AUSAC meeting

    Scheduling logistics (David N)

    The observatory would prefer to allocate PALFA time (and other consortium allocations) as one big program and let us decide how to divide the time up between initial survey, follow-up, etc. This seems reasonable. The details have not been worked out, e.g., do all observations go under one project code, or can we retain multiple codes? It is awfully convenient for us to have things split on the schedule, and in cima, between several codes: p2030, p2177, p2180, etc.

    One surprise that came out of conversations during the meeting was that the observatory did not recognize that the 1903+03 proposal was a consortium proposal!

    Sceptical review of P2030 (Scott)

    One of the things that came up at the Arecibo Users meeting last week that David and I attended was the following sceptical review of the last survey proposal. David and I were very surprised to see this as we didn't even know that it existed! (note the date on it)


    Arecibo Proposal P2030: An ALFA Survey of the Galactic Plane Request for Continuation

    Skeptical Review Panel Summary

    September 14, 2006

    On July 14, 2006 the Skeptical Review Panel met via teleconference to consider the request for a continuation of observations proposal P2030. The Skeptical Review Panel recommendations are advisory to NAIC.

    The Skeptical Review Panel is impressed by the effectiveness of the multi-institutional, cooperative, organizational structure of the PALFA survey. However, the panel notes the following with concern:

    1. Data Processing Pipeline The PALFA consortium is struggling to bring their data processing pipeline up to the capabilities needed to keep up with the data input to the pipeline. It is clear that search algorithm development is much slower than the Panel had originally hoped it would be.

    2. Pulsar Discovery Rate The Panel is disappointed in the rate at which new pulsars are being discovered. Much of the reason for this may be attributable to the immature search algorithms. Certainly the fact that essentially all of the pulsars discovered to date were discovered with the relatively primitive on-line search and display system reinforces this conclusion.

    3. Software Demonstration The Panel recommends that the award of further observing time be predicated on a demonstration that a standardized off-line search algorithm procedure is working with the capabilities needed to keep up with the survey data taking. This demonstration must include detection of a significant number of pulsars in PALFA archival data that were not found by the on-line system (otherwise it is pointless to invest any effort whatsoever in off-line analysis). A member of the Panel remarked that, with the new spectrometer arriving in a few months, work on search algorithms presently seems to be of higher priority for the PALFA consortium than is additional search time on the telescope.

    4. Timing of Newly Discovered Pulsars The Panel recommends that NAIC support programs to time the interesting pulsars discovered in the survey. In fact, some members of the Panel felt that until the new spectrometer is commissioned, follow-up on pulsars discovered from the data already acquired is of greater importance than obtaining additional 100-MHz bandwidth data with the WAPPs.


    Anyways, there are a couple important things about it:

    1. I personally think that they are right on target. I'm happy to accept a big chunk of responsibility for our slowness in getting the pipeline running. Hopefully this will act as a much-needed kick-in-the-pants for the rest of you as it already has for me. Bottom line: processing the data properly is a _lot_ of work! We need more (and high quality) help!

    2. The proposal that generated this was the first of our "annual" renewal proposals. The "annual" is in quotes because we are already 6+ months overdue for another proposal if they really are annual. Bottom line is that we are going to need to produce some kind of report to NAIC soonish, and we had better have some solid results to show.

    3. With regards to point #4, I think that it is likely that we will be given permission to start using our p2030 time to time pulsars that we feel are important.

    Anything else???

    Jim: There is a crunch on data disks for shipping raw data around. Some, I think are hung up at processing sites other than Cornell (UBC, McGill?). So I just bought 10 750 GB disks and had them shipped to AO along with disk trays to aid the throughput on upcoming runs. We do need to satisfy the continuity equation by equalizing re-shipping of disks with received ones!