
NOTIFICATIONS: READ THIS FIRST

  • This wiki is a guide for manual data reduction (calibration and imaging). To perform pipeline-assisted imaging, please refer to the pipeline-assisted imaging wiki.
  • If you are already familiar with the process, you can use the condensed checklist here
  • Unless told otherwise by your supervisor, your data reduction assignment is your primary functional responsibility and should be completed as quickly as possible. Please let the DRM know if you foresee difficulties in completing your assignment, so that the dataset can be re-assigned if necessary. Data reducers are not allowed to reduce data from projects where they are a PI or co-I; if you are assigned data from such a project, please let your DRM know.
    • Arielle Moullet is the data reduction manager (DRM), Jen Donovan Meyer and Mark Lacy are deputy data reduction managers, and Catarina Ubach is the Head Data Analyst. They should be your primary sources of advice if questions come up. Crystal Brogan and Todd Hunter are our data reduction specialists in Charlottesville and have also agreed to be available for questions.

Preparing to Start

When the data has been placed into your Lustre area, the data analysts (generally Dongchan for the science staff; the DAs will stage their own data) will notify you through a comment on the project's SCOPS data reduction ticket. Communication with the DRMs is done through that ticket.
  • Each project has a "Data Reduction" SCOPS JIRA (DR) ticket and a "Data Taking" SCOPS JIRA (P2G) ticket (linked from the top of the DR ticket). The DR ticket is where you will communicate with the DRMs about progress/problems associated with the data reduction. Ask your supervisor to gain access to the JIRA system if you don't already have it.
  • Your assignment will be given to you via a comment (which you should get by email) on the JIRA SCOPS data reduction (DR) ticket of the project. The comment will include the project code and the MOUS name (MOUS = section of the project to be individually processed).
  • The ALMA Project Tracker contains details about the corresponding observations (weather, technical issues). Use the "project" search tool to find your data set, then click on the project code to select it, and then click on the "Project Report" button to generate the PDF summary of your project (requires special privileges).
  • The detailed description of the observational setup of your MOUS can be accessed through the ALMA Observing Tool (OT). Use the Cycle 2 OT for a Cycle 1/2 project. If you have user privileges, you can open the full description of the project directly in the OT by searching for the project code in the project archive. Otherwise, you can download the latest .aot file attached to the project's P2G ticket and open it with the OT. All the relevant information for the imaging products to produce will be found there; more on this below under Getting Started.
  • If you have been assigned a manual reduction that was originally run through the pipeline and failed ('pipeline fail'), it may be helpful to look at the CAS ticket reporting the bug, and the Weblog corresponding to the attempted pipeline calibration; ask the DRM for those if you did not get them.
  • Several tickets are used to report problems with the script generator. Make yourself a watcher on these tickets:
    • CSV-2809 to keep track of changes to the data reduction script, and obtain the latest Best Practices (note July 2015: latest Best Practice Manual not uploaded yet - only old versions available)
    • CSV-3013 to report bugs in the script generator
    • CSV-3014 to request improvements to the script generator
    • CSV-2903 to report bugs in the Cycle 1 (and Cycle 2) QA2 report generator

Computer-related setup:

Getting Started

Time to get started! In general, keep in mind as you are confronted with questions:
  • Documented issues come up all the time, and current tips and fixes, as we determine them, are located here: FAQ
  • Ask the DRMs and data reduction specialists
  • For questions that will be of general interest to data reducers, use the data reducers email list: send email to science_datared@alma.cl. Subscribe by going to: https://lists.alma.cl/mailman/listinfo/science_datared

Typically, you are assigned a member OUS (MOUS) which is considered to be completely observed (this may change depending on your quality assessment). One or multiple Execution Blocks (EBs) corresponding to the MOUS were observed, and each of them is recorded in an ASDM directory. First of all, you should familiarize yourself with the MOUS. From the name of your assigned MOUS, you can identify the corresponding Science Goal in the OT, and find the Science Goal's spatial and spectral setup (as well as the requested rms noise and resolution, which will become important later during the imaging step). The latter must be taken from the Control and Performance OT tab, not the Technical Justification tab.
In the Field Setup of the OT, the tabs at the top of the window (below the Spectral, Spatial, and Field Setup tabs) each correspond to one independent source (to be imaged on its own). Each source may be a mosaic or a single pointing. In the Field Center Coordinates section, look at the number of pointings: any number greater than one indicates a mosaic setup. Mosaic fields of a given source must be imaged together.

Several MOUS may belong to a single Science Goal (usually for Science Goals requiring observations in multiple arrays): identify whether that is the case, and in particular whether your assigned MOUS is a compact MOUS (7m array or 12m compact array) of a project that also includes a more extended MOUS.
Look at the table at the top of the SCOPS P2G ticket. If several MOUS are listed with the same trunk name as your assigned MOUS but different endings (typically _TE, _TC, _7m), then several MOUS belong to the same Science Goal. An MOUS name ending in _TC corresponds to a 12m compact MOUS, and one ending in _7m to an ACA MOUS.

Now set up your directories:
  • Go to your Lustre area: cd /lustre/naasc/<your username>
  • The data you have been assigned to reduce, i.e., one ASDM directory for each execution block (EB) of the assigned MOUS, should be there.
  • The data packaging script expects a certain directory structure, which you will need to create. Here is an example of the directory structure and final contents (in your /lustre/naasc/<your username> area); a sketch for creating this structure in one go follows the list.
    • ./uid___A001_X13e_X1fe (where uid___A001_X13e_X1fe is the MOUS UID, with the '/' and ':' replaced by '_'; e.g., uid://A001/X13e/X1fe --> uid___A001_X13e_X1fe). Note: the MOUS UID can be found in the SCOPS data reduction ticket comment announcing your assignment.
    • ./uid___A001_X13e_X1fe/Calibration_X001 (where X001 is the final extension of the first EB UID, as in X2cd).
    • ./uid___A001_X13e_X1fe/Calibration_X002 (where X002 is the final extension of the second EB UID).
    • ./uid___A001_X13e_X1fe/Calibration_X00N (where X00N is the final extension of the Nth EB UID).
    • ./uid___A001_X13e_X1fe/Combination
    • ./uid___A001_X13e_X1fe/Combination/calibrated (note lower case). This directory is only necessary if you have multiple executions.
    • ./uid___A001_X13e_X1fe/Imaging
    • The data packager expects certain file and directory names so do not edit the names of files/directories created by the various scripts.
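
If you prefer to create the whole tree in one go, here is a minimal Python sketch (run from your /lustre/naasc/<your username> area; the MOUS UID and EB extensions below are the example values from the list above):

  import os

  mous = 'uid___A001_X13e_X1fe'    # MOUS UID with '/' and ':' replaced by '_'
  ebs = ['X001', 'X002', 'X00N']   # final extensions of your EB UIDs, one per execution

  for eb in ebs:
      os.makedirs(os.path.join(mous, 'Calibration_' + eb))
  os.makedirs(os.path.join(mous, 'Combination', 'calibrated'))  # only needed for multiple EBs
  os.makedirs(os.path.join(mous, 'Imaging'))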

Manual Calibration

The following steps should be performed for each ASDM of the assigned MOUS to do an initial calibration:
  • Move the ASDM into the appropriate ./Calibration_XXXX directory (which you created)
  • Generate the data reduction script at the CASA prompt by typing es.generateReducScript('asdmname').
  • This will generate a python file '<msname>.scriptForCalibration.py' in your current directory and will also create a CASA measurement set (.ms directory)
      • If the script generator fails on an ACA dataset because it does not find a suitable reference antenna, fix this by specifying the reference antenna in the es.generateReducScript command, e.g. es.generateReducScript('uid_XXX', refant='CM02')
      • If es.generateReducScript did not successfully create a measurement set from the ASDM, import the data manually as follows:
        • At the CASA prompt: importasdm(asdm='asdm_name', asis='Antenna Station Receiver Source CalWVR CalAtmosphere')
        • Run this command in the directory that contains the ASDM. In the example above, that is ./uid___A001_X13e_X1fe/Calibration_X001
        • This will generate a measurement set named, in the example, uid___A...X001.ms
  • The script is separated into several steps. The steps can be run all at once, or piece by piece by setting the global variable "mysteps" at the CASA command line before executing the script: to run only steps 0, 1, and 2, type mysteps = [0,1,2]; to run the whole script at once, use mysteps = [].
  • If your target is an ephemeris object, please see here for instructions on how to attach the ephemeris table.
  • Run the data reduction script in CASA 4.4: execfile('<msname>.scriptForCalibration.py') with the following change to step 0:
    • In step 0, a measurement set is created; add an os.system('rm -rf [ASDMuid].ms.flagversions') call just before the call to importasdm, to delete any pre-existing .flagversions directory.
    • Once you have run listobs (within the script), you can check whether the project matches what you expected in the OT.
      First of all, check that you are looking at the right SG tab in the OT, and whether you are dealing with one MOUS of a multi-array SG. For example, if you are dealing with the _TC component of an SG that has both _TC and _TE components, it is probably OK if the observations were performed in a configuration too compact to reach the OT-requested resolution.
      Otherwise, it is possible that the originally requested SB specs were modified by P2G before the observations (due to change requests, for example), so that the observations do not completely match the original request. There are several ways to check:
      • In the SCOPS-P2G ticket of the project, each change to the original SB is documented (look for comments containing 'Incremented version to ').
      • If you have advanced privileges in the OT, those comments are conveniently gathered in the 'project notes': in the Proposal tab, click on the proposal title in the left-side column; the project notes can be found below 'Main Project Information'. Advanced privileges can be enabled in the OT via File, Preferences, Advanced, then clicking 'Enable privileged operations'. If you cannot get these privileges, ask your DRM to send you the project notes.
      • Finally, the exact specifications used for the observations can be found in the OT within 'SG OUS', the last component (below Technical Justification) in the SG tree structure; look in the 'instrument setup' section for spectral setup changes.
  • You should follow the script produced by Eric's script generator as closely as possible, but you may need or prefer to modify it according to your review (adding flags, changing the gain calculation, changing the spw mapping, ...) and re-execute script steps.
    • Before re-running a task, reload the flag state from before that task was first run: several tasks (in particular applycal calls) modify the flag state. The script contains many calls to 'flagmanager', which takes a snapshot of the flag state at a given point in the calibration workflow; to reload one of these snapshots, use flagmanager with mode='restore' (a sketch follows this list). Before re-running the whole script, remove the .ms and .flagversions directories, so that the ASDM is re-imported from scratch.
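
A minimal sketch of this restore-and-rerun pattern at the CASA prompt (the MS name, snapshot name, and step number are hypothetical; run flagmanager with mode='list' first to see which snapshots your script actually saved):

  vis = 'uid___A002_XXXXXXX_XXXX.ms'

  # List the flag snapshots saved by the calibration script.
  flagmanager(vis=vis, mode='list')

  # Restore the flag state from just before the step you want to redo.
  flagmanager(vis=vis, mode='restore', versionname='BeforeBandpassCalibration')

  # Re-run only that step of the calibration script.
  mysteps = [12]
  execfile(vis + '.scriptForCalibration.py')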

  • Check for error/warning messages sent to the CASA window or logger.
  • Check the png files automatically generated by the calibration script. These are mostly plots of calibration tables; check that the solutions look continuous and have low scatter.
  • Check the before/after WVR solutions for the antennas, listed in the table at the end of the uid*.ms.wvrgcal output in the calibration directory. For antennas whose phases are worse after the WVR correction, you may need to interpolate their WVR corrections: add the wvrflag parameter to the wvrgcal command in mystep = 4, e.g. wvrflag=['DA59'].
  • Check the *.ms.split.phase_int.plots and *.ms.split.bandpass.plots that are generated in the script.
  • Check the solutions (Tsys, bandpass, gain) directly using the plotcal and plotbandpass commands.
    • Tip: Every task -- including plotcal, which is run inside the various checkCalTable scripts -- can be run directly from the CASA command line, which can be helpful when troubleshooting to better understand and check the effects of changing individual parameters. In the CASA window, type (for instance) "help plotcal" to read about the task and its inputs, "inp plotcal" to see the task's current settings, and "go" once the inputs are correct.
  • It is good practice to verify that the fluxes in the setjy calls are sensible. For grid-source fluxes, the calibrator database can be queried using au.getALMAFluxForMS or au.getALMAFlux (see the examples after this list). For datasets with a non-Solar-System object as the absolute flux calibrator, you can use these tools to determine a reasonable flux value for one or more quasars in the measurement set. If you use this value to set the flux of the flux calibrator, please use the field number (not the name) for easier comparison with pipeline reductions.
  • At the end of calibration, use plotms to check that plots of the corrected amplitude and phase vs. frequency and time are flat (on the phase and bandpass calibrators). The most useful MS to look at is uid_XXX.ms.split.cal.
  • You can get a sense of the flagging rate at different stages of the calibration by using amc.getFlagStatistics: https://safe.nrao.edu/wiki/bin/view/ALMA/GetFlagStatistics
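
A few hedged examples of these checks at the CASA prompt (the MS and field names are placeholders, and au is Todd's analysisUtils module, which must already be on your path):

  # Expected calibrator fluxes from the calibrator database, for comparison with the setjy values.
  au.getALMAFluxForMS('uid___A002_XXXXXXX_XXXX.ms')

  # Corrected phase vs. time on the phase calibrator: should be flat and centered on zero.
  plotms(vis='uid___A002_XXXXXXX_XXXX.ms.split.cal', xaxis='time', yaxis='phase',
         field='J1626-2951', avgchannel='3840', coloraxis='spw')

  # Corrected amplitude vs. frequency on the bandpass calibrator: should be flat.
  plotms(vis='uid___A002_XXXXXXX_XXXX.ms.split.cal', xaxis='freq', yaxis='amp',
         field='0', avgtime='1e8', avgscan=True, coloraxis='spw')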

Given that this assignment is being manually calibrated instead of pipeline calibrated, it is very likely that your observation was in a non-standard mode or will require some extra effort for successful calibration. The following list of tips and tricks has been assembled to help; if any apply to your calibration, please follow the links here as applicable (and note that often these are moving targets!):

  • Many documented issues, and current tips and fixes as we determine them, are located here: FAQ (ignore the "Cycle1" in the name; it's being used for Cycle 2 as well)
  • If the observation switched between a broad bandwidth for the calibrators and very narrow bandwidth for the science targets, the auto-generated script will require some significant modifications which are described here.
    • Sometimes the 7m data can lack ATM cals for the science spws in BWSW mode. (A workaround for this is thought to exist.)
  • Very narrow bandwidth (or low SNR on the calibrators) science observations are likely to require the modifications discussed at low signal to noise in calibrators (this page is in the process of being updated, but the information there is still helpful). A good indication that this is the case is if the FLUXSCALE results for the narrow spws are systematically different from those for the wide spws. Band 8 and 9 datasets are especially likely to be affected by low SNR on the calibrators. Todd's suggestions on how to handle those are gathered at CAS-7400; see also the recommendations here in the FAQ.
  • Getting around problems with Tsys
  • Getting around problems with WVR
  • If you encounter other issues:
      • Todd's analysisUtils package contains a wealth of diagnostic tricks: Documentation of tasks in analysisUtils
      • If the bandpass solutions exhibit large amplitude corrections (greater than ~5% channel-to-channel): you may want to re-do the bandpass solution with a larger solint (in frequency, not time; example: solint='inf,1MHz') or smooth the bandpass solution (see the examples after this list).
      • Quacking: it happens fairly often that the first few integrations in a scan have much lower amplitude than the rest of the scan. This can be dealt with by manually flagging the affected timeslots (which can be tedious) or by setting tbuff = 0.5 * integration time in the importasdm call of the script (see CSV-2757)
      • If Python bombed at some point, or your script dies during various plotting commands (complaining about latex or variables or some such), then something is wrong with the files in your ~/.casa directory. Quit CASA and do the following: cp ~/.casa/init.py ~/.; rm -r ~/.casa. Then get back into CASA, exit again, and type cp ~/init.py ~/.casa/.. Get back into CASA and proceed as you were.
      • For datasets with a Solar System absolute flux calibrator, it is common to use a nearby quasar for the pointing. When the script flags the POINTING-intent data, this quasar is completely flagged (as it should be). However, the final applycal in the script typically includes this calibrator; since it has no data, an error message of the form "SEVERE applycal::Calibrater::selectvis (file /var/rpmbuild/BUILD/casapy/casapy-33.0.16856/code/synthesis/implement/MeasurementComponents/Calibrater.cc, line 396) Caught exception: Specified selection selects zero rows!" will appear. Just remove the field ID of the pointing calibrator from the applycal call.
  • Search the JIRA PRTSPR tickets, which are used to report problems on the data (https://jira.alma.cl/browse/PRTSPR/)
    • If you find a new problem with the array performance or antennas on recent data (less than a month old) you can create a PRTSPR ticket (see ProblemReport)
  • Search the JIRA CAS tickets, which gather known CASA issues (https://bugs.nrao.edu/browse/CAS)
  • If you find a bug in the script generator, report it here CSV-3013
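
Hedged examples of two of the fixes above (the MS, caltable, field, and refant names are placeholders; adapt the values to your data):

  # Re-derive the bandpass with pre-averaging in frequency (not time) to raise the SNR per solution.
  bandpass(vis='uid___A002_XXXXXXX_XXXX.ms', caltable='uid___A002_XXXXXXX_XXXX.ms.bandpass',
           field='0', solint='inf,1MHz', refant='DV04', solnorm=True)

  # Pad the online flags at scan boundaries to suppress 'quacking'
  # (tbuff in seconds, ~0.5 x the integration time; see CSV-2757).
  importasdm(asdm='uid___A002_XXXXXXX_XXXX',
             asis='Antenna Station Receiver Source CalWVR CalAtmosphere',
             process_flags=True, tbuff=3.024, applyflags=True)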

Once you're happy with the calibration:
  • Review the final products: .split.cal visibility directory, calibration tables
  • Annotate your script with information that may be helpful to the PI (calibrator fluxes, reasons for flagging baselines/antennas, deviations from the original script), but do not provide personal commentary of any kind. Please mark the sections you modified with your initials.
  • If you ran the script steps piece by piece, run the script all together to generate a clean log file, first removing the .ms and .flagversions directories so that the ASDM is re-imported from scratch. Then generate the QA2 report:
    • es.generateQA2Report('uid___whatever.ms', 'uid___whatever.ms.split', refAnt='antenna_name') (note the capital "A" in refAnt)
    • Choose the same reference antenna as you actually used for your calibration
    • This will run the QA2 report commands and place the results in a subdirectory called "qa2"
    • If it crashes in target_spectrum() due to no data found, then it is probably due to the default uvrange limit ('0~30m') being too small. This can be overridden with:
      • es.generateQA2Report('uid___whatever.ms',uvrange='0~300m')

  • Review the png and txt files produced by the script and placed in the qa2 directory - these will be sent to the PI. The textfile.txt file has a lot of useful information, including information useful for deciding on imaging parameters.
  • If there are multiple ASDMs in your assigned MOUS, copy the uid*.ms.split.cal directory and uid*.ms.split.fluxscale directory into the Combination/calibrated directory

Combination (only if more than one ASDM per MOUS)

The following steps should be performed in the Combination/calibrated directory (where you already copied all uid*.ms.split.cal directories and all uid*.ms.split.fluxscale directories)
  • Type es.generateReducScript(['uid_FIRST-EB.ms.split.cal','uid_SECOND-EB.ms.split.cal',(etc)], step='fluxcal')
    • This produces two files: allFluxes.txt, scriptForFluxCalibration.py
      • allFluxes.txt lists the measured and weighted-mean fluxes in each spw for each phase calibrator in each dataset. The first flux value is the value calculated from the individual MS; the second flux value is the weighted mean flux from all MSs
      • The python script, by default, uses these fluxes to scale the individual datasets to the weighted means and concatenates them into a single MS.
  • If you don't want to scale the fluxes in each individual dataset, just delete the setjy, gaincal, and applycal commands from the script, leaving just the concat command at the end, and run the script: type execfile('scriptForFluxCalibration.py')
    • This leaves the flux scale for each individual EB as originally determined (no additional scaling). One good reason for doing this: the time between EBs may be long enough (more than a few days) that the phase calibrator's flux has varied significantly. To determine the flux of a calibrator at a given frequency, remember the tasks au.getALMAFluxForMS and au.getALMAFlux
  • If you do want to scale the fluxes, check that the values in allFluxes.txt are correct with respect to the .fluxscale directories and that they are sensible (i.e., windows at nearly the same observing frequency should have similar values, or they should follow the phase calibrator's expected spectrum as derived from the fluxscale task).
    • If they are not correct, you should edit the file to put in the correct average flux for each SPW.
      • You can find the phase calibrator flux for each individual EB in the .ms.split.fluxscale directory
      • Average these together to get the mean flux for each SPW (a minimal sketch of this follows the list)
      • Enter these values into the allFluxes.txt file. It should look like this:
        "J1626-2951" 0 230.56 1.02 1.02 2014-03-23T06:42:25 root_path/Combination/calibrated/uid___A002_X7d44e7_X13d1.ms.split.cal "J1626-2951" 0 230.56 1.02 1.02 2014-03-24T06:20:02 root_path/Combination/calibrated/uid___A002_X7d6d46_X38a.ms.split.cal "J1626-2951" 1 232.63 1.02 1.02 2014-03-23T06:42:25 root_path/Combination/calibrated/uid___A002_X7d44e7_X13d1.ms.split.cal "J1626-2951" 1 232.63 1.02 1.02 2014-03-24T06:20:02 root_path/Combination/calibrated/uid___A002_X7d6d46_X38a.ms.split.cal "J1626-2951" 2 245.43 1.01 1.01 2014-03-23T06:42:25 root_path/Combination/calibrated/uid___A002_X7d44e7_X13d1.ms.split.cal "J1626-2951" 2 245.43 1.00 1.00 2014-03-24T06:20:02 root_path/Combination/calibrated/uid___A002_X7d6d46_X38a.ms.split.cal " 
      • Regenerate scriptForFluxCalibration.py using the new allFluxes.txt - just type es.generateReducScript(['uid_FIRST-EB.ms.split.cal','uid_SECOND-EB.ms.split.cal'], step='fluxcal') again
    • When you have the correct allFluxes.txt and scriptForFluxCalibration.py, type execfile('scriptForFluxCalibration.py')
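
If you do the averaging by hand, a minimal sketch (the flux values are invented placeholders; read the real ones from the .ms.split.fluxscale results):

  # Phase-calibrator fluxes for spw 0, one value per EB, from the fluxscale results.
  fluxes_spw0 = [1.04, 1.00]  # hypothetical values in Jy
  mean_flux = sum(fluxes_spw0) / float(len(fluxes_spw0))
  print('spw 0 mean flux: %.2f Jy' % mean_flux)  # enter this value into allFluxes.txt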

You now have calibrated.ms in the Combination/calibrated directory. Check whether the combined dataset has correctly defined spws and fields: if running listobs on the combined dataset shows more field IDs than expected (e.g., mosaics taken on different days resulting in individual pointings with small differences in their field centers), edit the concat command in scriptForFluxCalibration.py to include dirtol, to reduce the number of fields (see the example below).
  • You may also end up with more spws than expected (probably by a factor of the number of ASDMs) because of different Doppler settings for each ASDM. See the next section (Imaging) for how to deal with this.
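
A hedged example of such an edited concat call (the visibility names are the ones from the allFluxes.txt example above; the dirtol value is a placeholder, to be chosen larger than the pointing-center offsets but much smaller than the mosaic pointing spacing):

  concat(vis=['uid___A002_X7d44e7_X13d1.ms.split.cal',
              'uid___A002_X7d6d46_X38a.ms.split.cal'],
         concatvis='calibrated.ms', dirtol='1arcsec')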

Imaging

The commands to image your manually calibrated data are largely the same as those found on the pipeline assisted imaging wiki with only a few exceptions. Here's how to get started:
  • Copy the .ms directory to be imaged into the Imaging directory. In the case of multiple ASDMs, this is the calibrated .ms dataset in the Combination/calibrated directory. In the case of a single ASDM, this is the .split.cal directory in Combination/calibrated (or in the Calibration_X*** directory).
  • Follow the instructions at pipeline assisted imaging wiki, starting at #5, with the following changes:
    • Remove calls to concat, cvel, and mstransform. These steps are not relevant for the case of manual reduction, since EB concatenation, if necessary, should have been taken care of in the Combination step.
    • Make sure to do your imaging using the same version of CASA that you used for calibration (currently CASA 4.4).
  • After imaging is complete (through step #11 on the imaging wiki), proceed to the final checklist below. A quick sketch for spot-checking the achieved rms and beam follows this list.
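
Before moving on, it can help to spot-check the achieved rms and beam against the OT request; a hedged sketch at the CASA prompt (the image name and box are placeholders, and the box should cover an emission-free region):

  # Measure the rms in an emission-free region (or line-free channels) of the final image.
  stats = imstat(imagename='uid___A001_X13e_X1fe.source.spw0.image', box='10,10,60,60')
  print('rms: %.2e Jy/beam' % stats['rms'][0])

  # Print the image summary, including the restoring beam, to the logger.
  imhead(imagename='uid___A001_X13e_X1fe.source.spw0.image', mode='summary')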

Final Checklist and wrap-up

To complete the delivery, you need to have for EACH execution (each ASDM):
  • A "clean" casa log file that contains all the steps run. These need to be in the individual Calibration_X*** subdirectories or they will not be picked up by the packaging script. If you have additional log files in the directory it is best to delete them at this stage.
  • A complete script with a name like uid___A002_X3c7a84_X443.ms.scriptForCalibration.py that contains all the steps you ran and any important brief notes you think the PI would need.
    • Make sure that all steps of the script will run (e.g., thesteps=[])
  • A qa2 directory

To complete the delivery, you need to have for the assigned MOUS:
  • A "clean" casa log file that contains imaging steps run (if it is doable to run all the imaging steps in one go), and a log file that contains all the combination steps. These need to be in the Combination/calibrated and Imaging directories respectively, or they will not be picked up by the packaging script.
  • A README file
  • A scriptForImaging.py file in the Imaging directory (and, in the case of multiple EBs, a scriptForFluxCalibration.py file in the Combination/calibrated directory)
  • .pbcor.fits and .flux.fits files in the Imaging directory

Add a note to the SCOPS data reduction ticket telling the DRM that your reduction is ready for review and telling her/him where to look for the data / QA2 plots / scripts
  • Note on the ticket any issues with the data or the data reduction, and whether the sensitivity and spatial resolution achieved reach the proposal requests. Note that in the case of multi-array SG, you cannot directly compare the achieved rms to the SG request.
  • Copy the README onto the ticket, and attach the calibration scripts.
  • If necessary, inform the contact scientist of additional information to be communicated to the PI or P2G group

As with pipeline-calibrated data, once the DRM approves the MOUS as QA2_Pass / QA2_Semipass, the data will be assigned to DAs for packaging and delivery.
  • Keep the imaging, calibration and combination scripts in a safe place - they may be still useful years later
  • Information on the data delivery date can be found in the data reduction spreadsheets: Cycle 1, Cycle 2
  • After the data is delivered, please move your entire data reduction package (including the README) to the /lustre/naasc/deliveries directory. For example: mv uid___A001_X13e_X1fe /lustre/naasc/deliveries
  • For data that is not delivered (QA2_FAIL), please attach the imaging script to the SCOPS ticket and you can move the data to /lustre/naasc/qa2fails