CASA Users Committee Pipeline Workshop
Pipeline Processing Machines
Dedicated processing machines are not provided. Workshop participants who wish to exercise the pipeline software should install the current CASA / Pipeline release on their laptops or on machines at their home institutions.
CASA / Pipeline Software Version
The current CASA / Pipeline release requires CASA 4.3.1. The pipeline software is an add-on to CASA 4.3.1. Instructions on how to download and install the pipeline release can be found at
Running CASA in Pipeline Mode
To run the pipeline, CASA must be started in pipeline mode. Running CASA in pipeline mode disables the plotting GUI(s) and makes the pipeline tasks visible to CASA.
To start CASA in pipeline mode
To confirm that the pipeline tasks are visible to CASA type the following command
A list of tasks with names starting with the following package prefixes should appear on the terminal
- h_ (pipeline utility tasks)
- hifa_ (ALMA interferometry pipeline tasks)
- hifv_ (EVLA interferometry pipeline tasks)
- hif_ (generic interferometry pipeline tasks)
- hsd_ (ALMA single dish pipeline tasks)
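Because hif_ is itself a prefix of hifa_ and hifv_, any script that sorts task names into these packages has to try the longest prefix first. The following is a minimal sketch in plain Python, not part of the pipeline itself. The task names ending in _example are hypothetical placeholders; the other names are tasks mentioned elsewhere on this page.

```python
# Package prefixes listed above, mapped to their descriptions.
PACKAGES = {
    "h_": "pipeline utility tasks",
    "hifa_": "ALMA interferometry pipeline tasks",
    "hifv_": "EVLA interferometry pipeline tasks",
    "hif_": "generic interferometry pipeline tasks",
    "hsd_": "ALMA single dish pipeline tasks",
}


def package_prefix(task_name):
    """Return the longest package prefix matching task_name, or None."""
    # Longest prefixes first, so that hifa_ and hifv_ win over hif_.
    for prefix in sorted(PACKAGES, key=len, reverse=True):
        if task_name.startswith(prefix):
            return prefix
    return None


def group_by_package(task_names):
    """Group task names by package prefix, skipping non-pipeline names."""
    groups = {}
    for name in task_names:
        prefix = package_prefix(name)
        if prefix is not None:
            groups.setdefault(prefix, []).append(name)
    return groups


if __name__ == "__main__":
    names = [
        "h_init", "h_save", "h_resume",                # named on this page
        "hif_export_calstate", "hif_import_calstate",  # named on this page
        "hifa_example", "hifv_example", "hsd_example",  # hypothetical
    ]
    for prefix, tasks in sorted(group_by_package(names).items()):
        print("%s (%s): %s" % (prefix, PACKAGES[prefix], ", ".join(tasks)))
```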
Standard Pipeline Directory Structure
The automated pipeline creates a processing directory tree which looks like the following
pipeprocdir
|-- request1
|   |-- rawdata
|   |   `-- dataset1.asdm
|   `-- working
|       |-- PPR_request1.xml
|       |-- casa_pipescript.py
|       `-- flux.csv
`-- ...
    |-- rawdata
    |   `-- dataset2.asdm
The purpose of this structure is to provide an independent directory for each pipeline processing request and to keep raw data downloaded from the archive separated from the working directory where the pipeline executes.
pipeprocdir is the root of the pipeline processing directory tree. Each pipeline processing request generates a request sub-directory below pipeprocdir, e.g. request1.
Each processing request directory contains 3 sub-directories: rawdata, working, and products.
rawdata contains the ASDM(s).
working contains the processing procedure, which may be the pipeline processing request, e.g. PPR_request1.xml, or the equivalent Python script, e.g. casa_pipescript.py. The automated pipeline infrastructure generates the XML request. The pipeline generates the equivalent Python script as it runs. The flux.csv and flag template files are optional. They contain the best available flux values for the calibrator targets and user-defined flagging commands, respectively. If present, these files are read in as is. If absent, they are generated by the import data step.
The products directory (not shown) is parallel to the rawdata and working directories. It is a staging area for the final data products that will be delivered to the archive. products is created automatically by the export data task, which is the final step of the automated pipeline. This step is NOT normally executed by interactive users.
The pipeline is run in the working sub-directory. At the end of a run, working contains the calibrated and flagged MS, calibration tables, images, the web log, the pipeline context, and logs.
The pipeline does not require this directory structure. However, for the purpose of this workshop we suggest that participants interested in processing data during the workshop follow the standard pipeline directory structure.
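For participants who want to set up this layout by hand, the following is a minimal sketch in plain Python that creates the rawdata and working sub-directories for one request. The function name make_request_tree is an assumption made for illustration and is not part of CASA or the pipeline. The products directory is deliberately not created, since it is produced by the export data step.

```python
import os


def make_request_tree(pipeprocdir, request):
    """Create the rawdata and working sub-directories for one request.

    Returns the two paths as a (rawdata, working) tuple. Existing
    directories are left untouched, so the call can be repeated safely.
    """
    request_dir = os.path.join(pipeprocdir, request)
    paths = []
    for sub in ("rawdata", "working"):
        path = os.path.join(request_dir, sub)
        if not os.path.isdir(path):
            os.makedirs(path)
        paths.append(path)
    return tuple(paths)
```

Calling make_request_tree("pipeprocdir", "request1") from the parent directory creates pipeprocdir/request1/rawdata and pipeprocdir/request1/working. The ASDM(s) can then be copied into rawdata and CASA started in working.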
Running the Pipeline
The pipeline can be run interactively, either from a Python script or step by step.
- Interactively from a Python script
casa> execfile('casa_pipescript.py')
- Interactively step by step
The pipeline can also be run in batch mode from a script
casapy --pipeline --nogui --nologger -c casa_pipescript.py &> pipeterm.output
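When several requests need to be processed, the batch command above can be wrapped in a small driver. The following is a sketch under stated assumptions: it reuses exactly the options shown above, assumes casapy is on the PATH, and sends both standard output and standard error to the log file, as the &> redirection does. The function names are hypothetical.

```python
import subprocess


def build_batch_command(script="casa_pipescript.py"):
    """Return the argument list for a non-interactive pipeline run."""
    return ["casapy", "--pipeline", "--nogui", "--nologger", "-c", script]


def run_batch(script="casa_pipescript.py", logfile="pipeterm.output"):
    """Run the pipeline script, capturing stdout and stderr in logfile.

    Requires casapy on the PATH. Returns the exit code of the process.
    """
    with open(logfile, "w") as log:
        return subprocess.call(build_batch_command(script),
                               stdout=log, stderr=subprocess.STDOUT)
```

run_batch() should be called from the working directory of the request so that the script and its outputs stay together.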
Some Pipeline Concepts
- As it runs, the pipeline updates its internal context or state. For example, it maintains a record of the calibrations to be applied and how to apply them in an internal calibration library. In this way it is similar to CASA tools such as the imager or image analysis tools. Pipeline state is initialized with the h_init task, saved with the h_save task, and restored with the h_resume task. In some cases the pipeline state can be modified by the user. The hif_export_calstate and hif_import_calstate tasks enable the user to modify the calibration state of the pipeline. The final ALMA example shows how scripts can be used to do this.
- Technically, pipeline tasks are identical to CASA tasks. However, they will not run until either an h_init or an h_resume command has been executed. Most tasks operate on data set lists. Currently most calibration and flagging tasks operate sequentially and independently on each data set in the input data set list. Imaging tasks operate on all the data sets in the input list.
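The concepts above can be illustrated with a toy model. This is not the pipeline's implementation: the ToyPipeline class, its methods, and the JSON file format are all assumptions invented for illustration. It only mimics the behaviour described here: tasks refuse to run without a context, calibration-style tasks treat each data set independently, imaging-style tasks see the whole list, and the state can be saved and restored.

```python
import json


class ToyPipeline(object):
    """Toy stand-in for pipeline context handling (not the real code)."""

    def __init__(self):
        self.context = None  # tasks refuse to run while this is None

    def init(self, datasets):
        """Analogue of h_init: start from a fresh context."""
        self.context = {
            "datasets": list(datasets),
            # data set name -> ordered list of calibration tables to apply
            "callibrary": dict((name, []) for name in datasets),
        }

    def save(self, path):
        """Analogue of h_save: write the context to disk."""
        self._require_context()
        with open(path, "w") as f:
            json.dump(self.context, f)

    def resume(self, path):
        """Analogue of h_resume: restore a previously saved context."""
        with open(path) as f:
            self.context = json.load(f)

    def _require_context(self):
        if self.context is None:
            raise RuntimeError("run init() or resume() first")

    def calibrate(self, suffix):
        """Calibration-style task: each data set is handled on its own."""
        self._require_context()
        for name in self.context["datasets"]:
            self.context["callibrary"][name].append(name + suffix)

    def image(self):
        """Imaging-style task: all data sets are combined in one call."""
        self._require_context()
        return "+".join(self.context["datasets"])
```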
Parameters and Inputs
- The pipeline task arguments are divided into 2 groups: algorithm parameters and input parameters. Algorithm parameters are always visible to the user via the usual CASA inp mechanism. These are parameters that the pipeline has decided should be tunable by the user, for example the solution interval of the bandpass solution. Input parameters are only visible to the user if the pipelinemode parameter is 'interactive'. The default value of pipelinemode is 'automatic'. The input parameters include file name and data selection parameters. The pipeline implements a default file naming scheme and uses data set metadata to drive the data selection, in particular the values of the scan intents.
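The visibility rule can be written down as a short sketch. This is an illustration only, not pipeline code: the function visible_parameters and the parameter names in the usage note are assumptions, and only the two pipelinemode values named above are modelled.

```python
def visible_parameters(algorithm_params, input_params,
                       pipelinemode="automatic"):
    """Return the parameters a user would see for the given pipelinemode.

    Algorithm parameters are always shown. Input parameters (file names,
    data selection) are added only when pipelinemode is 'interactive'.
    """
    visible = dict(algorithm_params)
    if pipelinemode == "interactive":
        visible.update(input_params)
    return visible
```

With a hypothetical solution interval as the algorithm parameter and a hypothetical data set list as the input parameter, visible_parameters({"solint": "inf"}, {"vis": ["dataset1.ms"]}) returns only the solution interval, while passing pipelinemode="interactive" returns both.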
Participants can choose to provide their own pipeline web logs and / or ASDM(s) for the workshop.
- ASDM(s) should be small, otherwise pipeline processing times will be too long for the workshop
- Web logs should be viewable locally or remotely from the user's browser
The ALMA workshop materials can be found here
A sample ALMA ASDM is provided below. Note that the Source.xml table for this ASDM is corrupted. This does not impact the pipeline processing but does compromise the pipeline Source and Field table displays.
Standard ALMA calibration and flagging recipe results
Results of a user rerun of the standard ALMA calibration and flagging recipe. The user edited the original calibrator source flux file, added flagging commands to the standard pipeline flag template file, and redefined the reference antenna list to put the most heavily flagged antenna, CM05, at the end of the list.
Results of a more complex two-part pipeline run where the user inserted an antenna position calibration step after the standard Tsys calibration table flagging step. In this example the antenna position corrections for antenna CM05 are incorrect and make the calibration worse.
Preliminary results from the imaging pipeline
Sample VLA ASDM (L-band, single spw)
Results of standard pipeline run
VLA pipeline rerun example (no Hanning smoothing or calibrator imaging)
- CASA online help is available for the pipeline tasks.