CASA Users Committee Pipeline Workshop

Pipeline Processing Machines

Dedicated processing machines are not provided. Workshop participants who wish to exercise the pipeline software should install the current CASA / Pipeline release on their laptops or on machines at their home institutions.

CASA / Pipeline Software Version

The current CASA / Pipeline release requires CASA 4.3.1, to which the pipeline software is an add-on. Instructions on how to download and install the pipeline release can be found at

http://casa.nrao.edu/casa_obtaining.shtml

Running CASA in Pipeline Mode

To run the pipeline, CASA must be started in pipeline mode. Running CASA in pipeline mode disables the plotting GUI(s) and makes the pipeline tasks visible to CASA.

To start CASA in pipeline mode, type

casapy --pipeline

To confirm that the pipeline tasks are visible to CASA, type the following command

tasklist

A list of tasks whose names start with the following package prefixes should appear on the terminal

  • h_ (pipeline utility tasks)
  • hifa_ (ALMA interferometry pipeline tasks)
  • hifv_ (EVLA interferometry pipeline tasks)
  • hif_ (generic interferometry pipeline tasks)
  • hsd_ (ALMA single dish pipeline tasks)
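
Help for any individual pipeline task is then available through the usual CASA help mechanism, for example

    casa> help h_init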

Standard Pipeline Directory Structure

The automated pipeline creates a processing directory tree that looks like the following

pipeprocdir
|-- request1
|   |-- rawdata
|   |   `-- dataset1.asdm
|   `-- working
|       |-- PPR_request1.xml
|       |-- casa_pipescript.py
|       `-- flux.csv
`-- request2
    |-- rawdata
    |   `-- dataset2.asdm
    `-- working
        |-- PPR_request2.xml
        |-- casa_pipescript.py
        |-- dataset2_flagtemplate.txt
        `-- flux.csv

The purpose of this structure is to provide an independent directory for each pipeline processing request and to keep raw data downloaded from the archive separated from the working directory where the pipeline executes.

pipeprocdir is the root of the pipeline processing directory. Each pipeline processing request generates a request sub-directory below pipeprocdir, e.g. request1.

Each processing request directory ultimately contains three sub-directories: rawdata, working, and products.

rawdata contains the ASDM(s).

working contains the processing procedure, which may be the pipeline processing request, e.g. PPR_request1.xml, or the equivalent Python script, e.g. casa_pipescript.py. The automated pipeline infrastructure generates the XML request; the pipeline generates the equivalent Python script as it runs. The flux.csv and _flagtemplate.txt files are optional. They contain the best available flux values for the calibrator targets and user-defined flagging commands, respectively. If present, these files are read in as is; if absent, they are generated by the import data step.
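
As an illustration, a flag template file holds one flagging command per line in flagdata command syntax; the antenna name and reason string below are placeholders only

    mode='manual' antenna='CM05' reason='heavily_flagged_antenna'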

The products directory (not shown) is parallel to the rawdata and working directories. It is a staging area for the final data products that will be delivered to the archive. products is created automatically by the export data task, which is the final step of the automated pipeline. This step is NOT normally executed by interactive users.

The pipeline is run in the working sub-directory. At the end of a run, working contains the calibrated and flagged MS, calibration tables, images, the web log, the pipeline context, and logs.

The pipeline does not require this directory structure. However, for the purposes of this workshop, we suggest that participants interested in processing data follow the standard pipeline directory structure; a short script for creating it is sketched below.
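
For example, the layout for a single request can be created with a few lines of Python before downloading data; make_request_dirs is a hypothetical helper, not part of the pipeline

    import os

    # Create the standard rawdata and working sub-directories for one
    # processing request under the pipeline processing root directory.
    def make_request_dirs(root, request):
        for sub in ('rawdata', 'working'):
            path = os.path.join(root, request, sub)
            if not os.path.exists(path):
                os.makedirs(path)

    make_request_dirs('pipeprocdir', 'request1')
    # Copy or link the ASDM(s) into pipeprocdir/request1/rawdata,
    # then run the pipeline from pipeprocdir/request1/working.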

Running the Pipeline

The pipeline can be run interactively, either from a Python script or step by step.

  • From a Python script

    cd .../working
    casapy --pipeline
    casa> execfile('casa_pipescript.py')

  • Interactively step by step

    cd .../working
    casapy --pipeline
    casa> h_init()
    casa> hifa_importdata(vis=[<vis>])
    casa> ....
    casa> h_save()

The pipeline can also be run in batch mode from a script

    casapy --pipeline --nogui --nologger -c casa_pipescript.py &> pipeterm.output
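
For reference, a minimal casa_pipescript.py might look like the sketch below; the data set name is a placeholder, and a real script generated by the pipeline lists the full recipe of calibration, flagging, and imaging tasks

    # Minimal sketch of a casa_pipescript.py for an ALMA data set.
    h_init()
    try:
        hifa_importdata(vis=['dataset1.asdm'])
        # ... remaining calibration, flagging, and imaging tasks ...
    finally:
        # Save the pipeline state even if a task fails.
        h_save()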

Some Pipeline Concepts

Pipeline State
  • As it runs, the pipeline updates its internal context, or state. For example, it maintains a record of the calibrations to be applied, and how to apply them, in an internal calibration library. In this way it is similar to CASA tools such as the imager or image analysis tools. Pipeline state is initialized with the h_init task, saved with the h_save task, and restored with the h_resume task. In some cases the pipeline state can be modified by the user: the hif_export_calstate and hif_import_calstate tasks enable the user to modify the calibration state of the pipeline. The final ALMA example shows how scripts can be used to do this; a short sketch follows.
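
For example, a session can be saved and resumed later; the filename parameter shown for hif_export_calstate and hif_import_calstate is an assumption, so consult the task help for the exact arguments

    casa> h_init()
    casa> hifa_importdata(vis=['dataset1.asdm'])
    casa> h_save()     # save the pipeline state
    casa> h_resume()   # restore it in a later session
    casa> hif_export_calstate(filename='calstate.txt')
    # edit calstate.txt, then read the modified calibration state back in
    casa> hif_import_calstate(filename='calstate.txt')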

Pipeline Tasks
  • Technically, pipeline tasks are identical to CASA tasks. However, they will not run until either an h_init or an h_resume command has been executed. Most tasks operate on data set lists: currently, most calibration and flagging tasks operate sequentially and independently on each data set in the input list, while imaging tasks operate on all the data sets in the input list, as in the example below.
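
For instance, an import followed by a flagging task processes each data set in the input list in turn; the data set names are placeholders

    casa> hifa_importdata(vis=['dataset1.asdm', 'dataset2.asdm'])
    casa> hifa_flagdata()   # flags dataset1 and dataset2 sequentially and independently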

Parameters and Inputs
  • The pipeline task arguments are divided into two groups, algorithm parameters and inputs parameters. Algorithm parameters are always visible to the user via the usual CASA inp mechanism. These are parameters that the pipeline has decided should be tunable by the user, for example the solution interval of the bandpass solution. Inputs parameters are only visible to the user if the pipelinemode parameter is 'interactive'; the default value of pipelinemode is 'automatic'. The inputs parameters include file name and data selection parameters. The pipeline implements a default file naming scheme and uses data set metadata to drive the data selection, in particular the values of the scan intents. An example follows.
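
For example, the standard CASA parameter mechanism shows the difference; hif_bandpass and its solint parameter are used here purely for illustration

    casa> default(hif_bandpass)
    casa> inp                          # algorithm parameters, e.g. solint, are visible
    casa> pipelinemode = 'interactive'
    casa> inp                          # inputs parameters, e.g. vis, are now also visible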

Data Sets

User

Participants can choose to provide their own pipeline web logs and / or ASDM(s) for the workshop.
  • ASDM(s) should be small; otherwise pipeline processing times will be too long for the workshop
  • Web logs should be viewable locally or remotely from the user's browser

ALMA

The ALMA workshop materials can be found here

A sample ALMA ASDM is provided below. Note that the Source.xml table for this ASDM is corrupted. This does not impact the pipeline processing but does compromise the pipeline Source and Field table displays.

Standard ALMA calibration and flagging recipe results

Results of a user rerun of the standard ALMA calibration and flagging recipe. The user edited the original calibrator source flux file, added flagging commands to the standard pipeline flag template file, and redefined the reference antenna list to put the most heavily flagged antenna, CM05, at the end of the list.

Results of a more complex two-part pipeline run in which the user inserted an antenna position calibration step after the standard Tsys calibration table flagging step. In this example the antenna position corrections applied for antenna CM05 are incorrect and make the calibration worse.

Preliminary results from the imaging pipeline

EVLA

EVLA materials

Sample VLA ASDM (L-band, single spw)

Results of standard pipeline run

VLA pipeline rerun example (no hanning smoothing or calibrator imaging)

Documentation

ALMA

EVLA

Online
  • CASA online help is available for the pipeline tasks.

-- JeffKern - 2015-10-05