CASA Flagging tool 3.4 - Guide for beta testing

Purpose and Scope

There are two new tasks, tflagdata and tflagcmd, which provide an extended version of the flagdata and flagcmd functionality. They use a completely new C++ implementation underneath, designed for better performance and maintainability. We recommend testing the following new functionality, which mostly relates to the auto-flagging algorithms.

Notes to keep in mind before starting

  • Remember to unflag the data before each flag command, in order to isolate the results of each test:

tflagdata(vis='data.ms', mode='unflag')

  • The logs report the percentage of data flagged per chunk and in the MS in total. We recommend noting these percentages to keep track of which flagging commands actually have an impact on the data:

FlagMSHandler::nextBuffer Chunk = 3, Observation = 0, Array = 0, Scan = 31, Field = 1 (3C286_A), Spw = 9, …
Tfcrop_RR::chunkSummary => Data flagged in this chunk: 3.80157%
Tfcrop_RR::msSummary => Total data flagged in MS: 3.87061%
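
To keep track of these percentages without copying them by hand, the summary lines can be extracted from the log text. The following is a minimal sketch that assumes the log lines look exactly like the excerpt above; the function name flagged_percentages and the regular expression are illustrative and not part of CASA.

```python
import re

# Matches lines such as
# "Tfcrop_RR::chunkSummary => Data flagged in this chunk: 3.80157%"
# (format assumed from the log excerpt in this guide).
SUMMARY_RE = re.compile(
    r'(?P<agent>\S+)::(?P<kind>chunkSummary|msSummary)\s*=>.*?:\s*'
    r'(?P<percent>[0-9.]+)%')


def flagged_percentages(lines):
    """Return (agent, kind, percent) tuples for every summary line found."""
    results = []
    for line in lines:
        match = SUMMARY_RE.search(line)
        if match:
            results.append((match.group('agent'),
                            match.group('kind'),
                            float(match.group('percent'))))
    return results


log_lines = [
    'Tfcrop_RR::chunkSummary => Data flagged in this chunk: 3.80157%',
    'Tfcrop_RR::msSummary => Total data flagged in MS: 3.87061%',
]
for agent, kind, percent in flagged_percentages(log_lines):
    print('%s %s %.5f' % (agent, kind, percent))
```

Lines that do not carry a percentage, such as the FlagMSHandler::nextBuffer line, are simply skipped.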

  • Additionally, the summary mode gives a detailed breakdown of the flags produced. We recommend running summary mode after each flag command and noting any unexpected result:

summary = tflagdata(vis='data.ms',mode='summary')
Note: summary is a dictionary whose keywords are antenna, array, correlation, field, observation, scan, spw, flagged, total. To print any breakdown, simply access the corresponding keyword (e.g. summary['field'] or summary['spw']).
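
A breakdown is easier to read as percentages. The sketch below assumes that every breakdown (field, spw, and so on) maps a name to a dictionary with 'flagged' and 'total' entries, as in the summary['correlation'] output shown later in this guide; percent_flagged is a hypothetical helper and the numbers are made up.

```python
def percent_flagged(breakdown):
    """Map each key of a breakdown to its flagged percentage."""
    result = {}
    for name, counts in breakdown.items():
        total = counts['total']
        # Guard against empty selections to avoid dividing by zero.
        result[name] = 100.0 * counts['flagged'] / total if total else 0.0
    return result


# Made-up numbers, only to illustrate the assumed structure.
example_summary = {
    'spw': {'0': {'flagged': 50.0, 'total': 1000.0},
            '9': {'flagged': 250.0, 'total': 1000.0}},
}
for spw, percent in sorted(percent_flagged(example_summary['spw']).items()):
    print('spw %s: %.1f%% flagged' % (spw, percent))
```

Inside casapy the same helper could be applied to the dictionary returned by tflagdata in summary mode, for example percent_flagged(summary['field']).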

tflagdata-1a,2a

Mode

  • tfcrop: Flag data using the time-frequency-crop auto-flagging algorithm, which identifies radio frequency interference in either calibrated or uncalibrated data.
  • rflag: Flag data using the rflag auto-flagging algorithm, which identifies radio frequency interference in calibrated data.

Description

With uncalibrated data (tfcrop mode) and calibrated data (rflag mode), flag a spectral window which could contain spurious signals, like the WVR LO leakage at 91.66GHz (band 3) affecting baselines of DV09.

Commands

tflagdata(vis='uncaldata.ms', mode='tfcrop',ntime=100.0,spw='91~92GHz',antenna='DV09&&',correlation='ABS_I')
tflagdata(vis='caldata.ms',mode='rflag',ntime=100.0,spw='91~92GHz',antenna='DV09&&',correlation='ABS_I')

tflagdata-3a

Mode

  • clip: Flag null visibility points with the new 'flag zeros' option using different visibility expressions.

Description

  • One of the most common flagging operations consists of flagging null visibility points, which may exist, for instance, when the delays between two antennas are not properly set.

Commands

tflagdata(vis='data.ms',mode='clip',clipzeros=True,correlation='ABS_I')

tflagdata-1b,2b,3b

Mode

  • Asynchronous I/O: Tests tflagdata-1,2,3 can also be repeated with asynchronous I/O switched on to obtain a speed-up.

Description

  • Auto-flagging algorithms require reading the visibility data cube, which represents the bulk of the MS data. To improve speed the new flagging framework supports async I/O mode to read ahead visibility data whilst flagging the current chunk.

$HOME/.casarc configuration

To enable asynchronous I/O add the following settings to the $HOME/.casarc file and restart casapy:

VisibilityIterator.async.enabled: true
FlagDataHandler.asyncio: true

  • Note: Record the time taken by tests tflagdata-1,2,3a so that you can verify the speed-up shown in tflagdata-1,2,3b
  • Note: Remember to set these values to false, and restart casapy again after finishing with tests tflagdata-1,2,3b
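
Because the asynchronous settings only take effect after restarting casapy, the two timings have to be recorded in separate sessions and compared afterwards. The following is a minimal sketch of how that could be done; timed and speed_up are hypothetical helpers, and the workload and the numbers are stand-ins.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = func(*args, **kwargs)
    return result, time.time() - start


def speed_up(seconds_sync, seconds_async):
    """Ratio of the synchronous to the asynchronous run time."""
    return seconds_sync / float(seconds_async)


# Stand-in workload; inside casapy one would pass the task and its arguments,
# e.g. timed(tflagdata, vis='data.ms', mode='clip', clipzeros=True).
_, elapsed = timed(sum, range(1000))
print('elapsed: %.6f s' % elapsed)

# Made-up timings noted from two separate casapy sessions.
print('speed-up: %.2f' % speed_up(120.0, 80.0))
```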

tflagdata-1c,2c,3c

Mode

  • Display: Tests tflagdata-1,2,3 can also be repeated with the display mode switched on. It presents a GUI showing the flagging results as blue data points over time/frequency 2D maps per baseline, with each correlation product shown separately.

Description

  • Although properly tested, the auto-flagging algorithms are still somewhat experimental, so users may want to inspect the flagging result and perhaps adjust the parameters for better results.

Commands

tflagdata(vis='uncaldata.ms', mode='tfcrop',ntime=100.0,spw='91~92GHz',antenna='DV09&&',display='data')
tflagdata(vis='caldata.ms',mode='rflag',ntime=100.0,spw='91~92GHz',antenna='DV09&&',display='data')
tflagdata(vis='data.ms',mode='clip',clipzeros=True,display='data')
  • Note: Try to navigate through the various baselines, scans, fields and spws, and take note of any unexpected results. Keep in mind that navigation in both directions is only possible for baselines, not for scan, field or spw.

tflagdata-4

Mode

  • clip-WVR: Flag WVR data using the new visibility expression capabilities.

Description

  • Uncalibrated ALMA MSs contain WVR data identified with 'spw=0', and containing 'I' correlation products. This data typically requires clipping ranges different from those used for flagging the actual visibilities.

Commands

tflagdata(vis='data.ms',mode='clip',clipminmax=[0,0.001],correlation='ABS_WVR',display='data')

tflagdata-5

Mode

  • shadow: Flag baselines shadowed by antennas which were not used during the observation, although physically present, thus potentially causing a shadow effect on short baselines.

Description

  • During the ALMA Cycle 0 campaign, only arrays of up to 16 antennas will be used for observation. However, there are already ~33 antennas at the ALMA site, which can cause a shadow effect on the short baselines.

Commands

  • Note: For this test it is necessary to do a listobs in order to find a 'missing' antenna which nevertheless is already physically present at the ALMA site. In the following example we assume this antenna to be DV09.

input = 'name=DV09\n'+\
'diameter=25.0\n'+\
'position=[-0.000156728637322,-0.000597147624919,-7.49789857005e-05]'
filename = 'antenna-coordinates.txt'
create_input(input, filename)
tflagdata(vis='data.ms',mode='shadow', tolerance=10.0, addantenna=filename)
  • Note: Verify the flagging result by plotting visibilities and flags (plotms) as a function of elevation. Flags are likely to be clustered around low elevations.
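
The create_input call used above is not defined in this guide. The following is a minimal sketch, assuming it only needs to write the given text to the named file; the demonstration writes to a temporary directory so that it does not touch the working directory.

```python
import os
import tempfile


def create_input(text, filename):
    """Write text verbatim to filename, replacing any existing file.

    Assumed behaviour: this guide does not define create_input.
    """
    with open(filename, 'w') as handle:
        handle.write(text)


antenna_text = ('name=DV09\n'
                'diameter=25.0\n'
                'position=[-0.000156728637322,-0.000597147624919,'
                '-7.49789857005e-05]')
path = os.path.join(tempfile.mkdtemp(), 'antenna-coordinates.txt')
create_input(antenna_text, path)

# Read the file back to confirm what was written.
with open(path) as handle:
    written = handle.read()
print(written)
```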

tflagdata-6a

Mode

  • Extension: The extension mode makes it possible to extend the flags produced by a previous flag command (for instance to different polarizations).

Description

  • Auto-flagging algorithms may flag almost the entire time range for a given channel or vice-versa. With the help of this algorithm it is possible to flag the remaining visibility points for better consistency.

Commands

  • Note: For this test it is necessary to do an unflag-autoflag-extend chain, so that the results of the extension are applied only after flagging the data with one single autoflagging algorithm:
tflagdata(vis='uncaldata.ms', mode='unflag')
tflagdata(vis='uncaldata.ms', mode='tfcrop',ntime=100.0,spw='91~92GHz',antenna='DV09&&',correlation='ABS_RR')
tflagdata(vis='uncaldata.ms', mode='extend',ntime=100.0,spw='91~92GHz',antenna='DV09&&',extendpols=True)
  • Note: Afterwards extract a summary, and as a basic test verify that all correlations have the same number of flags:

summary = tflagdata(vis='uncaldata.ms',mode='summary')
summary['correlation']
# The flags per correlation will be printed: {'XX': {'flagged': 100.0, 'total': 1000.0}, 'YY': {'flagged': 100.0, 'total': 1000.0}}
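
The basic consistency test can also be done programmatically. This is a minimal sketch that assumes summary['correlation'] has the structure printed above; correlations_consistent is a hypothetical helper.

```python
def correlations_consistent(corr_summary):
    """True if every correlation reports the same flagged and total counts."""
    counts = set((c['flagged'], c['total']) for c in corr_summary.values())
    return len(counts) <= 1


# Same values as in the example output above.
corr = {'XX': {'flagged': 100.0, 'total': 1000.0},
        'YY': {'flagged': 100.0, 'total': 1000.0}}
print(correlations_consistent(corr))
```

Inside casapy the call would be correlations_consistent(summary['correlation']).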

tflagdata-6b

Mode

  • Extension in list mode: tflagdata-6 test can be repeated using list mode, to group all the flag commands in one single run.

Description

  • For speed optimization reasons it is better to group several flag commands in one single run, to minimize iteration and I/O operations overhead.

Commands

input = "mode='unflag'\n"+\
"mode='tfcrop' ntime=100.0 spw='91~92GHz' antenna='DV09&&' correlation='ABS_RR'\n"+\
"mode='extend' ntime=100.0 spw='91~92GHz' antenna='DV09&&' extendpols=True\n"+\
"mode='summary'\n"
filename = 'listfile.txt'
create_input(input, filename)
summary=tflagdata(vis='uncaldata.ms', mode='list', inpfile=filename)

  • Note: Inspect the summary and verify that the flag counts are the same as those obtained with tflagdata-6a. Additionally, you can compare the run times of tflagdata-6a and tflagdata-6b; the latter should be faster.
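
The comparison between the two runs can be sketched as follows. It assumes that both runs return a summary dictionary with the same structure as in tflagdata-6a; flag_count_differences is a hypothetical helper and the numbers are made up to show a mismatch.

```python
def flag_count_differences(summary_a, summary_b, axis='correlation'):
    """List (key, flagged_a, flagged_b) where the two runs disagree."""
    a = summary_a.get(axis, {})
    b = summary_b.get(axis, {})
    differences = []
    for key in sorted(set(a) | set(b)):
        flagged_a = a.get(key, {}).get('flagged')
        flagged_b = b.get(key, {}).get('flagged')
        if flagged_a != flagged_b:
            differences.append((key, flagged_a, flagged_b))
    return differences


# Made-up summaries standing in for the results of tflagdata-6a and 6b.
run_6a = {'correlation': {'XX': {'flagged': 100.0, 'total': 1000.0},
                          'YY': {'flagged': 100.0, 'total': 1000.0}}}
run_6b = {'correlation': {'XX': {'flagged': 100.0, 'total': 1000.0},
                          'YY': {'flagged': 90.0, 'total': 1000.0}}}
print(flag_count_differences(run_6a, run_6b))
```

An empty list means that the two runs flagged the same amount of data along the chosen axis.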

tflagdata-6c

Mode

  • Extension+Display: tflagdata-6 test can be repeated switching on the display mode in the extension step.

Description

  • The display mode presents two time/frequency maps per baseline, one showing the previous flags (top) and another showing the new flags (bottom). This makes it well suited to inspecting the flags produced by the extension after an auto-flagging algorithm has been applied.

Commands

tflagdata(vis='uncaldata.ms', mode='unflag')
tflagdata(vis='uncaldata.ms', mode='tfcrop',ntime=100.0,spw='91~92GHz',antenna='DV09&&',correlation='ABS_RR')
tflagdata(vis='uncaldata.ms',mode='extend',ntime=100.0,spw='91~92GHz',antenna='DV09&&',
extendpols=True,growaround=True,flagneartime=True,flagnearfreq=True, display='data')
  • Note: Inspect the pairs of time/frequency maps, and evaluate if the extended flags make the flagging result more consistent.

tflagcmd-1

Mode

  • On-line flags: In the new flagging framework it is possible to group an unlimited number of flag commands in a single run, to improve speed.

Description

  • At observation time some antennas cannot produce good data during particular time ranges due to mechanical errors (antenna not on source, focus error, sub-reflector errors, etc). These problems are annotated and translated into manual-mode flagging commands stored in a file available within the MS as 'Flags.xml'.

Commands

tflagcmd(vis='uncaldata.ms',inpmode='xml',action='apply')

  • Note: Unflag the data and repeat the same test using the old framework (flagcmd), in order to compare speed:
flagdata(vis='uncaldata.ms',unflag=True)
flagcmd(vis='uncaldata.ms',flagmode='xml',optype='apply')

-- JustoGonzalez - 2012-03-22
Topic revision: r2 - 2012-04-10, SandraCastro