Details of design and specification documentation -- open questions: how, where, and in what format?
ALMA Cycle 7 observing modes meeting discussions, mostly in SCIREQ tickets
MS / MMS mode use case
TBD when to use for ALMA: calibration pipeline, imaging pipeline, or both; also see data size issues below
MS / MMS restoredata / flagmanager operations
Pipeline can record the MS / MMS mode in the pipeline manifest and the flagversions comment attribute, and restore flags using the original MS / MMS mode if need be
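A minimal sketch of how the MS / MMS mode could be embedded in a flagversions-style comment and read back at restore time. The `mode=` key convention and function names are illustrative assumptions, not the actual pipeline manifest or flagversions format:

```python
# Hypothetical sketch: record the processing mode (MS vs. MMS) in a
# flagversions-style comment string and recover it when restoring flags.
# The "mode=" key is an assumed convention, not the actual pipeline format.

def make_flagversion_comment(mode, extra=""):
    """Build a comment string that embeds the MS/MMS mode."""
    assert mode in ("MS", "MMS")
    return f"mode={mode}; {extra}".rstrip("; ")

def parse_flagversion_mode(comment, default="MS"):
    """Recover the recorded mode from a comment string, if present."""
    for field in comment.split(";"):
        key, _, value = field.strip().partition("=")
        if key == "mode" and value in ("MS", "MMS"):
            return value
    return default

comment = make_flagversion_comment("MMS", "before_applycal")
print(parse_flagversion_mode(comment))           # MMS
print(parse_flagversion_mode("no mode recorded"))  # MS (fallback)
```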
Spw mapping by name heuristics implemented
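The spw-mapping idea can be sketched as follows (the name strings and function are illustrative; a real implementation would read spw names from the SPECTRAL_WINDOW subtable):

```python
# Illustrative sketch of mapping spectral windows by name between two
# datasets. The name lists are stand-ins; a real implementation would
# read them from each dataset's SPECTRAL_WINDOW subtable.

def map_spws_by_name(src_names, dst_names):
    """Return {src_index: dst_index} for spws whose names match exactly."""
    dst_index = {name: i for i, name in enumerate(dst_names)}
    return {i: dst_index[name]
            for i, name in enumerate(src_names)
            if name in dst_index}

src = ["X123#BB_1#SW-01", "X123#BB_2#SW-01"]
dst = ["X123#BB_2#SW-01", "X123#BB_1#SW-01"]
print(map_spws_by_name(src, dst))  # {0: 1, 1: 0}
```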
VLA / VLASS flux.csv file use case discussions, develop prototype for VLA catalog query service using VLASS data
We have run preliminary tests comparing the processing of several ALMA pipeline jobs run simultaneously in serial mode on a single node versus running them in parallel using MPI. Unlike VLASS, the time and memory needed to process an ALMA project through the pipeline cannot be easily predicted. VLASS is a good candidate for running in batch mode using simultaneous serial runs, but ALMA projects vary considerably and cannot be estimated a priori.
Nevertheless, what we have seen is that for the ALMA pipeline:
A) Small datasets (say < 12-24 h) running in batch mode (several jobs running simultaneously in serial mode) can be processed with a 10-20% loss (increase in runtime, on modern CV nodes) compared to a single serial run of the same dataset. Note that using all the available memory should be avoided; some percentage should be left free, but that percentage varies with the dataset, MS size, etc. MPI will not help much here.
B) For medium-size datasets (say up to 1-3 days), performance can be improved by packing a few simultaneous jobs on the same node and also running each of them in parallel mode using MPI. This is the most efficient way to use the memory and cores available in a single node. Example: pack a batch of 4 T042-like jobs; the speedup of the batch with 4 simultaneous serial runs is ~3.6x. Additionally using 8 MPI cores for each simultaneous run raises the overall speedup for the batch to ~10.2x.
C) As the size/runtime of datasets increases (say ~1 week), MPI alone (no simultaneous runs on a single node) becomes preferable: it tends to give higher speedup for the available resources (memory and cores), as well as shorter wait times for individual datasets.
In general there is a memory/cores tradeoff: memory requirements constrain the number of simultaneous jobs. Memory use needs to be re-evaluated after the recent improvements in findcont.
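The tradeoff in B) can be illustrated with back-of-the-envelope arithmetic. The efficiency factors below are rough numbers loosely matched to the T042-like measurements quoted above, not a predictive model:

```python
# Rough throughput model for packing simultaneous jobs on one node,
# optionally with MPI per job. Efficiency factors are illustrative,
# loosely matched to the T042-like measurements quoted above.

def batch_speedup(n_jobs, batch_efficiency, mpi_speedup_per_job=1.0):
    """Throughput speedup vs. running one serial job at a time.

    n_jobs simultaneous runs give n_jobs * batch_efficiency throughput;
    each job may additionally be sped up by MPI."""
    return n_jobs * batch_efficiency * mpi_speedup_per_job

# 4 simultaneous serial T042-like jobs at ~90% batch efficiency: ~3.6x
print(batch_speedup(4, 0.9))                   # 3.6
# Adding 8 MPI cores per job (~2.83x per-job speedup): ~10.2x overall
print(round(batch_speedup(4, 0.9, 2.83), 1))   # 10.2
```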
Testing the new changes in findcont: very early signs of a significant improvement in memory use. Sensitivity to differences in imaging weights between parallel and serial runs might still be present. Tests are running.
The following was triggered by testing test_initweights in parallel (CAS-11262)
Can we commit the change to the mstransform parameter so that a WEIGHT_SPECTRUM column is effectively not created when the input MS does not have one and usewtspectrum=False (the default case in mstransform)?
This is needed for partition, which uses mstransform to create an MMS. Currently, partition creates the WEIGHT_SPECTRUM in every Multi-MS even if the input MS doesn't have it. This causes an undesirable difference between serial and parallel processing.
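The proposed rule can be sketched as a simple predicate. The function and argument names below are illustrative, not the actual CASA internals:

```python
# Sketch of the proposed rule for whether mstransform/partition should
# create a WEIGHT_SPECTRUM column in the output. Names are illustrative,
# not the actual CASA implementation.

def should_create_weight_spectrum(input_has_wtspec, usewtspectrum):
    """Proposed: only create WEIGHT_SPECTRUM if the input MS already has
    it, or the user explicitly asks for it with usewtspectrum=True."""
    return bool(input_has_wtspec or usewtspectrum)

# Default mstransform case: input lacks WEIGHT_SPECTRUM, usewtspectrum=False
print(should_create_weight_spectrum(False, False))  # False (proposed)
# Current partition behavior creates the column regardless, which causes
# the serial/parallel difference described above.
```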
uvcontsub in parallel (CAS-10697)
Validating small changes in task_uvcontsub has raised many other issues in that task, so I told Ryan that we need a decision on how to proceed. Fixing the old uvcontsub seems counter-productive if we are going to finish implementing the new uvcontsub in mstransform, but we need to understand whether VLA needs the old uvcontsub working in parallel sooner than that.
Test "Alma M100 Analysis Regression tclean" has been failing for the last few days, with a small increase in output image RMS (~1%). Is this expected from recent changes in tclean? Could CAS-9004, automasking, etc. explain it?
Continued prototyping per-MS imaging heuristics for check sources
ALMA SCIREQ ticket discussions
VLASS / VLA flux.csv usage discussions with Kana, Claire, Juergen
CAS-10738 Ephemeris-related changes. Re-merged with master to regenerate branch tarballs. Bryan is expected to be testing this now. A series of related ephemeris tickets are being checked. It would be nice to include this change in 5.3.
CAS-10386 Add shared ASDM code to support two new tables. This is done and merged into master.
CAS-10550 asdmsummary documentation added to plone. Also minor changes to inline documentation. Done, merged, and waiting on documentation validation.
CAS-10693 Remove boost dependency. A badly edited file was found by ALMA during checking of the corresponding ticket on the ALMA side. Those changes have now been accepted and merged into the ALMA code. I expect these changes can be made to the CASA code shortly after 5.4 development starts.
Investigated reports from TelCal of a possible bug in the shared ASDM code. I can't reproduce it, and I suspect this is an issue with how TelCal is using the shared code.
Spent some time tracking down and looking at the SDFITS to MS conversion that was part of ASAP and was not migrated to CASA when ASAP was removed. A few GBT and other single dish users were surprised by this loss. It's not yet clear if there's enough GBT interest in reviving this functionality. It's also unclear how much of that old code could be reused, as it appears to involve a translation through the ASAP flat-table structure rather than a direct SDFITS->MS conversion. If this functionality is needed, it may be better to start from scratch, which is a fairly substantial job and so may be hard to justify at this point.
Federico M Pouzols
CAS-10937 - "MPI CASA: tclean creates new directory and does not parse the dirname correctly" - fixed small hidden issue in output path handling in tclean (parallel helpers).
doc/discussion ticket as a follow-up to CAS-11269 - "mstransform should not create WEIGHT/SIGMA_SPECTRUM in output MS if they're not available from the input MS and usewtspectrum=False"...
CAS-11301 "Creation and initialization of WEIGHT/SIGMA_SPECTRUM columns in various tasks, and intended use in pipeline, user scripts, etc"