Monday Morning Meeting 04/16/18

  • DIAL-IN NUMBERS & PASSCODES:
  • IP: 192.33.117.12##8110
  • Phone: (434) 817-6524

Attendance

  • Socorro:
  • CV:
  • Garching:
  • SCO:

News / Meetings / Visitors

  • Pipeline F2F: Tokyo, 5/7 - 5/11
  • NRAO Users Committee Meeting: Socorro, 5/14 - 5/16
    • 90 min allocated to Brian Glendenning for SW topics
    • 45 min suballocated to CASA/Pipeline - charts forthcoming

Development

Pipeline

  • ALMA Cycle 7 observing modes meeting discussions, mostly in SCIREQ tickets
  • MS / MMS mode use case
    • TBD when to use for ALMA: calibration pipeline, imaging pipeline, or both, also see below for data size issues
  • MS / MMS restoredata / flagmanager operations
    • Pipeline can record the MS / MMS mode in the pipeline manifest and in the flagversions comment attribute, and restore flags using the original MS / MMS mode if need be
  • Spw mapping by name heuristics implemented
  • VLA / VLASS flux.csv file use case discussions, develop prototype for VLA catalog query service using VLASS data

HPC

  • We have run some preliminary tests comparing two ways of processing several ALMA pipeline jobs: running them simultaneously in serial mode on a single node versus running them in parallel using MPI. In principle, the time and memory needed to process an ALMA project through the pipeline cannot be predicted as easily as for VLASS. VLASS is a good candidate for running in batch mode using simultaneous serial runs, but ALMA projects vary considerably and their requirements cannot be estimated a priori.
    • Nevertheless, what we have seen is that for the ALMA pipeline:
    • A) small datasets (say < 12-24h) running in batch mode (several jobs running simultaneously in serial mode) can be processed with 10-20% loss (increase in runtime, on CV modern nodes), compared to a single serial run of the same dataset. Note that using all the available memory should be avoided; some % should be left free. The problem is that this % varies with the dataset, MS size, etc. MPI will not help much here.
    • B) for medium size datasets (say up to 1-3 days), one can improve performance in the following way: pack a few simultaneous jobs on the same node and also run each of them in parallel mode using MPI. This would be the most efficient way to use the memory and cores available in a single node. Example: pack a batch of 4 T042-like jobs: speedup of the batch with 4 simultaneous serial runs is ~3.6x. Add-on: also use 8 cores (MPI) for each simultaneous run: overall speedup for the batch is ~10.2x.
    • C) As the size/runtime of datasets increases (say ~1 week), MPI alone (no simultaneous runs in a single node) becomes preferable, as it will tend to give higher speedup for the available resources (memory and cores), and also shorter wait times for individual datasets.
    • In general there is a memory/cores tradeoff. Memory requirements constrain the number of simultaneous jobs. Memory use needs to be re-evaluated after recent improvements in findcont.
  • Testing with the new changes in FindCont: very early signs of a significant improvement in memory use. Sensitivity to differences in imaging weights in parallel vs. serial might still be there. Tests running...
  • The following was triggered by testing test_initweights in parallel (CAS-11262)
    • Can we commit the changes to the parameter in mstransform, which will effectively not create a WEIGHT_SPECTRUM if the input MS does not have one and usewtspectrum=False (default case in mstransform)?
    • This is needed for partition, which uses mstransform to create an MMS. Currently, partition creates the WEIGHT_SPECTRUM in any Multi-MS even if the input MS doesn't have it. This causes an undesirable difference between serial and parallel processing.
  • uvcontsub in parallel (CAS-10697)
    • The validation of small changes in task_uvcontsub has raised many other issues in that task, so I told Ryan that we need a decision on how to proceed. Fixing the old uvcontsub sounds counterproductive if we are going to finish the implementation of a new uvcontsub in mstransform. But we need to understand whether VLA needs the old uvcontsub working in parallel sooner than that.
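
The speedup figures quoted in item B) of the batch-mode tests above can be broken down with some simple arithmetic. The sketch below is only illustrative: it assumes each of the 4 simultaneous runs gets 8 cores (32 cores in total), and the helper `max_simultaneous_jobs` with its node memory, per-job memory and free-memory fraction is hypothetical, not a measured value or an existing pipeline function.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of the ideal linear speedup achieved with `workers` workers."""
    return speedup / workers


def max_simultaneous_jobs(node_mem_gb: float, job_peak_mem_gb: float,
                          free_fraction: float) -> int:
    """Hypothetical helper: how many jobs fit on a node if a fraction of
    the memory is deliberately left free."""
    usable_gb = node_mem_gb * (1.0 - free_fraction)
    return int(usable_gb // job_peak_mem_gb)


# Figures quoted in item B) for a batch of 4 T042-like jobs.
jobs = 4
cores_per_job = 8              # assumption: 8 MPI cores for each run
batch_speedup_serial = 3.6     # 4 simultaneous serial runs
batch_speedup_mpi = 10.2       # same batch, each run also using MPI

packing_eff = parallel_efficiency(batch_speedup_serial, jobs)
mpi_gain = batch_speedup_mpi / batch_speedup_serial
overall_eff = parallel_efficiency(batch_speedup_mpi, jobs * cores_per_job)

print(f"packing efficiency (4 serial runs): {packing_eff:.0%}")
print(f"extra speedup from MPI per run:     {mpi_gain:.2f}x")
print(f"efficiency over {jobs * cores_per_job} cores:           {overall_eff:.0%}")

# Hypothetical node: 512 GB, jobs peaking at 100 GB, 20% of memory kept free.
print("jobs that fit:", max_simultaneous_jobs(512, 100, 0.2))
```

Under these assumptions, packing 4 serial runs reaches about 90% of the ideal 4x, in line with the 10-20% loss quoted for simultaneous serial runs, while adding MPI gives each run a further ~2.8x on 8 cores, or roughly a third of linear scaling over all 32 cores.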

Users

Build, Release

Verification Testing

Validation Testing

Architecture

AOB

  • Test "Alma M100 Analysis Regression tclean" has been failing for the last few days. Small increase in output image RMS (~1%). Is this expected from recent changes in tclean? Would CAS-9004, automasking, etc. explain this?

Developer Reports

Monday Meeting
  • Sanjay Bhatnagar
  • Sandra Castro
  • Lindsey Davis
    • Continued prototyping per ms imaging heuristics for check sources
    • ALMA SCIREQ ticket discussions
    • VLASS / VLA flux.csv usage discussions with Kana, Claire, Juergen
  • Bjorn Emonts
  • Pam Ford
  • Enrique Garcia
  • Bob Garwood
    • CAS-10738 Ephemeris related changes. Remerged with master to regenerate branch tarballs. Bryan is expected to be testing this now. A series of related ephemeris tickets are being checked. It would be nice to include this change in 5.3.
    • CAS-10386 Add shared ASDM code to support two new tables. This is done and merged into master.
    • CAS-10550 asdmsummary documentation added to plone. Also minor changes to inline documentation. Done, merged, and waiting on documentation validation.
    • CAS-10693 Remove boost dependency. A badly edited file was found by ALMA during checking of the corresponding ticket on the ALMA side. Those changes have now been accepted and merged into the ALMA code. I expect these changes can be made to the CASA code shortly after 5.4 development starts.
    • Investigated reports from TelCal of a possible bug in the shared ASDM code. I can't reproduce it, and I suspect this is an issue with how TelCal is using the shared code.
    • Spent some time tracking down and looking at the SDFITS to MS conversion that was part of ASAP and was not migrated to CASA when ASAP was removed. A few GBT and other single dish users were surprised by this loss. It's not yet clear if there's enough GBT interest in reviving this functionality. It's unclear how much of that old code could be reused as it appears to involve a translation through the ASAP flat-table structure, so it was not a direct SDFITS->MS conversion. If this functionality is needed it may be better to start from scratch - which is a fairly substantial job and so may be hard to justify at this point.
  • Kumar Golap
  • David Mehringer
  • George Moellenbrock
  • Dirk Petry
  • Martin Pokorny
  • Federico M Pouzols
    • CAS-10937 - "MPI CASA: tclean creates new directory and does not parse the dirname correctly" - fixed small hidden issue in output path handling in tclean (parallel helpers).
    • doc/discussion ticket as a follow-up to CAS-11269 - "mstransform should not create WEIGHT/SIGMA_SPECTRUM in output MS if they're not available from the input MS and usewtspectrum=False"...
    • CAS-11301 "Creation and initialization of WEIGHT/SIGMA_SPECTRUM columns in various tasks, and intended use in pipeline, user scripts, etc"
  • Urvashi Rao
  • Darrell Schiebel
  • Kanako Sugimoto
  • Ville Suoranta
  • Takahiro Tsutsumi
Friday NAOJ Meeting
  • Wataru Kawasaki
  • Takeshi Nakazato
  • Renaud Miel

-- TakeshiNakazato - 2018-04-11
Topic revision: r9 - 2018-04-16, BobGarwood