CASA Parallelization Meeting Minutes

Thursday November 6th, DSOC 317, 8:00AM MT

  • 192.33.117.12##8108
  • 434-817-6523

Attendees:

  • Socorro: Rob, James, Sanjay, Lindsey
  • Charlottesville: N/A
  • Garching: Sandra

  • Apologies:
    • Justo is sick at home.

Discussion

  • Sandra taking CASA Parallelization sub-project lead, effective 11/20/14.
    • Rob will help set up a working meeting for the week of Dec 1st.

  • HPC working meeting - Dec 1 - Dec 5 @ DSOC
    • Reserved room 248 for week of 12/1. (3 desks)
    • Do we need workstations? (No)
      • Will bring laptops, but they cannot run CASA.
    • Topics for the week?
      • Previously limited by the testing environment. Need some documentation for testers on how to set things up and how to test.
      • (1) Sanjay to put together a regression that others can run. Use it to verify results. (2) James will modify it to use openmpi.
        • Can compare ipython and openmpi frameworks, along with other changes to the code base.
        • (3) Eventually set up a couple of VLA and ALMA data sets (golden data sets) for regression runs. Lindsey has proposed a couple of pipeline data sets. Sandra also has a candidate from ALMA Cycle II.
        • Long term, need to start validating based on pipeline performance. But these regression scripts are a good solution while development is proceeding on both CASA HPC and Pipeline.
        • Migration can be gradual. Use pipeline for calibration and flagging verification, but keep regression scripts for imaging end-to-end verification.
        • Desirable if the data sets have been used for serial testing, but more important that they remain interesting over a 5-year period.
          • Wideband and widefield imaging will have to be considered.
      • Sandra notes that split should be exercised since it is heavily used.
      • 3C147 test can be updated by James.
      • Testing of MSTransform MMS features
      • OUTCOME OF MEETING: Plan for the testing platform (covering the next 6-12 months).
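
As a rough illustration of the regression idea in items (1)-(3) above, the sketch below compares summary statistics from a reference (serial) run with those from a parallel run, within a relative tolerance. It makes no CASA calls. The statistic names, the numbers and the tolerance are hypothetical placeholders, not values from any agreed golden data set.

```python
import math


def compare_stats(reference, candidate, rel_tol=1e-6):
    """Return (name, ref_value, new_value) for every statistic that differs.

    A statistic missing from the candidate run is reported as a mismatch.
    """
    failures = []
    for name, ref_value in sorted(reference.items()):
        new_value = candidate.get(name)
        if new_value is None or not math.isclose(ref_value, new_value, rel_tol=rel_tol):
            failures.append((name, ref_value, new_value))
    return failures


# Hypothetical numbers standing in for values extracted from each run.
serial = {"peak_flux": 1.2345, "rms": 0.00123, "flagged_fraction": 0.152}
parallel = {"peak_flux": 1.2345000001, "rms": 0.00125, "flagged_fraction": 0.152}

failures = compare_stats(serial, parallel, rel_tol=1e-6)
for name, ref_value, new_value in failures:
    print(f"MISMATCH {name}: reference={ref_value} candidate={new_value}")
```

The same comparison could be reused for the ipython and openmpi frameworks by treating one run as the reference. The choice of tolerance per statistic would need to be agreed by the group.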

  • From 4.3 release notes:
    • "plotms cannot plot asynchronously. If a plot is issued while a previous one is still ongoing, plotms will crash and CASA needs to be restarted. Please wait for each plotms job to be finished before issuing a new one. "
      • Not an issue for scripted use cases. Only applicable to interactive use with the GUI. When run without the GUI, execution is synchronous.
      • ~2:1 or 3:1 ratio for i/o:plotting per plot. Data loading is the majority of plotting time.
      • Should test scripted case of multiple processes running plotms on an MMS.

  • Development & Testing - Status updates
    • For the first time, I successfully ran a Cycle 2 data reduction script using MMS, mstransform and MPI. The script performs the following steps: importasdm, flagging, wvrgcal, generation of Tsys and antenna tables, applycal, split of the science spws, plotms, flagmanager, bandpass and gaincal calibrations, applycal and a final split of the corrected column. I ran the script without the analysisUtils package, which often leaves open tables in the cache that interfere with the parallel servers. These scripts always run sequentially.
    • We think there is an issue that happens randomly. There is a method in BaseTable that doesn't flush the sub-table. When mstransform writes a big sub-table such as the POINTING table, a subsequent call fails because the sub-table has not yet been flushed to disk. We need to run more tests to be sure about this.
    • James will broaden out his HPC testing to include various file systems. May expose issues.

  • Next Meeting
    • Week of 12/01/14

Deferred Items for a future meeting

10/16/14:
  • Multiple xterms
  • Schedule for first OpenMPI enabled test/stable package.

08/14/14:
  • Logger
    • Single log file & concerns regarding ordering of entries.
    • Pipeline thinks of the log as a data product: a single file.
    • No synchronization at the C++ level amongst loggers, but no indication of log overlap in the current implementation. Overlap can happen with intense logging activity.
    • Improvements may not be urgent; revisit use cases/requirements when Sandra is getting ready to start work.

  • Imaging
    • Discuss requirement imposed on VI/VB2 to pass required information for MMS processing.

  • MSTransform
    • Discuss time separation axis for MMS creation.

Action Item List

| Item # | Date Opened | Description | Leads | Status | Status Notes |
| 11 | 9/11/14 | Evaluate and characterize file descriptor limit issue in imager | James | Open | 9/11/14: Task for next week. |
| 12 | 9/11/14 | Basic test of mpi4casa on OSX 10.8 to characterize any multi-platform problems. | Justo | Open | 9/11/14: Don't need mpi to work, but just ensure changes we are considering do not create blockers for OSX builds. |
| 13 | 9/25/14 | Kumar to fix a bug where the virtual model column is sometimes not cleared | Kumar | Open | |
| 15 | 9/25/14 | Jim J to fix the opening of all files of all the columns, after the release | Jim | Open | |
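
For items 11 and 15, a minimal sketch of how the per-process file descriptor limit could be inspected (and the soft limit raised up to the hard limit) from Python, using the standard `resource` module on Unix. This is a generic illustration only. It is not the workaround referred to elsewhere in these minutes, and the helper names are hypothetical.

```python
import resource  # Unix-only standard library module


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_fd_limit(target):
    """Raise the soft limit towards `target`, never beyond the hard limit.

    Returns the soft limit in effect afterwards. The limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


soft, hard = fd_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```

The limit applies per process, so each parallel server would need to be checked separately.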

Closed Item Record

| Item # | Date Opened | Description | Leads | Status | Status Notes |
| 01 | 8/07/14 | Update Issue Chart to reflect current status. | James | Closed | 8/14/14: Contributions from Kumar incorporated. Justo provided separate notes on tasks outstanding. 8/7/14: Contributions from others on the team also welcome. |
| 02 | 8/07/14 | Review available documentation (on wiki), in particular the MPI document. | Lindsey | Closed | 11/05/14: Closed. 9/11/14: Useful to complete before pipeline meeting. 8/21/14: Initial read, but still has questions. |
| 03 | 8/07/14 | Update reference doc links to point to MPI doc in SVN | Rob | Closed | 8/7/14: Complete. |
| 04 | 8/14/14 | Talk to James, Kumar, Justo and others and bring some resolution to preferred MPI library/implementation issue. | Rob | Closed | 8/21/14: Made MPI Implementation page. Will leave open until library selection finalized. 8/14/14: Have feedback from Justo, James, Kumar and Martin Pokorny. Will document and distribute. |
| 05 | 8/14/14 | Add new wiki pages for requirements capture, task list, and other project artifacts. Update based on recent meetings, then circulate for iteration by others. | Rob | Closed | 8/21/14: See main page. |
| 06 | 8/14/14 | Evaluate feasibility of completing cvel2 for 4.3 release. | Sandra | Done | Committed to r31041, r31056 and r31057. |
| 07 | 8/21/14 | Clarify MSTransform use cases with Jeff. | Rob | Closed | |
| 08 | 8/21/14 | Test mpi4casa integration on cluster. | Justo | Closed | 10/14/14: Completed and reported to group. Documentation updated based on experience. |
| 09 | 8/21/14 | Review openmpi features relative to requirements for preferred library. | Justo | Closed | 10/14/14: Completed. |
| 10 | 9/11/14 | Documentation of MSTransform MMS functionality for users | Sandra | Closed | 10/15/14: Completed. 9/11/14: Target of 10/20/14 (pipeline meeting). |
| 14 | 9/25/14 | Sanjay (with help of James R) will get a parallel run with boosted limit on file descriptors | Sanjay, James | Closed | 11/05/14: Workaround in place last month. 10/16/14: See notes from today's meeting. |
Topic revision: r5 - 2014-11-06, RobSelina