-- SandraCastro - 2015-02-18

CASA HPC/Parallelization meeting agenda/minutes


Thursday [19/02/2015], [ESO Centaurus, C.2.01], [9:00 AM MST]


How to connect

  • Dial in: +49 89 6834
  • Video connection: Use this one (46104@134.171.42.27). Other option is 46104@eso.org

Attendees:

ESO:

Socorro:

CV:

Agenda

  1. Balanced mode in partition and mstransform (Sandra)
  2. Integration of MPI Server ranks in the logger at the C++ level (Sandra)
  3. Status of mpicasa in the binaries (Julian, Darrell)
  4. Pipeline testing (Sandra, James, Lindsey)
  5. Scalability tests (Justo)
  6. Imaging testing (Sanjay, Justo)
  7. AOB

Minutes

1. Balanced mode in partition and mstransform

  • Sandra reported that the default separationaxis of partition and mstransform is still called 'auto' but uses the balanced mode approach.

2. Integration of MPI Server ranks in the logger at the C++ level

  • Sandra reported that this implementation is almost done, as explained in CAS-6705.

3. Status of mpicasa in the binaries

  • Julian needs help from Darrell to change the layout of libraries in the binaries. Mark will talk to Darrell about this and report back.

4. Pipeline testing

  • Sandra ran the pipeline successfully on the cluster (interactive node only). Lindsey compared the logs and results with those of a sequential run, and everything seems fine. There were known failures in plotms due to the display setting. Julian suggested working around this by calling xvfb -1 before casapy.
  • Sandra should now try the new pipeline recipe, which includes calls to the ms tool.
  • James's run of the pipeline in Socorro also showed good I/O performance and no visible problems. He noticed, though, that applycal uses a lot of memory.

5. Scalability tests

  • Justo ran an importasdm+partition test on 96 servers, which took about an hour to finish.

6. Imaging testing

  • Sanjay ran tests with 16 processes with 2 nodes each. He ran into problems with SEVERE messages that at first appeared to be timeouts but turned out not to be. He suggested some changes to the heartbeat implementation of the MPI framework.
  • Lindsey suggested that Sanjay first test on a small image to catch systematic problems, and only then move on to bigger images.

Action Item List

  • Mark to talk to Darrell about giving Julian the information and access needed to work with the binaries.
  • Andy to ask Darrell about including test_mpi4casa in the smoke checks.
  • Julian to send an email showing how to use the hostfile with mpirun (mpicasa) to the test team.

Topic revision: r4 - 2015-02-25, SandraCastro