CASA Build and Test Group Meeting

Charlottesville: Tuesday, 2nd December, 2014, 18:00; Room ER331.

Socorro: Tuesday, 2nd December, 2014, 16:00; Room ?.

Japan: Wednesday, 3rd December, 2014, 08:00.

DIAL-IN NUMBERS & PASSCODES:

Note: In practice, these telecons are usually conducted via Google+ Hangouts/Voice (look for user: mgrawlings)

USA Toll Free Number: 866-901-8266

USA Toll Number: +1-203-566-3863

Participants

Present: Mark Rawlings, Darrell Schiebel, Shinnosuke Kawakami, Akeem Wells, Sandra María Castro, Justo Antonio Gonzales, Julian Taylor

Apologies: Andy Hale

Post-meeting edits and additions are in blue text.

Agenda

  • Discussion of MPI integration in the CASA package build process. This is the list of things needed in order to integrate MPI into the CASA build.
    • Prepare an rpm of OpenMPI, configured with the thread-safe option.
    • Prepare an rpm of mpi4py based on CASA's Python and OpenMPI rpms.
    • Include boost::mpi in the Boost rpm, which also has to be based on CASA's Python and OpenMPI rpms.
    • Upload all these rpms to the CASA yum repository.
    • Modify all CMake files to compile using the MPI compiler wrappers.
    • Include the 'thread' SWIG option in the gcwrap CMake files.
    • For Mac OS X we will need to do something equivalent to the rpms, e.g. include these precompiled packages in the developer tarball.
    • There was a lot of discussion on this topic. Justo directed the B&T Group's attention to his (excellent) "Installation and Advance User Guide" document.
    • In the short-term, RPMs are probably the highest priority, plus tarballs. Mac OS X .dmg packages are regarded as a lower priority.
    • It was agreed that Julian would work closely with the B&T Group on MPI integration.
    • The primary topic of discussion was the reliance on the Boost library/framework. By having CASA adopt C++11 conventions in the future, the B&T Group had been aiming to remove CASA's dependency on Boost altogether, in order to reduce the size and complexity of the final codebase (Boost is large, and we are currently only using a small fraction of its capabilities). However, it became clear during the discussion that this aim had not been sufficiently widely communicated and agreed upon within the project, as the MPI team had assumed continued use of Boost functionality: so far, it is still an assumed dependency for the ASDM system, and the MPI effort (boost::mpi).
    • The MPI work currently assumes the use of Boost v. 1.41.0 (or presumably higher). The boost::mpi module used is not included by default, and so is compiled so that it works with the other system-provided components.
    • The default RHEL distribution of OpenMPI does not work for our purposes, and is replaced as part of the installation process.
    • CASA MPI has not been tested on OS X yet (this will happen).
    • Question: Would it be acceptable to the CASA management to ship CASA for Linux with MPI, but for OS X without it? (Foreseen possible objections: Users might complain; they still might want to use a Mac as a controller system).
    • Sandra also asked Shinnosuke about the number of files opened when an MMS is accessed. The relevant ticket is CAS-4860. The testing is dependent on the casacore unification project (see also below), the progress of which is tracked in CAS-6929.
  • Status of current CASA packages
    • Release - 4.2.2 out for OS X (10.8) and RHEL 5 and 6. 4.2.2 release has been successfully installed in Japan.
    • 4.3 pre-release tarball packages made available to testers in Chile and Japan. Installed on testing machines in Charlottesville and Socorro. 4.3 pre-release for OS X subsequently made available to testers in Charlottesville.
    • Two remaining 4.3 release blocker items
  • Staffing
    • New CASA Project Manager starting this week.
    • Shortlist of applications for B&T engineer assembled. Three interview candidates for this position have been identified. No news since last meeting. Onsite interviews likely to take place in January.
  • Build and Test Review Status
    • No news. Review timeline still effectively on hold, pending above decisions regarding B&T hiring strategy.
  • Casacore unification - Being worked on now. Jim asked whether Shinnosuke is getting enough time from NAOJ to work on this, or whether NAOJ could let him work on it full-time for now and catch up on his other duties afterwards. Shinnosuke feels that he is already making progress with Ger on this, and cannot defer his other NAOJ responsibilities. For reference, Jim's proposed plan was as follows:
    • 0.5. Attempt to build against the GC casacore and identify problems. DONE. Shinnosuke has effectively completed this task.
    • 0.75. Iteratively fix bugs and attempt to build again. This is Shinnosuke's B&T primary assignment for the short-term. Ongoing. Ger has been working with Shinnosuke on this. Shinnosuke has been working on gcwrap. Discussions with Ger are ongoing (see Jira ticket https://bugs.nrao.edu/browse/CAS-6929 for details). Jim reported yesterday that the Google Code people are keen for the merge to proceed. PROGRESSING. Issues encountered this week include shared_ptr errors. Dave Mehringer has also been helping out with this effort recently, in his capacity as the ImageAnalysis expert. Jim has also been contributing. It currently seems unlikely that this step will be completed by the end of the year (NAOJ is hosting the international ALMA meeting, a post-doc symposium, etc. this month).
    • 1. Merge our Casacore into GC casacore. This will require us to flush all casacore mods in and then let Ger do the final merge (finally).
    • 2. Build CASA against GC casacore (platforms?)
    • 3. Test CASA/GC casacore against our various regression tests, etc.
    • 4. Create a Jenkins job that will handle B+T of codebase using GC casacore.
    • 5. Keep a cached, read-only copy of GC casacore locally(?) to prevent a glitch at googlecode from stopping us from building. We should also maintain periodic backup copies.
    • 6. Go live and live happily ever after. ;-)
  • B&T news from Charlottesville.
    • Most of the new package production system is now in place. Already pushed out to CASA developer nodes in Socorro. Holding back on data reduction cluster deployment in CV - probably to be deployed post-4.3 release now.
    • Mark has written up the current and planned structures available so far, including version numbering. Iterated on documentation with Socorro (Rob) and EA (Kana). Still to write up: the process "behind the scenes"...
    • The 4.3 release branch is now pretty mature. Developers should continue to check in all subsequent 4.3 bug fixes to both trunk and release.
    • Updates to the launcher wrapper made available, supporting launch of the pre-release packages. User testers now have access to the pre-release package for 4.3 bug fix testing, including an OS X 10.8 package. OS X packaging automation is a work in progress by Darrell. He is currently experimenting with doing this using Gradle and Groovy. Some changes have been made regarding initial linking on OS X.
    • Darrell has (mostly) successfully run the first full set of automated tests under Mac OS X. Post-meeting note: it took 15 hours to complete (cf. ~7 hours under Linux). More computing horsepower (and more efficient tests) may be needed.
    • Two new Macs are currently being set up for Andy Hale's testing group.
  • B&T news from EA.
    • Many/most of the CASA 4.3 Single Dish jobs that had been done have been merged into the release branch now. Lots of testing has been done recently.
    • Agreement with Mark made that Bunyo can also resolve EA user testing tickets as appropriate (Dirk already sometimes does this for Europe).
    • EA have some concerns about the feasibility of supporting older OS X versions (Mavericks) - hardware will be purchased this month.
    • Shinnosuke has been working on the casacore unification issue (also see above).
  • Old Action Items
    • Kana: Send notes to Darrell on how to retrieve and build libsakura. This will be attempted in isolation on a test machine in Charlottesville, to allow Darrell to assess its implications for RPM-based installations, developer machine environments, etc. DONE. Darrell will get back to her on this.
    • Andy: Discuss Jenkins test for check-ins on OS X with Alexis: same (or different?) smoke tests for OS X as for Linux? Related: machines in Socorro for this? (It was noted during the meeting that this AI is not a pressing issue at this point).
  • Next meeting: December 16th, 2014.
  • Any Other Business
    • None.

New Action Items Arising

  • Action Item on the B&T Group: Review Justo's document and come up with a plan forward before the end of this week.
  • Action Item on Mark: Raise the Boost issue with the relevant CASA managers: do we want to work towards getting rid of Boost in CASA, or embrace it? In the past, it had been indicated that when we switch to C++11 the HPC/MPI C++ code could transition to C++11 and that Boost MPI could be phased out. Justo feels that this is not the case, but Darrell thinks it could probably still be avoided. Adopting Boost MPI as the basis for our C++ parallelization would effectively mean adopting Boost permanently (in which case, we might as well embrace it?). Darrell advocates a standalone test program to establish whether or not C++11 alone can do the job. DONE
  • Action Item on Mark: Raise the following question with the relevant CASA managers: Would it be acceptable to the CASA management to ship CASA for Linux with MPI, but for Mac OS X without it? (Foreseen possible objections: Users might be confused/complain; they still might want to use a Mac as a controller system). DONE
  • Action Item on Darrell (with Justo): Review and discuss further the Boost dependencies. ONGOING.

-- MarkRawlings - 2014-12-01
Topic revision: r6 - 2014-12-03, MarkRawlings