Thursday Morning Meeting - 31 August 2017

  • DIAL-IN NUMBERS & PASSCODES:
  • IP: 192.33.117.12##8110
  • Phone: (434) 817-6524

Attendance

  • Socorro:
  • CV:
  • Garching:
  • SCO:

News / Meetings / Visitors

  • Recruiting progress
  • Correlator upgrade - capturing potential CASA work
  • Travel planning (FY18)

Build, Release

  • Release planning for 5.1, 5.2, 5.3 and 5.4 (scope, timescales, how prereleases will be handled, etc.): https://open-confluence.nrao.edu/display/CASA/Release+Planning (This page is currently incomplete; Anand created it just before he left, and we will edit it and fill things in.)
  • Pull requests:
    • Wait until the last packaging build is done for the ticket branch before issuing a pull request. The pull request testing is done with the most recent package and will fail if there is no package available.
    • Some of the changes don't require a full test suite run. Examples are test additions, typo fixes and such. If you skip the packaging step intentionally, make a note of that in the pull request.

Verification Testing

  • HPC pipeline testing
    • Looking at listobs from any of the runs, the Sources list in the serial run contains SpwIds beyond those listed in the Spectral Windows section
    • e.g. parallel X807f and serial X807f
Spectral Windows:  (25 unique spectral windows and 2 unique polarization setups)
.
.
.
Sources: 280
  ID   Name                SpwId RestFreq(MHz)  SysVel(km/s) 
  0    J0224+0659          0     -              -            
  0    J0224+0659          25    -              -            
  0    J0224+0659          26    -              -            
  0    J0224+0659          27    -              -            
  0    J0224+0659          28    -              -            
  • Any ideas why serial includes these "phantom" spwids?
  • Is the truncation seen in parallel expected and OK?
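
One way to flag the "phantom" SpwIds described above is to scan a saved listobs text file and compare the SpwIds in the Sources section against the spectral window count in the header. The sketch below is only an illustration using the standard library: the function name phantom_spwids is hypothetical, and it assumes plain listobs text output (no logger prefixes) and that valid spw IDs run contiguously from 0 to N-1.

```python
import re

def phantom_spwids(listobs_text):
    """Return sorted SpwIds from the Sources section that are >= the spw count.

    Assumption: valid spectral window IDs are 0 .. N-1, where N is the number
    reported in the "(N unique spectral windows ...)" header line.
    """
    match = re.search(r"\((\d+) unique spectral windows", listobs_text)
    if match is None:
        raise ValueError("no 'unique spectral windows' count found")
    n_spw = int(match.group(1))

    spwids = set()
    in_sources = False
    for line in listobs_text.splitlines():
        if line.startswith("Sources:"):
            in_sources = True
            continue
        if not in_sources:
            continue
        # A non-indented, non-empty line marks the start of the next section.
        if line and not line[0].isspace():
            break
        fields = line.split()
        # Source rows look like: ID  Name  SpwId  RestFreq  SysVel
        if len(fields) >= 3 and fields[0].isdigit() and fields[2].isdigit():
            spwids.add(int(fields[2]))

    return sorted(s for s in spwids if s >= n_spw)


# Excerpt taken from the serial X807f listing quoted in these notes.
SAMPLE = """Spectral Windows:  (25 unique spectral windows and 2 unique polarization setups)
Sources: 280
  ID   Name                SpwId RestFreq(MHz)  SysVel(km/s)
  0    J0224+0659          0     -              -
  0    J0224+0659          25    -              -
  0    J0224+0659          26    -              -
  0    J0224+0659          27    -              -
  0    J0224+0659          28    -              -
"""

print(phantom_spwids(SAMPLE))  # [25, 26, 27, 28]
```

Running the same function on the serial and parallel listings would show whether only the serial one reports out-of-range IDs.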

Validation Testing

  • Current validation tally
    • 118 Under Validation, 82 Ready to Validate.
    • 24 tickets went RtV this week so far. 24 tickets went through Validation to Resolved this week so far.
    • Aug 11-17: 30 tickets assigned for validation. 50 tickets resolved.
    • Aug 18-24: 9 tickets assigned for validation. 12 tickets resolved.
  • Current Testing Efforts for 5.1 deliverables
    • not much. PL to be merged in the coming days, final package testing to be done by Juergen and me by midweek next week.
  • Current Testing Efforts for 5.2 deliverables (specific to HPC, ALMA pipeline specific)
    • many tickets closed in recent days with fixes in 5.1
    • coordination for the next level of science validation testing between PLWG/HPC group/Bjorn/(user testers as needed) is on the horizon
  • Current Testing Efforts for 5.3 deliverables
    • statwt2 (CAS-10530)
      • VLA testing scheduled early in the development cycle
    • systematic tests of mstransform
      • is this CAS-5174 (or something else)?
    • plone/CASAdocs
      • complete tasks and incomplete chapter pages in 5.3, complete tools in 5.4
      • Bjorn will be taking a bigger role in getting unfinished docs completed
      • 5.3 task deprecation ticket: CAS-10595
    • further parallelization testing (non-ALMA pipeline specific)
    • polarization calibration
    • miscellaneous

Architecture

HPC

  • Current prereleases (-61 to -67) tests: https://open-confluence.nrao.edu/display/CASA/Tests+with+%3E%3D+prerelease-5.1.0-61
    • Serial tests: all good, only 3 very long ones still running without issues so far.
    • Parallel tests re-started. Some issues are arising but going much more smoothly now with prereleases-6x.
    • Some 10 tests running very recent prerelease-5.1.0-65/-67 (Pipeline-CASA51-P2-B.40759...Pipeline-CASA51-P2-B.40774) without issues.
  • Changes introduced throughout August produced large runtime differences, with examples of speedups of up to 50% and slowdowns of up to 100%. Next week we should have a more complete picture of changes between older (https://open-confluence.nrao.edu/display/CASA/Tests+before+prerelease+5.1.0-59) and more recent runs (https://open-confluence.nrao.edu/display/CASA/Tests+with+%3E%3D+prerelease-5.1.0-61).
  • Issues that can be closed (pipeline issues with tickets formally open but actually fixed in the pipeline before 5.1):
    • CAS-10431 - ALMA pipeline fails in imageprecheck.py with "No spectral window with ID '25' found"
    • CAS-10614 - pipeline.infrastructure.tablereader::msmetadata_cmpt.cc::transitions Exception Reported: Exception: SOURCE table does not contain a row with SOURCE_ID=2 and SPECTRAL_WINDOW_ID=0
  • Also this remaining "Imager_BugFixes_5.1" issue is fixed:
    • CAS-10434 - Error "SelectData has to be run before defineImage" in parallel tclean for an ALMA pipeline test dataset
  • Horizon for parallel imaging for 5.2: looks sunny. With ~15 parallel tests finished or ongoing, only these two issues arose: CAS-10662, CAS-10672
  • Parallel vs. serial validation. Can we conclude that https://open-jira.nrao.edu/browse/CAS-10481 is now fixed and that there are no other related issues open?
    • Weblogs (serial + parallel) now available for some of the top test datasets which were failing before, like T007 and T010.
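
The speedup and slowdown percentages quoted in this section can be read in more than one way. The sketch below shows one possible convention, assumed here rather than taken from the test pages: the signed percentage change in wall-clock runtime relative to the older run, so that a doubled runtime is a 100% slowdown and a halved runtime is a 50% speedup. The function name is hypothetical.

```python
def runtime_change_percent(old_seconds, new_seconds):
    """Signed runtime change in percent relative to the older run.

    Positive values mean the newer run is slower, negative values faster.
    This convention is an assumption made for illustration.
    """
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return 100.0 * (new_seconds - old_seconds) / old_seconds


# A one-hour test that now takes two hours: 100% slowdown.
print(runtime_change_percent(3600, 7200))  # 100.0
# A one-hour test that now takes half an hour: 50% speedup.
print(runtime_change_percent(3600, 1800))  # -50.0
```

Stating the convention next to the numbers on the comparison pages would avoid confusing a "50% speedup" in runtime with a 1.5x gain in throughput.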

Development

  • CASA 5.1
    • All pre-planned work is in place.
    • A few new issues have come up that have been deflected away from the imminent 5.1 release.
      • CAS-10664 : tclean's mtmfs doesn't write 'OBJECT' name in the header. Should go into early 5.3 (and 5.2 if it matters that much)
      • CAS-10644 : Filler needs to detect duplicate rows in the ASDMs. This was noticed in high time resolution data (ALMA/Solar SDimaging) but the ICT put in a temporary fix. They have a more permanent spot fix, but we (CASA) still need to also be able to handle this in a more generic way for CASA 5.3.
      • Support of Solar TCals in the SDM and MS : The VLA online system added a detail at the last minute that CASA can't yet handle. They're working around it for now, but they need it on a 'couple of months' timescale. Discussion is ongoing about what/where the correct fix should be. Once this is determined, we need to discuss if this is something that will require a 5.1.1 or if they can simply use 5.2 if that is also to be produced on a similar timescale.

  • CASA 5.2
    • Initial discussion with Remy/Lindsey on coordinating the next round of ALMA pipeline parallel tests
      • We will use the 5.1 release to repeat serial/parallel tests
      • For the rest of the tests, we need not do the work of parallelizing the entire 'pipeline environment', but we should do the following
        • Run a few-line script per test to generate antenna-position.csv and fluxscale.csv files on disk, before re-running the pipelines. This will allow us to skip the lengthy ASDM downloads, etc, but still give us the required numerical accuracy.
        • Re-use some analysisUtils methods to automate comparisons of pipeline outputs (Federico asked for this, and Remy already has code for this)
      • Pick one test dataset to prototype running with the 'PPR' (pipeline environment). This is to provide input to whoever is going to do the actual code for parallelizing the ALMA pipeline setup (not Lindsey's group, not Remy's group). Federico has already done some work on this (right?).
    • We will meet in the next month for further coordination. Lindsey, Remy and I are proposing that we (CASA) stop after the above and hand things over to the PLWG for their validation and parameter tweaks/tests. After parallelized pipeline scripts have been made (by someone else), we can step back in at the end to assist in just running it all systematically with the kinds of monitoring metrics Federico/Sandra have been gathering. Does this sound OK to the HPC group ?
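
As an illustration of the automated comparison of outputs mentioned above, the following is a minimal sketch that compares two CSV files cell by cell with a relative tolerance. It is a generic standard-library stand-in, not the analysisUtils code Remy has; the function name compare_csv, the tolerance, and the idea of applying it to files such as antenna-position.csv or fluxscale.csv from a serial and a parallel run are assumptions, and the real column layouts of those files are not specified here.

```python
import csv
import math

def compare_csv(path_a, path_b, rel_tol=1e-6):
    """Return a list of (row, column, value_a, value_b) entries that differ.

    Numeric cells are compared with a relative tolerance; all other cells
    are compared as stripped strings.
    """
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        rows_a = list(csv.reader(fa))
        rows_b = list(csv.reader(fb))

    differences = []
    if len(rows_a) != len(rows_b):
        differences.append(("row count", None, len(rows_a), len(rows_b)))

    for i, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        if len(row_a) != len(row_b):
            differences.append((i, "column count", len(row_a), len(row_b)))
            continue
        for j, (a, b) in enumerate(zip(row_a, row_b)):
            try:
                same = math.isclose(float(a), float(b), rel_tol=rel_tol)
            except ValueError:
                # At least one cell is not numeric: fall back to text comparison.
                same = a.strip() == b.strip()
            if not same:
                differences.append((i, j, a, b))
    return differences
```

An empty list means the two files agree within the tolerance; anything else lists the row and column indices of each mismatch, which could feed the kind of monitoring Federico and Sandra have been gathering.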

  • CASA 5.3
    • I presented the current plan to the stakeholders group yesterday. No big surprises for anyone, but the VLA did give me some new items only now, and I got some responses to questions I had for them. I made them aware of timescale on which we need testing feedback and detailed requirements from them (in order to stick with our schedule). I will update the planning doc and post the final version to the release planning confluence page.
    • Is there anything we can do for 5.3 w.r.t. bug fixes of our current viewer? (The future of the viewer/CARTA and timescales were also a big concern, but that's not something we can do much about right now. Something for the new CASA Lead to deal with.)
    • Please make your JIRA tickets for 5.3 (if you haven't already done so). I'm getting close to when I will need to find (or make) them and connect them to the planning document. Parent tickets (just for tracking) will suffice. Thanks !

  • Automated testing of Imaging should be part of one of these cycles with high priority.
    • There is a lack of e2e regressions running with tclean. Most of them still run with clean.
    • There is a vicious cycle lately of problems being fixed for parallel issues that break the serial mode and vice-versa. These cases need to be caught beforehand using: C++ unit tests and functional tests of tclean.
    • Several of the recent bug fixes in tclean can already become a test. Are they being added to the functional test list?

  • Is there any plan to add tests to the pipeline? One could start with high-level tests of the main stages and move down progressively. I believe Vincent started implementing this last year, but the priorities shifted and the work didn't continue (?!)

Pipeline

AOB

Developer Reports

Thursday Meeting
  • Sanjay Bhatnagar
  • Sandra Castro
    • Catching up on > 1000 emails to read
    • HPC-related work
  • Lindsey Davis
  • Bjorn Emonts
  • Pam Ford
  • Enrique Garcia
  • Bob Garwood
    • CAS-10621 Target in image appears at incorrect coordinates. This may be an issue with the Ephemeris table. Investigating how that table gets interpreted by the filler.
    • Discussions on Tcals and VLA solar data.
  • Kumar Golap
  • Jeff Kern
  • David Mehringer
  • George Moellenbrock
    • Designing better tests for calibration VisCals, anticipating improvements to polarization terms
    • Tcal discussions with EVLA folks
    • EOP discussions with Ed
    • Some "science" (well, processing some eclipse pictures in python at the CASA prompt!)
  • Dirk Petry
  • Martin Pokorny
    • Create casa-5.1.0 branch in casacore
    • Add commit for SMA_SYSTEM_TEMPERATURE table to casa-5.1.0 casacore branch
    • Continue implementation of asynchronous VI2
    • [Complete paper for SC17 re realfast@VLA]
    • [Watch total solar eclipse]
  • Federico M Pouzols
    • Finishing serial ALMA pipeline tests (for 5.1), and starting parallel ALMA pipeline tests (towards 5.2).
    • CAS-10662: preparing tests to try to reproduce and check if a bigger MPI buffer allocation would fix this or if it is something else.
    • Checking various mstransform/cvel issues, CAS-10446, CAS-10584, CAS-10051, CAS-9241.
  • Urvashi Rao
    • Emails/conversations and some code/validation work to push through the last couple of tickets for 5.1
    • Initial reading/thoughts about the impact of ALMA's proposed correlator upgrade on CASA.
  • Darrell Schiebel
  • Ville Suoranta
  • Takahiro Tsutsumi
    • setjy: worked on Perley-Butler 2017
Friday NAOJ Meeting
  • Kanako Sugimoto
  • Wataru Kawasaki
  • Masaya Kuniyoshi
  • Takeshi Nakazato
  • Renaud Miel

Minutes

  • 5.1 build was made on Wednesday. Getting ready to start packaging with pipeline from Fri evening or Tues morning. Monday is a holiday here
  • Verification : Identified an inconsistency in how MS metadata is handled between serial and parallel runs. This relates to how importasdm calls partition (for serial vs parallel). Sandra/Bob will follow up on this and make a ticket to address the problem at the source. Keep Lindsey in the loop as the pipeline may expect some of the untouched metadata.
  • Need to sort out uvcontsub, uvcontsub2, uvcontsub3 and the one in mstransform. Bjorn pointed out user confusion over which to use, especially since features aren't fully replicated in the new code yet.
  • HPC : Current tests are going OK. Need more discussion with PLWG on the next round of tests.
  • Dev : Solar TCals : This looks like it will be a simple fix in gencal (from George). We will likely need a 5.1.1 patch for this (and maybe CAS-10664 too). Morgan will iterate with the VLA online system to find out timescales.

-- MorganGriffith - 2017-08-15
Topic revision: r14 - 2017-08-31, UrvashiRV