Thursday Morning Meeting, 17 August 2017

  • DIAL-IN NUMBERS & PASSCODES:
  • IP: 192.33.117.12##8110
  • Phone: (434) 817-6524

Attendance

  • Socorro:
  • CV:
  • Garching:
  • SCO:

News / Meetings / Visitors

  • New CASA Lead - Ryan Raba, starts 9/25
  • Continuing to interview for CARTA position
  • VLASS readiness assessment yesterday, CASA and pipeline on track

Build, Release

  • Pull requests:
    • Pull requests should be created only after verification/validation testing is complete (i.e. the ticket is Resolved).
    • A pull request automatically launches Branch Test Suite 3. Before making a pull request, please check the commits to see when master was last merged into the branch. If the branch is very out of date, Branch Test Suite 3 will probably fail.
    • In general, Bitbucket branches will be deleted after a pull request is merged, unless the developer asks to keep it. The branch will remain in the local repo until deleted (git branch -d branch).
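
The "how out of date is my branch" check above can be scripted. The following is a minimal sketch, not an agreed team procedure: it assumes the integration branch is named `master`, and the helper names and the `max_behind` threshold of 50 commits are hypothetical choices for illustration. It relies on `git rev-list --left-right --count master...branch`, which prints the number of commits only on master followed by the number only on the branch.

```python
import subprocess


def parse_behind_ahead(rev_list_output):
    """Parse the two counts printed by
    'git rev-list --left-right --count master...branch'.

    The first number is commits on master that the branch lacks (behind),
    the second is commits on the branch that master lacks (ahead).
    """
    behind, ahead = rev_list_output.split()
    return int(behind), int(ahead)


def branch_is_stale(behind, max_behind=50):
    """Return True if the branch lags master by more than max_behind commits.

    The default threshold is an arbitrary illustration, not a project rule.
    """
    return behind > max_behind


def commits_behind_master(branch, master="master"):
    """Run git in the current repository and return (behind, ahead)."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"{master}...{branch}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_behind_ahead(result.stdout)
```

A developer could call `commits_behind_master("my-branch")` before opening a pull request and merge master first if `branch_is_stale` returns True.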

  • Planning for 5.2/5.3/5.4 or 5.1.1/5.2/5.3 : Discussions are still ongoing as to whether we'll need anything non-standard or not on the Dec 2017 timescale. If needed, we'll call it either 5.1.1 or 5.2 depending on what goes into it. Either way, it will be managed as a separate branch so that all other development (unrelated to this) can proceed normally on master.

  • Release Logistics and Semantics ( Notes from Darrell )
    • patch -- a patch is a change which fixes bugs which do not have a systematic effect. A CASA stakeholder who updates from CASA 5.1 to CASA 5.1.1 should be able to assume that the results they obtain will not differ significantly and the user experience will have no significant changes. The patch only fixes bugs which are "spot fixes" in the view of those who understand the relevant code within CASA. These fixes are expected to not affect all users of CASA. Users can assume that updating to a newer patch version does not require revalidation.
    • release -- a release contains new features or bug fixes which significantly affect the behavior of CASA in some way for some class of CASA users. When users update to a new release, they should assume that they must validate their use of CASA.
    • As CASA becomes a production system, it is important that this distinction be clear. The "patch" fix for ALMA cannot be allowed to introduce a "bug" for VLASS and vice versa. A "patch" puts the onus of consistency validation upon the CASA group whereas a "release" puts the onus of consistency validation upon the users of CASA. However, this distinction is not about shifting blame, but rather it is about a consistent understanding of the implications of a particular distribution of CASA.
    • Using these definitions, the imaging changes which seem likely to be required for parallel execution of the pipeline lie outside of what can be considered a "patch". To accommodate parallel development, we will work on CASA 5.2 and 5.3 concurrently:
      • The CASA 5.2 branch will contain the December release candidate. This branch will be created soon after CASA 5.1 is released. Developers who are working on features for the 5.2 HPC release (in December) will merge their changes into both the master branch and the CASA 5.2 branch.
      • The master branch will include the CASA 5.2 changes but will also include any changes for CASA 5.3 which will be released in the spring/summer.
      • The CASA 5.1 branch can be used for any "spot fixes", i.e. patch changes, which could be provided as CASA 5.1.1. However, we do not anticipate any patches.
    • The situation is still somewhat fluid, so it could still be that we decide not to have an extra release of CASA in December.
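
The patch/release distinction above maps onto version numbers in a simple way. The sketch below assumes versions are written as `X.Y` or `X.Y.Z` (as in 5.1, 5.1.1, 5.2); the function names are hypothetical and this is not an existing CASA utility. A change in only the third component is treated as a patch, anything else as a release.

```python
def parse_version(text):
    """Turn '5.1' or '5.1.1' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"expected X.Y or X.Y.Z, got {text!r}")
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def classify_update(old, new):
    """Return 'patch' if only the third component increased, else 'release'."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[:2] == old_v[:2]:
        return "patch"
    return "release"


def validation_owner(kind):
    """Who carries the onus of consistency validation, per the notes above."""
    return {"patch": "CASA group", "release": "users"}[kind]
```

For example, `classify_update("5.1", "5.1.1")` gives `"patch"` (validated by the CASA group), while `classify_update("5.1", "5.2")` gives `"release"` (users should revalidate).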

Verification Testing

  • Farewell Akeem!
    • Thanks for all his hard work and dedication to the test team these last 3 years
    • His last day is Friday
  • CAS-10481 : rms inconsistencies between serial/parallel runs of the ALMA pipeline. Need to re-run and re-evaluate after all the recent data-selection / csys creation bug fixes. Bjorn/Andy - any further insights ?
    • Should be rerun using the tarball created by CAS-10434
  • Working with Bjorn on scripting benchmarking on serial vs. parallel in CASA
    • Discussion on where to put the data, what machine can be used to run the scripts, and a plan to make that happen
  • Test team is MIA next week - Andy in Chile (ALMA IRM meetings and training), Akeem is gone, and Puimek is on vacation

Validation Testing

  • Current validation tally
    • 123 Under Validation, 77 Ready to Validate.
    • 17 tickets went RtV this week so far. 42 tickets went through Validation to Resolved this week so far.
    • Aug 4-10: 24 tickets assigned for validation. 30 tickets resolved.
  • Jen is unavailable 8/18-8/25 (organizing a meeting, internet access likely to be unreliable). Go outside and (safely) watch the eclipse!
  • Current Testing Efforts for 5.1 deliverables
    • autoboxing. I believe everything has been merged already.
    • mosaic issues. I believe everything has been merged already; the slowdown is real, due to fixing a 4.7.2 bug in channel chunking (CAS-10317).
    • statwt2
      • 5.2 testing ticket is CAS-10530
    • parallelization
      • HPC team has switched over to running serial tests to check serial results given several new parallel fixes
      • early results (Federico on CAS-10538) indicate serial run completes without errors; how do the images look?
    • miscellaneous
    • plone
      • complete tasks in 5.2, complete tools in 5.3
      • 5.2 task deprecation ticket CAS-10595
  • Upcoming wish list for big ticket testing items in 5.2 release
    • statwt2, systematic tests of mstransform, parallelization, plone task validation.
    • anything else?

Architecture

HPC

  • Status, what is running, what and how failed: https://open-confluence.nrao.edu/display/CASA/ALMA+Pipeline+Cycle+5+Testing
  • Note/warning: results in the weblogs and tables were generated with a wide range of prereleases (5.1.0-34...5.1.0-58 and a number of specific bugfix branches, all from the last month of development, with many fixes and follow-up fixes in between). At the moment, we are trying to get some fresh reruns of selected datasets for validation (Bjorn and Andy, CAS-10481).
  • cvpost065-068 machines being upgraded (infiniband net for all + 64->256 GB for 2 machines)
  • After many fixes went into master, for parallel mode only this test-blocking issue is currently open: CAS-10538. Not sure how often it happens, and could not test much because most if not all parallel runs have also been blocked by CAS-10536.
  • It looks like current master is good for serial mode, but we have only sparse runs/evidence.
  • For the serial test runs in the coming weeks, should we try to parallelize the VLASS way, with 2 or more serial runs on the same machine? For those serial runs where find_cont doesn't use ~150 GB, it should be possible to run at least 2 serial tests in parallel on the cvpost065-068 machines. Any experiences with this?
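
The question of how many serial runs fit on one machine comes down to simple memory arithmetic. The sketch below is illustrative only: the function name and the optional `reserve_gb` headroom are hypothetical, and the only figures taken from the notes are the 64 GB and 256 GB machine sizes and the ~150 GB find_cont peak. Any other per-run memory figure is an assumption.

```python
def max_concurrent_runs(machine_mem_gb, peak_mem_per_run_gb, reserve_gb=0):
    """Number of serial runs that fit in memory at the same time.

    machine_mem_gb: total RAM on the node.
    peak_mem_per_run_gb: worst-case memory use of one serial run.
    reserve_gb: optional headroom kept free for the OS (assumed, default 0).
    """
    if peak_mem_per_run_gb <= 0:
        raise ValueError("peak memory per run must be positive")
    usable = machine_mem_gb - reserve_gb
    return max(0, int(usable // peak_mem_per_run_gb))
```

With a ~150 GB find_cont peak, `max_concurrent_runs(256, 150)` gives 1, so only datasets with a smaller peak can be doubled up on an upgraded node; at a hypothetical 100 GB peak, `max_concurrent_runs(256, 100)` gives 2.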

Development

  • Autoboxing : Nearly all edits have merged to master. Tak - status on the last one ?

  • Imager_BugFixes_5.1 : 25 issues in all. 4 are left. (Included in this list are HPC specific bugs/fixes : 2 out of 8 are left ). Need not hold the 5.1 branch for any of these.
    • CAS-10451 : json parser fix : The fix resolves the problem (validation), but there were some hiccups in verifying that the incoming changes from casacore didn't break something else. Martin ?
    • CAS-10538 : KG is looking at a fix for the parallel run failure. Federico has run it in serial to make sure it is OK for 5.1 even if a fix does not go in. If KG fixes it in the next week or so, we'll decide then whether to include it in 5.1 or have it only on master.
    • CAS-10317 : Mosaic slowdown. VLA case : being worked on. ALMA case : all is as expected : the 5.1 run took longer because it did more work (which 4.7.2 failed to do because of a bug).
    • CAS-10264 : Need to fix a log message. Very minor. Will put in on master as well as 5.1 in the next few days.
    • A few tickets (~3) moved to 5.2.

  • New bugs : CAS-10525 : tclean ignoring negative flux - fixed/merged. CAS-???? : residual image scaling isn't what it should be with mosaics (KG looking) - probably 5.2 issue.

  • CrashReporter : CAS-10524 : Ville ?
  • Deprecation warnings.
  • Anything else ?

  • 5.2 development
    • Please continue to add in your JIRA tickets with fixVersion='casa 5.2'.
    • CAS-10595 has been created to track which tasks need to begin the deprecation process in 5.2 (or to be labeled as 'uses old code / will replace').

Pipeline

  • Preparations for creating the C5P2 branch scheduled for Monday August 21
    • Development "frozen"
    • Draft release notes are here: https://wikis.alma.cl/bin/view/DSO/PipelineHeuristicsReleases2017
    • Ongoing ALMA pipeline testing at JAO, NAASC
      • Last minute autoboxing / image size mitigation related imaging tickets are under validation
      • Major issues with the ALMA quasar catalog service / flux.csv file have been resolved
      • Several small web log improvements are in progress
    • Ongoing VLA pipeline testing at DSOC
      • Small web log fixes
  • Some pipeline team members have other work commitments and limited availability for the next few weeks

AOB

  • Many of these warnings have appeared recently:
    • In pipeline tasks such as pipeline.infrastructure.tablereader::ms, pipeline.hif.tasks.rawflagchans.rawflagchans::ms, pipeline.hif.tasks.correctedampflag.correctedampflag::ms
      • The use of ms::iterorigin() is deprecated and will be replaced by iterorigin2() in a future version. After deprecation, iterorigin2() will be renamed iterorigin().
      • The use of ms::getdata() is deprecated and will be replaced by getdata2() in a future version. After deprecation, getdata2() will be renamed getdata().
    • And also in CASA.
      • flagcmd::ms::range The use of ms::range() is deprecated and will be replaced by range2() in a future version. After deprecation, range2() will be renamed range().
      • task_setjy: ms::nrow
    • Will it be a backwards compatible replacement or do we need to worry about this?
    • ms tool warnings will be removed from 5.1: CAS-10597. New functions have same signature and results as old ones
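
For readers wondering what "same signature and results" means in practice, here is a minimal sketch of the general deprecation-shim pattern. It uses a hypothetical `ToyMS` class with made-up data, not the real CASA ms tool, whose internals are not shown in these notes. The old method name only emits a warning and delegates to the new one, so callers get identical results either way.

```python
import warnings


class ToyMS:
    """Hypothetical stand-in for a tool object; NOT the CASA ms tool."""

    def __init__(self, rows):
        # rows: list of dicts, one per row (illustrative data model only)
        self._rows = list(rows)

    def getdata2(self, items):
        """New implementation: return one list of values per requested item."""
        return {item: [row[item] for row in self._rows] for item in items}

    def getdata(self, items):
        """Old name: warn, then delegate so the results are identical."""
        warnings.warn(
            "getdata() is deprecated and will be replaced by getdata2()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.getdata2(items)
```

Because the old name forwards its arguments unchanged, existing pipeline calls keep working, and the later rename of `getdata2()` to `getdata()` would not change what callers receive.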

Developer Reports

Thursday Meeting
  • Sanjay Bhatnagar
  • Sandra Castro
  • Lindsey Davis
    • Drafted pipeline C5P2 release notes
    • hifa_timgaincal tasks web log improvements
    • Followup on SCOPS pipeline testing tickets
    • HPC ticket monitoring and followup
  • Bjorn Emonts
  • Pam Ford
    • pull requests merged for smallish plotms bug fixes: popups (CAS-10537), uv-related axis seg fault (CAS-10534)
    • created cal tables for plotcal->plotms testing, started on reference antenna selection issue (CAS-7049)
  • Enrique Garcia
    • CAS-5174 - UV Continuum Subtraction TVI
    • CAS-10211 - improve the approximations for effective bandwidth and effective resolution in split/mstransform
    • CAS-10013 - StatWt Rework: Add support to mstransform
  • Bob Garwood
    • CAS-10278 - Synchronize shared ALMA/CASA code and eliminate compiler warnings. Trying to help find suitable data to test importasdm -> exportasdm to make sure nothing has changed.
    • Some tests related to the asdm move to a separate package.
    • Learning Java.
  • Kumar Golap
  • Jeff Kern
  • David Mehringer
  • George Moellenbrock
  • Dirk Petry
  • Martin Pokorny
    • Completed casacore changes to MSIter for asynchronous VI2
    • Updated casacore submodule reference in CAS-10451.
  • Federico M Pouzols
  • Urvashi Rao
    • Fixed CAS-10525 : cleaning negative flux.
    • Testing for CAS-10451 : json fixes.
    • Ongoing discussions with various parties about whether a 5.1.1 patch or something else on a Dec 2017 timescale is required and how we will handle this.
    • Many other discussions and JIRA patrolling.
  • Darrell Schiebel
  • Ville Suoranta
  • Takahiro Tsutsumi
    • Took a look at CAS-10481. Some of the results seem to be affected by CAS-10434. Probably worthwhile to rerun with the latest CASA version.
    • Verified that one of the fixes made by Kumar and Sanjay for CAS-10434, in imager_parallel_cube.py, was correct (it was probably a bug introduced by me a while ago…).
    • CAS-10250: Retried reproducing the issue with the current pre-release. Also noticed that the original post used restoringbeam='common', where the chan0 PSF was quite a bit larger than those for the rest of the channels.
    • CAS-10462: Worked on resolving the conflict with master. It has now been merged.
Friday NAOJ Meeting
  • Kanako Sugimoto
  • Wataru Kawasaki
  • Masaya Kuniyoshi
  • Takeshi Nakazato
  • Renaud Miel
-- PamFord - 2017-08-10
Topic revision: r16 - 2017-08-17, PamFord