Thursday Morning Meeting - 26 October 2017

  • DIAL-IN NUMBERS & PASSCODES:
  • IP: 192.33.117.12##8110
  • Phone: (434) 817-6524

Attendance

  • Socorro:
  • CV:
  • Garching:
  • SCO:

News / Meetings / Visitors

Users

  • CASA users discussion group, India, GMRT & VLA (Ruta Kale, NCRA Pune, + 25 others):
    • Tasks "gaincal" and "bandpass": the statistics used to find solutions are an issue. CASA uses the "mean" as the statistical method for the solution, while AIPS uses "robust" or other better methods. Request to replicate the statistical methods from AIPS.
    • Task "plotms": need an easy method to plot Stokes "IQUV" visibilities. At the moment, Stokes parameters have to be calculated manually outside of CASA before being plotted.
  • Helpdesk JIRA tickets (attachment bottom)
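The two user requests above can be illustrated with a minimal sketch. It is not CASA or AIPS code. The Stokes formulas assume ideal, calibrated correlations in the common convention RL = Q + iU (circular feeds) and XY = U + iV (linear feeds), with no parallactic-angle rotation. The median is used only as a stand-in for a "robust" estimator, since the exact AIPS method is not described in the request. All function names are hypothetical.

```python
from statistics import mean, median


def stokes_from_circular(rr, rl, lr, ll):
    """Return (I, Q, U, V) for one visibility sample from circular-feed correlations.

    Assumes RR = I + V, LL = I - V, RL = Q + iU, LR = Q - iU.
    """
    i = (rr + ll) / 2
    q = (rl + lr) / 2
    u = (rl - lr) / 2j
    v = (rr - ll) / 2
    return i, q, u, v


def stokes_from_linear(xx, xy, yx, yy):
    """Return (I, Q, U, V) for one visibility sample from linear-feed correlations.

    Assumes XX = I + Q, YY = I - Q, XY = U + iV, YX = U - iV.
    """
    i = (xx + yy) / 2
    q = (xx - yy) / 2
    u = (xy + yx) / 2
    v = (xy - yx) / 2j
    return i, q, u, v


def average_gain(samples, robust=False):
    """Average complex samples with the mean, or with the median as a robust stand-in.

    Real and imaginary parts are averaged separately.
    """
    reals = [s.real for s in samples]
    imags = [s.imag for s in samples]
    estimator = median if robust else mean
    return complex(estimator(reals), estimator(imags))


# Four well-behaved samples and one outlier.
samples = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j, 11 + 0j]
print("mean  :", average_gain(samples))
print("median:", average_gain(samples, robust=True))
```

With the single outlier in `samples`, the mean is pulled to 3 while the median stays at 1, which is the kind of difference the users' request is concerned with.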

Build, Release

  • Casacore won't build on XCode9 + OSX 10.12 due to a cmake bug. XCode9 + OS X High Sierra seems fine. Will test updating to Cmake 3.10 as soon as the new Macports port is available.
  • Testing new pull request build. This should simplify the merge process by pushing all of the merges through the same test pipeline. For now the PR process should be as before: wait for the prior packaging builds to complete and then issue a PR.

Verification Testing

  • CASA 5.2 HPC pipeline testing results https://open-confluence.nrao.edu/pages/viewpage.action?spaceKey=CASA&title=Tests+for+CASA+5.2
    • Focusing on imaging (steps 33 and 35 - hif_makeimages): for line-free moment 0 and moment 8, it looks like parallel is not applying a mask
    • Parallel Moment 0:
      uid___A002_Xc24c3f_Xcd.s35_0.NGC1068_sci.spw25.repBW.I.iter1.image.mom0_fc.sky.parallel.png
    • Serial moment 0:
      uid___A002_Xc24c3f_Xcd.s35_0.NGC1068_sci.spw25.repBW.I.iter1.image.mom0_fc.sky.serial.png
    • seems the case for all data sets I looked at
    • For the dataset E2E5.1.00007.S_2017_10_10T17_30_27.593 (T045)
      • For parallel runs step 33 and 35 both have this warning: Warning! No automatic clean mask was found despite clean residual peak / scaled MAD > 10, check the results. Field NGC1068 Intent TARGET SPW 25
      • No such problem in serial
  • CAS-10844 has been put on hold
    • Seems to be a Nose issue, and requires the bug in Nose be fixed
    • the --all option in runUnitTest will not work in the meantime
    • Could Darrell give it a look?
  • Updated the testing for 5 CASA guides
    • Once OSX platform testing is confirmed to work, they will be put into the framework
    • 3 topical VLA, 2 ATCA
  • The runUnitTest / runRegressionTest merge work continues
    • Unit tests work in the new script
    • Some regression tests work
      • The issue is that there is no standard regression test format
      • Working on a new regression test template
      • Some existing regression tests will need to be updated
  • Compared untar times for master packages (5.3, 5.2, 5.1.1)
    • It seemed like recent tarballs were unpacking slowly
    • Testing confirmed it was transient and possibly tied to a cluster node (testing was done on a local linux desktop)
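The warning quoted above for dataset T045 compares the clean residual peak with a scaled MAD. The following is a minimal sketch of that kind of check, not the pipeline's actual heuristic. It assumes the scaled MAD is the median absolute deviation multiplied by 1.4826 (the usual Gaussian-equivalent factor), that the peak is the largest absolute residual, and that both are measured on the same set of pixels. The function names and the `mask_found` flag are hypothetical.

```python
from statistics import median

# Factor that scales the MAD to the standard deviation of a Gaussian.
MAD_TO_SIGMA = 1.4826


def scaled_mad(values):
    """Median absolute deviation about the median, scaled to a Gaussian sigma."""
    center = median(values)
    deviations = [abs(v - center) for v in values]
    return MAD_TO_SIGMA * median(deviations)


def peak_to_noise(residual):
    """Ratio of the largest absolute residual to the scaled MAD."""
    noise = scaled_mad(residual)
    peak = max(abs(v) for v in residual)
    if noise == 0:
        # Avoid dividing by zero for a perfectly flat noise estimate.
        return float("inf") if peak > 0 else 0.0
    return peak / noise


def should_warn(residual, mask_found, threshold=10.0):
    """True when no mask was found even though the residual peak is significant."""
    return (not mask_found) and peak_to_noise(residual) > threshold


noisy_with_source = [-2.0, -1.0, 0.0, 1.0, 2.0, 50.0]
print(should_warn(noisy_with_source, mask_found=False))
```

Running the same check on the serial and parallel residual images of one field would show whether the two modes disagree on the noise estimate or only on whether a mask was produced.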

Validation Testing

  • Current validation tally
    • 120 Under Validation, 85 Ready to Validate.
    • 3 tickets went RtV this week so far. 4 tickets went through Validation to Resolved this week so far.
    • Oct 13 - 19: 4 tickets assigned for validation. 6 tickets resolved.
  • Current Testing Efforts for 5.2 deliverables
  • Current Testing Efforts for 5.3 deliverables
    • imaging issues/autoboxing refinements
    • statwt2 (CAS-10530)
    • systematic tests of mstransform
    • plone/CASAdocs
    • further parallelization testing (non-ALMA pipeline specific)
    • polarization calibration
    • testing snapshot -- please see https://safe.nrao.edu/wiki/bin/view/Software/CASAUserTesting for the up-to-date list of tickets and testing status
      • testing ongoing:
        • plotants tickets (Pam, few more)
        • plotms/plotbandpass capability (Pam, CAS-9053)
        • statwt2 (Dave, CAS-10530 and others)
        • imhistory tickets (Dave, several)
        • phase center shift in plotms (CAS-8431)
        • averaging in flagdata (CAS-6215)
        • frequency interpolation of 'linearflag' for BP tables with good solutions (George, CAS-10772)
        • ms tool selectinit (Pam, Sandra, CAS-10818)
        • simulations (Remy, several)
        • ImageCollapser::collapse() should be optimized for degenerate axes (Dave, CAS-9005)
        • Implement ACCOR equivalent (George, Walter, CAS-10366)
        • immath 'poli' implementation (Dave, CAS-9123)

Architecture

HPC

Development

  • CASA 5.2
    • CAS-10853 : Imaging error : Selecting zero rows on one partition... (problem or no?)
    • CAS-10453 : Calibration error : Selecting zero rows (not a problem?)
    • Refconcat image (and) concat inefficiency : Not yet begun to look ( KG is busy with debugging divergence issues + workshop )
    • CAS-10849 : Need a fixed 'common' restoring beam ( Remy says it's a blocker for ALMA, current implementation won't solve it, will need some work around code both in CASA and the pipeline ).
    • Qn for Ryan : What is the PLWG saying with respect to our current status on this ? Are they going to turn on parallel pipeline next month or no ? We may be ready for something to be turned on, but it may not give perfect speedup every time. Is CAS-10849 the only thing that's blocking them ?
  • CASA 5.3
    • Filler traffic ( Bob ? Another MS table with duplicate rows ? )
    • Statwt re-requirements (nothing yet for Dave). FindContinuum : Waiting for Todd (end of Oct).
    • A couple of imager crash tickets (hopefully edge cases) : no progress yet.
    • Resource predictor (Federico ? )
    • Anyone/anything else ? ( If there is silence.... in two weeks we'll request everyone to summarize where they are.... )
  • CASA.NEXT
    • A request came in from Luke Maud (via Dirk Petry) about getting an algorithmic detail in WVRCAL fixed. There is a scale factor that can be calculated to improve short-term phase calibration, especially at the highest ALMA frequencies. He has a standalone python 'package' published with a paper describing this that he has made available to the user community on his own, and is asking about getting it into CASA.

Pipeline

  • CASA 5.1 / Pipeline
    • Release accepted. ALMA restore data issues settled sanely for now.
    • Many Cycle 5 lifecycle developments are impacting the pipeline directly or indirectly
  • CASA 5.2 / Pipeline
    • Is the moment 0 and 8 behavior reported above new ?
    • The no automatic clean mask found message may be triggered in serial mode if auto-boxing fails. Don't know if differences between serial and parallel mode are possible given the current status of tclean. Even if so this should not happen all the time.
  • CASA 5.3 / Pipeline
    • Session support infrastructure in testing on pipeline branch. Includes spw mapping by spw name within sessions. This needs to be expanded to imaging tasks.
    • Improved loglevel debug and trace support
    • Cleanup of pipeline display code
    • Ongoing investigation of hif_checkproductsize inefficiencies
    • improved WVR flagging heuristics

AOB

  • Planning - Python 3 - Python 2.x end of life - 2020
  • (NRAO Only) PEPs - Diversity goal

Developer Reports

Thursday Meeting
  • Sanjay Bhatnagar
  • Sandra Castro
  • Lindsey Davis
    • Many email discussions on ALMA lifecycle 5 issues which adversely impact the pipeline
    • Discussion of pipeline issues with Ryan Raba
    • Pipeline weblog issue discussions with PWG
    • Ongoing pipeline display code cleanup
    • PEP
  • Bjorn Emonts
  • Pam Ford
  • Enrique Garcia
  • Bob Garwood
    • More time spent looking into asdm2MS and bdflags2MS execution speeds (CAS-10665, CAS-10732). I think a lot of the reported difference is just differences in data size (i.e. the reporter didn't have a good sense of the expected difference). It's still too slow, and there are hints that it uses much more memory than one might guess, so in a limited-resource case the execution times will skyrocket. I'm going to shift away from this and move back to more concrete tickets for a while.
    • CAS-10693 - remove boost dependency. Started the ICT ticket process to get approval to modify the shared c++ code.
    • CAS-10644 - filler needs to detect duplicate rows. Scope now limited to duplicates in the DATA for WVR data and duplication in POINTING - the two known cases.
    • Some continuing GBT-related work due to the move of the IDL installation here and the decommissioning of gbtidl.cv.nrao.edu.
  • Kumar Golap
  • David Mehringer
  • George Moellenbrock
  • Dirk Petry
  • Martin Pokorny
  • Federico M Pouzols
    • E2E5 parallel pipeline tests, verification of some issues.
    • Scripts and plots for CASA tasks and tclean by size, specmode, gridder, etc. for 5.2 tests.
    • tclean memory use predictor CAS-10768. Produced data, plots and basic code; pushed back by 5.2 work.
  • Urvashi Rao
    • Multiple meetings and some VLA data reduction workshop work.
  • Darrell Schiebel
  • Ville Suoranta
  • Takahiro Tsutsumi
Friday NAOJ Meeting
  • Kanako Sugimoto
  • Wataru Kawasaki
  • Takeshi Nakazato
  • Renaud Miel

Minutes

Attendance: Ville, Andy, Akeem, Morgan, Jen, Bob, Darrell, Sandra, Federico, Kumar, Juergen, Martin, Lindsey, Ryan, Tak, Urvashi, Bjorn (minutes)

News:
  • Morgan: Please send slides for Casa Users Committee to Morgan by Nov 6.
  • Juergen: VLA Data Reduction Workshop this week in Socorro
  • Juergen: Does CASA have a booth at AAS, like SRDP and ngVLA? Answer: no.
Users:
  • CASA users discussion group has been meeting in India for over a year. Run by Ruta Kale (NCRA, Pune), with >25 participants. Following the CASA Newsletter request, the group provided feedback for CASA improvements. Juergen: such feedback should be forwarded to and discussed at the stakeholders meetings.
  • Bjorn: ~300 open JIRA tickets based on unsolved Helpdesk issues from users (30% of all helpdesk-related JIRA tickets since 2007). Many can likely be closed, so Bjorn will go over these.
Build, release:
  • Ville: Casacore won't build on XCode9 + OSX 10.12 due to a cmake bug. XCode9 + OS X High Sierra seems fine. Will test updating to Cmake 3.10 as soon as the new Macports port is available. For now, don’t upgrade to 10.13.
  • Ville: Testing new pull request build. This should simplify the merge process by pushing all of the merges through the same test pipeline (closer to what Federico suggested). For now the PR process should be as before: wait for the prior packaging builds to complete and then issue a PR. New merge expected to be built in ~2 weeks.
  • Question Juergen: When is support for OSX 10.11 being dropped? VLA pipeline may need CASA 5.1.2 (patch of 5.1.1). This should be a delta on OSX 10.11, not rely on a completely different OS version.
Verification testing:
  • Andy: HPC verification serial vs. parallel processing -> warning about masking issue. Lindsey: normal error if auto-boxing fails. Sandra: problem is that it only occurs in parallel. Lindsey: is this a structural issue (Urvashi should have a look), or subtle statistical differences (not dealt with until 5.3)? Urvashi and Dirk should have a look at this.
  • Andy: Consistent offset in reference position of only 3-4 decimals between serial and parallel. Kumar: that is on the scale of a pixel, which is disturbing.
  • Akeem reports on verification testing issue reported in the meeting agenda.
Validation testing:
  • Jen: HPC validation testing ongoing. Technical work 5.2 by Thanksgiving. Lindsey is worried that testing will be time-consuming and should start early, but Jen (and Remy previously) make the point that if the code keeps changing, there is no use in starting validation testing. CASA team agrees. Jen, Remy and Ryan will get together to discuss. Morgan suggests rolling tests (desperate “nooooo” fades in Socorro...)
  • Jen requests that the 12-15 open JIRA tickets on HPC processing be ranked by importance before heavy validation testing starts. Sandra adds that HPC-related bugs have been assigned.
HPC:
  • Federico reports on HPC efforts mentioned in meetings agenda.
  • Federico: slowdowns in pipeline parallel tasks are amplified on Lustre! Open issues to look into are ms-selection, concatenating images and tiling.
  • Urvashi, Sandra and Dirk will coordinate to address CAS-10853. Issue with applycal may also be related to CAS-10853.
  • Discussion on how identical serial and parallel products have to be. Jen: bad for users if serial and parallel give different results, because users won’t trust either method. Ryan: Glendenning recommends differences should be below 6 decimals, otherwise it is a bug. Kumar: are the differences absolute or relative? Relative differences are needed to separate bugs from subtle statistical differences. Offline discussion will be held to clarify this. Ryan: These questions need to be answered before starting validation testing.
  • uvmodel issue (CAS-10606/CAS-10685) will be addressed in 5.3 (not 5.2).
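Kumar's question about absolute versus relative differences can be made concrete with a small sketch. It is illustrative only and is not an agreed acceptance test. It assumes "below 6 decimals" means a tolerance of 1e-6, and that the relative difference is normalized by the larger of the two magnitudes. The pixel values and function names are hypothetical.

```python
def abs_diff(serial, parallel):
    """Absolute difference between one serial and one parallel value."""
    return abs(serial - parallel)


def rel_diff(serial, parallel):
    """Difference normalized by the larger magnitude (0.0 if both values are zero)."""
    scale = max(abs(serial), abs(parallel))
    if scale == 0:
        return 0.0
    return abs(serial - parallel) / scale


def compare(serial_values, parallel_values, tol=1e-6):
    """Return one (absolute_ok, relative_ok) pair per value."""
    return [
        (abs_diff(s, p) < tol, rel_diff(s, p) < tol)
        for s, p in zip(serial_values, parallel_values)
    ]


# One bright value and one faint value (made-up numbers).
serial = [1000.0, 0.002]
parallel = [1000.0005, 0.0020005]

for s, p, (abs_ok, rel_ok) in zip(serial, parallel, compare(serial, parallel)):
    print(f"serial={s} parallel={p} absolute_ok={abs_ok} relative_ok={rel_ok}")
```

The bright value fails the absolute test but passes the relative one, and the faint value does the opposite, so the choice of metric changes which results would be flagged as bugs.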
Development:
  • Urvashi reports on development issues in meetings agenda, and briefly addresses various “performance” vs “bug-related” issues.
  • Urvashi: Only current big bug to be addressed in 5.2 (CAS-10849) is that a “common” beam in tclean results in a different beam being formed for each chunk of data in parallel processing (although for ALMA differences are small). An interim solution for the pipeline is to run tclean with niter=0, and use the resulting beam as the only common beam in the final imaging.
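The interim two-pass approach described by Urvashi can be sketched as the construction of two sets of tclean keyword arguments. This is a sketch under stated assumptions, not the pipeline's implementation. `niter` and `restoringbeam` are tclean parameters; the helper names, the file names, the ".beamprobe" suffix and the beam values are hypothetical placeholders. How the single beam is read back from the first pass is left open here, since the minutes do not specify it.

```python
def beam_probe_kwargs(base):
    """Keyword arguments for a first pass with niter=0, used only to obtain a beam."""
    kwargs = dict(base)
    kwargs["niter"] = 0
    # Hypothetical suffix so the probe images do not collide with the final ones.
    kwargs["imagename"] = base["imagename"] + ".beamprobe"
    return kwargs


def final_imaging_kwargs(base, beam):
    """Keyword arguments for the final pass, forcing one explicit restoring beam.

    `beam` is (major, minor, position angle) given as quantity strings.
    """
    kwargs = dict(base)
    kwargs["restoringbeam"] = list(beam)
    return kwargs


# Placeholder settings, not taken from any real dataset.
base = {
    "vis": "example.ms",
    "imagename": "example_cube",
    "specmode": "cube",
    "niter": 1000,
}

probe = beam_probe_kwargs(base)
# Inside CASA one would run tclean(**probe) here and read the beam
# from the resulting image; that step is not shown.
measured_beam = ("0.50arcsec", "0.40arcsec", "30deg")  # placeholder, not a measurement
final = final_imaging_kwargs(base, measured_beam)

print(probe)
print(final)
```

Because the final pass receives one explicit beam, every chunk in a parallel run would restore with the same beam rather than computing its own.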
Pipeline:
  • Sandra reports on pipeline issues in meetings agenda.
  • Restore data problem deferred
  • No regression testing should be done on the 5.3 trunk at the moment.
AOB:
  • Morgan raises two points regarding NRAO only:
    • ICT meeting -> planning for Python 3 (2020), big job! Lindsey: “Oh lord…”
    • PEPs (performance planning) coming up. Support of diversity goals should be included. Kumar asks whether diversity goals are for all or for managers. Consensus is that it seems to be for “all”, but that it is not clear who is meant by “all”.

-- MorganGriffith - 2017-10-11

Topic revision: 2017-11-02, BjornEmonts