Thursday Morning Meeting 14 September 2017
- DIAL-IN NUMBERS & PASSCODES:
- IP: 22.214.171.124##8110
- Phone: (434) 817-6524
News / Meetings / Visitors
- HPC/pipeline testing - meeting today right after this meeting
- 5.1 status ?
- In the future, we need more coordination with groups outside CASA so that our release dates do not coincide with the time when all our main customers are beginning operations and doing high-pressure last-minute testing (and finding bugs). We had planned for a Sep 1 release, with VLASS starting to observe 10 days later and the ALMA pipeline doing concentrated tests all through September to prepare for operations from Oct 1. (What is the ideal situation here?)
- For 5.2, we don't yet have clear timelines for when the pipeline and PLWG will turn parallelization on, or for which parameters they expect to change. We cannot expect any clarity on this until mid-to-late October (Morgan/Lindsey, is this accurate?).
- 2018 POP Goal: "develop an improved testing system with improved reports for CASA"
- A possible start to answer this could be introducing a source code analysis tool in the build/test process (e.g. for python something like pylint)
- Running pylint on the code today produces:
- Refactor for a good practice metric violation (~2600)
task_makemask:R: 67, 0: Too many branches (195/12) (too-many-branches)
tget:R:117, 7: Unnecessary "else" after "return" (no-else-return)
recipes/tec_maps:R:743, 0: Too many local variables (33/15) (too-many-locals)
tests/test_visstat:R:325, 8: Too many nested blocks (8/5) (too-many-nested-blocks)
- Convention for coding standard violation (~10000)
start_casa:C: 10, 0: Import "from IPython import start_ipython" should be placed at the top of the module (wrong-import-position)
viewertool:C:470,15: Using type() instead of isinstance() for a typecheck. (unidiomatic-typecheck)
usecases/test_task_exportasdm:C:156, 0: Unnecessary parens after 'if' keyword (superfluous-parens)
tests/test_split:C:2082, 0: Exactly one space required after comma
- Warning for stylistic problems, or minor programming issues
tests/test_po_linpolint:W: 69, 0: Unused import casac (unused-import)
vishead_util:W: 9,53: Unused argument 'hdref' (unused-argument)
usecases/test_task_exportasdm:W:176, 4: No exception type(s) specified (bare-except)
task_smoothcal:W: 7, 0: Found indentation with tabs instead of spaces (mixed-indentation)
task_simalma:W:879,24: Redefining built-in 'dir' (redefined-builtin)
- Error for important programming issues (i.e. most probably a bug) (~8000; not completely accurate, because the count includes errors from external dependencies, which pylint can be configured to ignore.)
viewertool:E:197,12: Raising NoneType while only classes or instances are allowed (raising-bad-type)
tests/test_flagcmd:E: 8, 0: No name 'default' in module '__main__' (no-name-in-module)
- Fatal for errors which prevented further processing (0)
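As a possible starting point for the "improved reports" part of the goal, the per-category counts above could be produced automatically from the pylint output. The following is a minimal sketch, assuming the line format quoted in these notes (module:category:line,col: message (symbol)); the function name `tally` and the `SAMPLE` list are illustrative only and are not part of the CASA build system.

```python
import re
from collections import Counter

# Matches lines in the format quoted in the notes, e.g.
#   tget:R:117, 7: Unnecessary "else" after "return" (no-else-return)
LINE_RE = re.compile(
    r"^(?P<module>[^:]+):(?P<category>[RCWEF]):"
    r"\s*(?P<line>\d+),\s*(?P<col>\d+):\s*(?P<message>.*)$"
)
# The trailing "(symbol)" is optional; some quoted lines do not carry one.
SYMBOL_RE = re.compile(r"\((?P<symbol>[a-z0-9-]+)\)\s*$")

CATEGORY_NAMES = {
    "R": "refactor",
    "C": "convention",
    "W": "warning",
    "E": "error",
    "F": "fatal",
}


def tally(lines):
    """Count messages per category and per symbol; skip unparseable lines."""
    by_category = Counter()
    by_symbol = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue
        by_category[CATEGORY_NAMES[match.group("category")]] += 1
        symbol = SYMBOL_RE.search(match.group("message"))
        if symbol is not None:
            by_symbol[symbol.group("symbol")] += 1
    return by_category, by_symbol


# A few of the lines quoted in the notes, used here only as sample input.
SAMPLE = [
    "task_makemask:R: 67, 0: Too many branches (195/12) (too-many-branches)",
    'tget:R:117, 7: Unnecessary "else" after "return" (no-else-return)',
    "viewertool:C:470,15: Using type() instead of isinstance() for a typecheck. (unidiomatic-typecheck)",
    "tests/test_split:C:2082, 0: Exactly one space required after comma",
    "task_simalma:W:879,24: Redefining built-in 'dir' (redefined-builtin)",
    "viewertool:E:197,12: Raising NoneType while only classes or instances are allowed (raising-bad-type)",
]

if __name__ == "__main__":
    categories, symbols = tally(SAMPLE)
    for name, count in sorted(categories.items()):
        print(name, count)
```

Tracking these counts per build would show whether each category is shrinking or growing over time, which may be more useful than the absolute numbers.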
- Current validation tally
- 117 Under Validation, 102 Ready to Validate.
- 25 tickets went RtV this week so far. 8 tickets went through Validation to Resolved this week so far.
- Sept 1 - 7: 3 tickets assigned for validation. 4 tickets resolved.
- JDM out of office next Thursday (9/21), but will post report to the agenda
- Current Testing Efforts for 5.1 deliverables
- final packages built (5.1.0-71). still waiting on Claire's confirmation that this version is okay to release for VLASS.
- the ALMA pipeline group is (understandably) frustrated that they were not also allowed to hold up the 5.1 release for their own fixes/late VLASS testing of the release packages.
- need to include a known issue in 5.1 casadocs about CAS-10711 (multiscale tclean has a different behavior with regard to the model column depending on whether it stops for niter or threshold).
- Current Testing Efforts for 5.2 deliverables
- coordination for the next level of science validation testing between PLWG/HPC group/Bjorn; time to start discussing once 5.1 is out
- Current Testing Efforts for 5.3 deliverables
- statwt2 (CAS-10530)
- new tester has been identified with time to work on this in September
- systematic tests of mstransform
- further parallelization testing (non-ALMA pipeline specific)
- polarization calibration
- Summary of 5.1 HPC pipeline testing
- We have noticed that pipeline code (other than CASA tasks) runs much slower in parallel than in serial. Two possible places to look at are: pipeline.hif.heuristics.imageparams_base and pipeline.infrastructure.displays.sky.
- One suspect is the imager tool, when running on a Multi-MS.
- We already know that the image analysis tool is also slower on concatenated images (CAS-10428).
- This table shows the time taken by the pipeline in serial and in parallel. We separate the time between "CASA tasks" and "Other". "Other" is the time taken in CASA tools and in pipeline Python code. We would expect the tools and other Python code to take the same time when running on an MS or MMS.
- Red cells show the worst cases, when the tools or Python code are much slower on MMS (See T008 and T010).
- The complete red rows show that the total time taken in parallel is slower than in serial. Two things could be the cause here: tclean is not as fast as expected (see the time taken by each task in T025) and the imager tools are also slower in parallel.
- The blue cells (T015) show that the time taken by "Other" is better on MMS than MS. We would expect the time to be approximately the same. This could be a side-effect of Lustre being at > 90%.
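The colour coding described above can be expressed as a small rule set. The sketch below is only an illustration of that logic: "Other" is taken as total time minus time in CASA tasks, and a run is flagged when "Other" differs strongly between MS and MMS or when the parallel total exceeds the serial total. The 1.5x threshold, the names `Run` and `classify`, and all numbers are hypothetical placeholders, not values from the table.

```python
from collections import namedtuple

# Wall-clock seconds for one pipeline run: the total and the part spent in CASA tasks.
Run = namedtuple("Run", ["total_s", "tasks_s"])


def other_time(run):
    """Time spent outside CASA tasks (CASA tools plus pipeline Python code)."""
    return run.total_s - run.tasks_s


def classify(serial, parallel, factor=1.5):
    """Return flags mirroring the colour coding described in the notes.

    `factor` is an assumed threshold for "much slower" or "much faster".
    """
    flags = []
    serial_other = other_time(serial)
    parallel_other = other_time(parallel)
    if parallel_other > factor * serial_other:
        flags.append("other-slower-on-mms")      # red cell
    elif parallel_other * factor < serial_other:
        flags.append("other-faster-on-mms")      # blue cell
    if parallel.total_s > serial.total_s:
        flags.append("parallel-total-slower")    # red row
    return flags


# Placeholder numbers for illustration only; these are NOT measured values.
EXAMPLES = {
    "example_a": (Run(1000.0, 800.0), Run(900.0, 400.0)),
    "example_b": (Run(1000.0, 800.0), Run(1200.0, 700.0)),
    "example_c": (Run(1000.0, 700.0), Run(600.0, 500.0)),
    "example_d": (Run(1000.0, 800.0), Run(500.0, 300.0)),
}

if __name__ == "__main__":
    for name, (serial, parallel) in sorted(EXAMPLES.items()):
        print(name, classify(serial, parallel))
```

Keeping "Other" as a derived quantity makes it easy to rerun the comparison once per-task timings are updated.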
- Potential CASA 5.1.1 : Do we do this or no ?
- Solar TCals : A minor addition by GM to gencal will solve this. VLA would like it 'soon' and may be expecting a 5.1.1
- Anything else ?
- CAS-10711 : Need to evaluate whether a log message is to be added or not. Fix only for 5.3.
- CASA 5.2 (fixes for ALMA pipeline parallelization) :
- Path to be taken by HPC group for further testing is being discussed (after this meeting today).
- New bugs :
- CAS-10672 : CountedPtr error : Open...
- CAS-10538 : ArrayBase operator() error : Resolved as non-reproducible
- Anything else ?
- Others for CASA :
- CAS-10697 : uvcontsub with combine='spw' : Bjorn, why is this 5.2 and not 5.3 ?
- CAS-10421 : improve logging for autoboxing : Tak, why is this 5.2 and not 5.3 ?
- CASA 5.3
- Topics already reaching milestones :
- Plotants rewrite + old ticket cleanup : Pam ?
- Filler refactor : Bob ?
- Image-analysis history writing tickets : Dave ?
- Anything else ?
- Parallelization/performance :
- tclean parallel runs + 'savemodel' (+MMS) : Problems and no clear solution yet. ( to be done for 5.2 instead ? )
- Initiated conversation with Todd about findcontinuum migration to C++.
- uvcontsub discussion : Need to work through
- Some discussion about tclean exit criterion issues (again) - CAS-10692, and another multi-scale stopping issue and modelcolumn writes CAS-10711 : To be addressed with iteration control cleanup/N-sigma.
- Found a statwt tester and iterating to try to get the feedback we need in time for development to be done by December
- Anything else ?
- Planning document will appear here : https://open-confluence.nrao.edu/display/CASA/Release+Planning
- Last joint pipeline working group / developer telecon before start of Cycle 5
- Online / pipeline software in reasonable shape, Operations wrapper scripts still under test
- Issue with Xvfb version which could affect wrapper scripts
- Cycle 5 issues
- init.py insert path problem, been there for some time but masked by imports, resolved ?
- hif_checkproductsize parameter tuning bug, resolved
- too long plot file name results in missing plots and web log errors, under investigation
- missing representative source plot, under investigation, too long file name problem ?
- conjugate beams issue in VLASS pipeline, resolved
- VLASS observations have started
- Sanjay Bhatnagar
- Sandra Castro
- Lindsey Davis
- Drafted pipeline update article for CASA newsletter
- Participated in final joint pipeline working group / developer telecon before the start of Cycle 5
- Updates to Cycle 6 pipeline framework development planning page
- Email discussion with Remy on pipeline exportdata / restoredata issues and pipeline product archive ingest
- CASA coordination meeting discussions / emails
- Bjorn Emonts
- Pam Ford
- plotants rewrite (CAS-10598 parent ticket) - several bugs went away, several requested features added
- Enrique Garcia
- Bob Garwood
- Kumar Golap
- Jeff Kern
- David Mehringer
- George Moellenbrock
- CASA newsletter articles on VLA troposphere delay error correction and VI2
- Began some python studies on calibration solve SNR calculation (CAS-8589)
- Dumped coffee on my laptop...not looking good...
- Dirk Petry
- Martin Pokorny
- async VI2 work (CAS-10699, CAS-10671)
- Federico M Pouzols
- Finalizing and summarizing test runs of the pipeline in parallel.
- Investigating slowdown in pipeline execution in parallel mode.
- Trying to catch up with cvel and flagdata tickets.
- Urvashi Rao
- Mostly emails/discussions/writing.
- Darrell Schiebel
- Ville Suoranta
- Takahiro Tsutsumi
- Perley-Butler 2017: Updates to the underlying C++ code, making it aware of the new flux models, were checked in. The table containing the new flux models was also checked in to the data repository as PerleyButler2017Coeffs. Remaining work: expose the standard to the task interface.
Friday NAOJ Meeting
- Kanako Sugimoto
- CASA newsletter articles: Single Dish (CAS-10495), inputs for PL article
- Commented on possible impact of correlator upgrade plan on single dish
- We are going to have a team meeting to review CASA 5.1 release cycle.
- Started looking into existing tickets and thinking of CASA 5.3 plan.
- Wataru Kawasaki
- Masaya Kuniyoshi
- Takeshi Nakazato
- CAS-10683: migration of sdimaging to new imager
- successfully created images identical to the older images
- fixed CAS-10694
- reported init.py/prelude.py sys.path issue (issued as CAS-10709)
- Renaud Miel
- NAOJ new CASA development host setup: almost done
- DARED installation support and test