Summary of initial meeting for discussion of an advanced tools developer.
Attending: Crystal Brogan, Todd Hunter, Jeff Kern, Mark Lacy, Adam Leroy, Jeff Mangum, Dave Mehringer, Darrell Scheibel
Agenda:
1) Summary of current image-related tools in CASA and what's involved in developing them.
2) Discussion of Possible Relationships to CASA
3) Brief discussion of topics
Next Meeting (proposed): 21 March, when Jeff Kern should be in town. Goal will be to resolve among the three paths that Jeff outlined (see below) and see if there is a consensus skill list.
Open Items (see below): I will try to fill in the open informational items below leading up to the next meeting.
Current CASA Contents
CASA's viewer can browse cubes or rastered measurement sets. It has pan, zoom, WCS, and movie functionality for browsing. The viewer can make overlays, channel maps, and produce hard copies to either graphics or postscript files. Additional viewer tools include a spectral profile browser, shape and region management, and on-the-fly region statistics.
The viewer is wrapped by two tasks: "imview" and "msview." Thanks to the EVLA emphasis on visualizing autoflagging, msview is a high priority for the next development cycle.
In addition to the viewer, there are a number of analysis tasks. These give the ability to perform basic mathematical operations combining several images (immath using the LEL syntax), carry out Gaussian fits, smooth, make moments, and run statistics. There are also capabilities for regridding and spectral smoothing.
Current development (Dave) in the tools area has focused on statistics, collapsing images, 2-d fitting, and the underlying fitters. The 1-d line fitter was overhauled last summer and, though it currently lacks a graphical interface, it now works and accepts estimates.
Development Notes
The viewer is implemented in Qt, which is portable and platform-independent and can render to WebKit. Axis labeling is done by PGPLOT, though only a small part of the functionality of that package is used. This introduces a Fortran dependency and limits labeling options but gives scalable fonts and a nice interface with PostScript and coordinates. There is currently a stronger-than-necessary split between GUI-independent and GUI-related code inside the viewer. Currently Darrell is the viewer expert (emphasis on singular).
The tasks are in a combination of Python and C++. In theory, the Python should be a very thin layer. In practice most tasks have quite a bit of Python processing code. This is problematic because Python in tasks is not readily reusable, though one can work around this via shared Python libraries/modules. Because of fast development during the move to C++, the code underlying the image analysis tasks/tools lives in a single 8000+ line C++ class. This needs significant cleanup and breaking apart into more manageable, focused classes.
The viewer and many of the image analysis tasks interact with regions. These were noted in the meeting to be heterogeneously defined throughout CASA (e.g., CLEAN and viewer use different definitions; there are apparently ~5 or 6 different definitions around). There is currently a proposal from Juergen Ott and Miriam Krauss to clean these up.
Mark notes that in the long run the rapid growth of data sets means that it becomes desirable to do as much server-side reduction as possible.
Currently there are a number of cube visualization tools: sdvision and gridview in IDL, GAIA in Starlink, ds9 (see DSG wiki). Licensing and an uncertain development future suggest that most of these shouldn't be depended on for functionality. All of these programs work on FITS files, as does the current CASA viewer --- this is a requirement for the viewer.
Inside CASA an important piece of information is the three-tier structure: the C++ algorithms are written and often accessed only by the developer, the python tasks wrap these and are written by both developers and (potentially) engaged scientists, and the script level is written by scientists with minimal expectation of generality or robustness.
Jeff and Crystal referred to ALMA specifications requiring a fully capable viewer that needs to operate on CASA data. These weren't trivial to recover (see below) but we'll describe them here.
Relation to CASA
Jeff Kern outlined three possible paths:
1) A CASA-integrated developer line managed by Jeff Kern and working closely with Dave and Darrell. The developer would initially work at a deep level focusing on enhanced capabilities driven by ALMA-related requirements (modulo any need to build up expertise).
2) A developer outside of CASA but in close touch with the CASA team. Jeff cited the ESO effort on using the MUSE integral field spectrograph as an example. In this approach, the CASA team offers support, for example, by re-prioritizing existing development items that were slated to be done anyway. This person would be managed independently with vetting by Darrell. The issue raised here was that potential misalignment among priorities could lead to no internal CASA support for this person.
3) Take the viewer and CASA software "as is," and have no formal interface with the CASA team beyond the usual feedback channels. This offers by far the most flexibility to develop NAASC priorities but little support integrating these capabilities into CASA.
Jeff argued that the most efficient use of resources is path #1, bringing someone in to work closely with Darrell and David on spectral line capabilities, synching that person up with the development cycle.
Mark, Jeff M, and Jeff K discussed the future of CASA --- is it here in a decade? two? Jeff K thinks that CASA is here for the intermediate term with unique, field-leading imaging and calibration capabilities. He notes that the analysis side is lagging, hence this meeting, but that he and the CASA team would like to bring this up to speed, particularly on the spectral line front.
Jeff Mangum expressed concerns about forcing users through CASA, particularly about any lack of FITS support, and asked about modularity of tools. Can the viewer and other tools be separated from CASA? Is there a reason that you would want to do this from a resources, accessibility, or ease of installation perspective?
A Typical Developer Timeline
From Jeff Kern. Reasonable expectations from a new developer are "zero for first 6 months as they learn", "a little production in 1 or 2 areas during months 6-9", "productive in months 9-12", and then a full team member after about a year of ramp-up.
Followup and Feedback Notes
Areas noted for improvement (mostly in passing):
- Beam units during convolution
- Spectral browser in viewer
- Histogram based statistics (modes, outlier rejection)
- Region homogenization
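As an illustration of what "histogram based statistics" could mean in practice, here is a minimal Python sketch of a histogram mode estimate and iterative outlier rejection on a flat list of pixel values. This is not CASA code: the function names (histogram_mode, sigma_clip) and parameters (nbins, nsigma, max_iter) are hypothetical, and median-centered sigma clipping is just one common choice of rejection scheme, assumed here for the example.

```python
import statistics


def histogram_mode(values, nbins=10):
    """Return the center of the most populated bin of an equal-width histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all values identical; the mode is that value
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), nbins - 1)
        counts[idx] += 1
    best = counts.index(max(counts))  # first bin with the highest count
    return lo + (best + 0.5) * width


def sigma_clip(values, nsigma=3.0, max_iter=5):
    """Iteratively drop values more than nsigma standard deviations from the median."""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        center = statistics.median(kept)
        spread = statistics.pstdev(kept)
        survivors = [v for v in kept if abs(v - center) <= nsigma * spread]
        if len(survivors) == len(kept):
            break  # nothing rejected in this pass; converged
        kept = survivors
    return kept


# Hypothetical pixel values with one strong outlier.
pixels = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 50.0]
clipped = sigma_clip(pixels, nsigma=2.0)
```

Statistics such as the mean or rms would then be computed on the clipped list rather than on the raw pixels, so a single bad pixel does not dominate the result.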
Questions
- Is there a VO standard for regions?
- What is the specified FITS support for viewer? for CASA importfits?
- Are the dollars for this developer "tagged" in some important ways?
- What are the ALMA requirements for the viewer?
- Where do we need most effort:
- at the toolkit level (what scientists can't do) or higher?
- at the interface with the user?
- Is Qt dying? Is this a concern?
Answers
- There is not an obvious specification for regions from the VO web pages, though one could build a VO table to describe a region. DS9 regions appear to be the standard based on widespread usage.
- [Reserved Jeff K for blurb on CASA FITS support]
- From John H: The dollars for the developer are not importantly constrained by any operations/construction concern. The CASA contribution from NRAO is an in-kind contribution matched to ESO's contribution of the archive infrastructure. As such there is considerable freedom in how the effort of the developer is allocated. Mark L expressed some skepticism that the picture was quite this simple. JH adds: There is no issue with construction - CASA has never been paid by construction. So that is a red herring. However, the NAASC supports 3.5 FTE of CASA development (including Darrell, but not Dave) via the "Offsite Support" budget. So far we have been able to devote this entirely to whatever the CASA project decides, but in the future this might come under purview of the international partnership. The document written by CIPT about ALMA software support into operations seems to preserve this support through 2015. In addition to this, under the Guidance Budget, the NAASC has 3 FTE of software developer/application developer support. This includes 1 FTE for Dave M., 1 FTE for Kelly, and 1 FTE for "SciPgrm2", which I believe is the position being addressed above. There are no strings on these positions, as long as they fulfill the NAASC mission. I should note that the Guidance Budget still needs to be redone considering extra power needs in Chile, but suspect we will NOT sacrifice any of these FTEs for that. It should be noted that there are additional opportunities for even more s/w support outside these two budgets:
- If the CR gets lifted this year, we should have enough carryover from previous years' underspends to fund 2 FTE for 2 years to help develop improved s/w. I forget the skillset - JeffK and BG wrote the paragraphs for this.
- NRAO is currently soliciting ideas from the community for development "studies", and Fred wants s/w studies to be included. This is the topic of the Development workshop at the NAASC March 21-22 that Al and Todd have been organizing. Anticipated funding level for successful projects is ~$50k/yr for ~3yrs. NRAO/CASA/NAASC are eligible to submit or collaborate with projects for these funds.
- The above program is an attempt to jump-start the larger ALMA Development funding line. NA share reaches ~$6M/yr in 2015. Again, Fred wants some of this spent on s/w. If CASA wants to be supported above the baseline "maintenance" level, it should think of some cool things to propose to do
- NRAO recently solicited internal ideas for development projects, and software projects can be part of this. See email from S. Marks to staff, signed by Lory Wingate, which links to the following site: https://staff.nrao.edu/wiki/bin/view/CV/WebHome
[The ALMA requirements for the viewer are not obvious from the software systems specification]
[Darrell and Jeff K on future of Qt --- appears to be a nonissue from news this week]
Examples of Anticipated Image Plane Analysis