CASA Parallel Processing
Overview
This is a general guide for testing the early parallel-processing capabilities in CASA. As of this writing (23 August 2011), the tasks flagdata, applycal, and partition are all parallelized. In addition, two versions of parallel imaging are available at the task level: pcont for continuum imaging and pcube for spectral line imaging.
Feedback at this stage about what works, as well as what doesn't or could use improvement, will be very helpful. Please send comments, bug reports, and questions to ...
More information may also be found in Jeff Kern's presentation, posted [here].
Setting up for parallel processing
Before you can run tasks with parallelization, you must first set up the machine on which CASA will be running to use SSH keys for password-free login. See [the SSH section of the Gold Book] for instructions.
Parallel processing in CASA is set up to take advantage of both multiple-core machines (as most standard workstations are) and shared memory access (as is available in a cluster). However, the NRAO cluster in Socorro also has a very fast connection to the Lustre filesystem, which boosts I/O performance by around 2 orders of magnitude over that of a standard desktop SATA disk. Without that kind of fast storage, I/O-limited operations are unlikely to see much improvement from parallel processing, because the disk rather than the CPU remains the bottleneck.
Parallelized tasks
This list is current as of 23 August 2011.
partition
In order to perform parallel processing, the original Measurement Set must be subdivided into smaller chunks that can then be farmed out to multiple cores for processing. This new, subdivided MS (called a "multi-MS", or MMS) is created by a task called partition. In many ways, partition resembles split, in that it can also time-average or select only a subset of data. However, frequency averaging is not currently available.
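The idea of subdividing a data set and farming the pieces out to workers can be sketched in plain Python. This is only a conceptual illustration, not how partition or the MMS format is implemented: the function names (split_into_chunks, count_rows, process_in_parallel) and the list of scan numbers are hypothetical, and a thread pool stands in for the separate cores or cluster nodes that CASA would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def count_rows(chunk):
    """Hypothetical stand-in for per-chunk work such as flagging or calibration."""
    return len(chunk)


def process_in_parallel(items, n_workers):
    """Divide the items among n_workers and process every chunk concurrently."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        return list(pool.map(count_rows, chunks))


# Hypothetical example: ten scans divided among four workers.
scans = list(range(1, 11))
chunks = split_into_chunks(scans, 4)
results = process_in_parallel(scans, 4)
```

In a CASA session, the subdivision itself is done by the task, presumably with a call along the lines of partition(vis='mydata.ms', outputvis='mydata.mms'), where both file names are placeholders. Run inp partition to see which parameters your CASA version actually offers.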