CASA Parallel Processing
Overview
This is meant to be a general guide for testing out the early capabilities in CASA for parallel processing. As of the time of this writing (23 August 2011), the tasks flagdata, clearcal, applycal, and partition are all parallelized. In addition, parallel imaging is available via the pclean task, as well as through the pcont and pcube tools (for continuum and spectral line imaging, respectively).
Feedback at this stage about what works, as well as what doesn't or could use improvement, will be very helpful. Please send comments, bug reports, and questions to ...
More information may also be found in Jeff Kern's presentation, posted [here].
Setting up for parallel processing
Before you can run tasks with parallelization, you must first set up the machine on which CASA will be running to use SSH keys for password-free login. See [the SSH section of the Gold Book] for instructions.
Parallel processing in CASA is set up to take advantage both of multi-core machines (such as standard workstations) and of multi-node clusters. The NRAO cluster in Socorro also has the distinction of a very fast connection to the Lustre filesystem, which boosts I/O performance by roughly two orders of magnitude over a standard desktop SATA disk. On machines without such a filesystem, I/O-limited operations are therefore unlikely to see much improvement from parallel processing.
Parallelized tasks
The set of tasks described here is current as of 23 August 2011.
partition
In order to perform parallel processing, the original Measurement Set must be subdivided into smaller chunks that can then be farmed out to multiple cores for processing. This new, subdivided MS (called a "multi-MS", or MMS) is created by a task called partition. In many ways, partition resembles split, in that it can also time-average or select only a subset of data. However, frequency averaging is not currently available.
Here is the current set of input parameters for partition:
<pre>
# partition :: Experimental extension of split to produce multi-MSs
vis             = ''       # Name of input measurement set
outputvis       = ''       # Name of output measurement set
createmms       = True     # Should this create a multi-MS output
separationaxis  = 'scan'   # Axis to do parallelization across
numsubms        = 64       # The number of SubMSs to create
calmsselection  = 'none'   # Cal Data Selection ('none', 'auto', 'manual')
datacolumn      = 'data'   # Which data column(s) to split out
field           = ''       # Select field using ID(s) or name(s)
spw             = ''       # Select spectral window/channels
antenna         = ''       # Select data based on antenna/baseline
timebin         = '0s'     # Bin width for time averaging
timerange       = ''       # Select data by time range
scan            = ''       # Select data by scan numbers
scanintent      = ''       # Select data by scan intent
array           = ''       # Select (sub)array(s) by array ID number
uvrange         = ''       # Select data by baseline length
async           = False    # If true the taskname must be started using partition(...)
</pre>
The parameters which are specific to parallel processing are 'createmms', 'separationaxis', 'numsubms', and 'calmsselection'.
It is currently recommended that 'separationaxis' be set to 'default', which creates sub-MSs optimized for parallel processing by dividing the data along both scan and spectral window boundaries. The other options, 'spw' and 'scan', force separation along only one of these axes.
The optimal number of sub-MSs to create will depend on the processing environment; namely, the number of available cores. A reasonable rule of thumb is to create twice as many sub-MSs as there are available cores.
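As an illustration, a minimal partition call following that rule of thumb might look like the sketch below; the filenames are hypothetical, and the numsubms value assumes an 8-core machine.

<pre>
# A minimal sketch (hypothetical filenames): create a multi-MS with
# twice as many sub-MSs as available cores (here, 8 cores -> 16 sub-MSs).
partition(vis='mydata.ms', outputvis='mydata.mms',
          createmms=True, separationaxis='default', numsubms=16,
          datacolumn='data')
</pre>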
Finally, there is the option to split out data for calibration sources into one "calibration" MS. If 'calmsselection' is set to 'manual', several more options are available:
<pre>
calmsselection  = 'manual'  # Cal Data Selection ('none', 'auto', 'manual')
calmsname       = ''        # Name of output measurement set
calfield        = ''        # Field Selection for calibration ms
calscan         = ''        # Select data by scan numbers
calintent       = ''        # Select data by scan intent
</pre>
Since the standard calibration tasks are not currently parallelized, creating a dedicated calibration MS will likely result in a speed improvement for this part of processing.
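For example, a partition call that also splits the calibrator data into its own MS might look like the following sketch; all filenames and field selections here are hypothetical.

<pre>
# A hedged sketch (hypothetical filenames and field selection): write the
# target data to a multi-MS and the calibrator fields to a separate MS.
partition(vis='mydata.ms', outputvis='mydata.mms',
          createmms=True, separationaxis='default', numsubms=16,
          calmsselection='manual', calmsname='mydata_cal.ms',
          calfield='0,1')
</pre>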
flagdata, clearcal, and applycal
flagdata, clearcal, and applycal are parallelized "under the hood": before running, each task checks whether it is operating on an MMS and, if so, acts appropriately to take advantage of the available parallel resources. No special parameter inputs are necessary to take advantage of this behavior.
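In other words, these tasks are called on an MMS exactly as they would be on a normal MS. A hedged sketch follows; the filenames, calibration table, and selections are hypothetical, and the exact parameter sets may differ between CASA versions.

<pre>
# Called on a multi-MS, these tasks distribute the work across the
# sub-MSs automatically; the calls themselves are unchanged.
flagdata(vis='mydata.mms', antenna='ea05')           # flag one antenna
clearcal(vis='mydata.mms')                           # initialize cal columns
applycal(vis='mydata.mms', gaintable=['mydata.G0'])  # apply a gain table
</pre>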
Parallelized Clean (task pclean)
<pre>
# pclean :: Invert and deconvolve images with parallel engines
vis          = ''            # Name of input visibility file
imagename    = ''            # Pre-name of output images
imsize       = [256, 256]    # Image size in pixels (nx,ny), symmetric for single value
cell         = ['1.0arcsec'] # The image cell size in arcseconds
phasecenter  = ''            # Image center: direction or field index
stokes       = 'I'           # Stokes params to image (eg I,IV,IQ,IQUV)
mask         = []            # mask image
field        = ''            # Field Name or id
spw          = ''            # Spectral windows e.g. '0~3', '' is all
ftmachine    = 'ft'          # Fourier Transform Engine ('ft', 'sd', 'mosaic' or 'wproject')
alg          = 'hogbom'      # Deconvolution algorithm ('clark', 'hogbom', 'multiscale')
majorcycles  = 1             # Number of major cycles
niter        = 500           # Maximum number of iterations
gain         = 0.1           # Gain to use in deconvolution
threshold    = '0.0mJy'      # Flux level to stop cleaning, must include units: '1.0mJy'
weighting    = 'natural'     # Type of weighting
mode         = 'mfs'         # Clean mode ('continuum', 'cube')
interactive  = False         # Interactive clean
overwrite    = True          # Overwrite an existing model image
uvtaper      = False         # Apply additional uv tapering of visibilities
selectdata   = True          # Other data selection parameters
timerange    = ''            # Range of time to select from data
uvrange      = ''            # Select data within uvrange
antenna      = ''            # Select data based on antenna/baseline
scan         = ''            # Scan number range
observation  = ''            # Observation ID range
pbcorr       = False         # Correct for the primary beam post deconvolution
clusterdef   = ''            # File that contains cluster definition
async        = False         # If true the taskname must be started using pclean(...)
</pre>
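A representative pclean call might look like the following sketch; the filenames and parameter values here are hypothetical, chosen only to illustrate the interface.

<pre>
# A hedged sketch (hypothetical filenames and values): a simple
# multi-frequency-synthesis clean of one field of a multi-MS.
pclean(vis='mydata.mms', imagename='myimage',
       imsize=[512, 512], cell=['2.0arcsec'],
       field='0', stokes='I', ftmachine='ft', alg='clark',
       majorcycles=2, niter=1000, threshold='1.0mJy',
       weighting='natural', mode='mfs', interactive=False)
</pre>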
Parallel cleaning (toolkit)
Some parallel imaging functionality is currently available only at the tool level, as described below. Ultimately, the plan is to make these tools into standard CASA tasks.
pcont
pcont is the parallelized continuum imaging tool. In order to access this functionality, one must first import two items into CASA:
<pre>
# In CASA
from parallel.pimager import pimager
from parallel.parallel_task_helper import ParallelTaskHelper
</pre>
Then, to initialize an imaging run:
<pre>
imgObj = pimager()
</pre>
This should launch the parallel engines; output will be printed to the terminal window to confirm that a set of engines has been started. Once this completes, the pcont tool may be called with the following parameters (here is a representative set):
<pre>
imgObj.pcont(msname='/full/path/to/MS',
             imagename='/full/path/to/output/image',
             imsize=[1280, 1280],
             pixsize=['6arcsec', '6arcsec'],
             phasecenter='J2000 13h30m08 47d10m25',
             field='2',
             spw='0~7:1~62',
             stokes='I',
             ftmachine='ft',
             wprojplanes=128,
             maskimage='/full/path/to/mask',
             majorcycles=3,
             niter=100,
             threshold='0.0mJy',
             alg='clark',
             weight='natural',
             robust=0.0,
             contclean=False,
             visinmem=False)
</pre>
Although this does not include the full suite of options available in clean, it covers a basic set of commonly used parameters.