ALMA Cycle 11 Imaging Pipeline Reprocessing
Cycle Compatibility and New Tool
In Cycle 9, a new nomenclature was adopted for measurement sets within the ALMA pipeline: uid*targets.ms for the continuum + line (non-continuum-subtracted) target-only data, and uid*targets_line.ms to reference the continuum subtracted data. Data restored with a scriptForPI.py from prior to Cycle 9 will have an incompatible uid*target.ms format, and must be modified to uid*targets.ms to work with the scripts in this guide.
Note as well that in Cycle 11, the CASA task 'uvcontsub' was modified to no longer use uvcont tables, and instead only takes a cont.dat file. The hif_uvcontfit task has been removed from the ALMA Cycle 11 Pipeline.
Additionally, to improve the ease of imaging pipeline reprocessing, a new tool has been developed to streamline the methods detailed below. The documentation for this tool can be found at Imaging Pipeline Reprocessing Tool.
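For data restored with a pre-Cycle 9 scriptForPI.py, the renaming described above can be scripted. The following is a minimal sketch, not part of the delivered scripts: the function names and the file name pattern (an underscore before "target.ms", with an optional ".flagversions" suffix) are assumptions for illustration, and whether a plain rename is sufficient should be checked against your own data.

```python
import os
import re

# Matches old-style names such as uid___A002_Xabc_X123_target.ms, optionally
# followed by ".flagversions". The pattern is an assumption for this sketch.
_OLD_STYLE = re.compile(r"^(?P<stem>uid.*)_target\.ms(?P<suffix>(\.flagversions)?)$")


def new_style_name(name):
    """Return the Cycle 9+ name for an old-style name, or None if it does not match."""
    match = _OLD_STYLE.match(name)
    if match is None:
        return None
    return f"{match.group('stem')}_targets.ms{match.group('suffix')}"


def rename_old_style(directory=".", dry_run=True):
    """List (and, if dry_run is False, perform) the renames needed in directory."""
    planned = []
    for name in sorted(os.listdir(directory)):
        new_name = new_style_name(name)
        if new_name is None:
            continue
        planned.append((name, new_name))
        if not dry_run:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
    return planned
```

Calling rename_old_style() with the default dry_run=True only reports the planned renames, so the list can be reviewed before anything on disk is changed.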
About This Guide
Most recently updated for CASA Version 6.6.1 using Python 3.8
This guide describes some examples for creating and perfecting the interferometric imaging products using the ALMA Cycle 11 Pipeline, for pipeline & manually calibrated data. It does NOT work for a concatenated measurement set, often named calibrated_final.ms.
If your data were manually imaged by ALMA, you should instead consult the scriptForImaging.py delivered with your data.
The section Restore Calibration and Prepare for Re-imaging describes the first steps to take. After that, the individual sections are self-contained (and they typically assume the "Restore" has been performed). The guide also illustrates how to completely re-run the pipeline from beginning to end in order to reproduce the pipeline run done at your ARC.
Additional documentation on the Cycle 11 pipeline can be found in the Pipeline User's Guide, which can also be found at the ALMA Science Portal. The User's Guide describes how to obtain the ALMA Pipeline and how to use it to calibrate and image ALMA interferometric (IF) and single-dish (SD) data, and it includes a description of the Pipeline WebLog.
Note that the scripts described in this guide have only been tested in Linux.
Getting and Starting CASA
If you do not already have CASA installed on your machine, you will have to download and install it.
Download and installation instructions are available here:
http://casa.nrao.edu/casa_obtaining.shtml
CASA 6.6.1.17 is required to reprocess ALMA Cycle 11 data using the scripts in this guide.
NOTE: To use pipeline tasks, you must start CASA with
casa --pipeline
Restore Calibration and Prepare for Re-imaging
STEP 1: Follow instructions in your QA2 report for restoring pipeline calibrated data using the scriptForPI.py. In general, scriptForPI.py is only compatible with CASA versions similar to the one used for its creation. See the Table at https://almascience.org/processing/science-pipeline for details. NOTE: the SPACESAVING parameter cannot be larger than 1, and for pipeline calibrated and imaged data, scriptForPI.py does not automatically split science spectral windows.
Once completed, the following files and directories will be present, with specific things about pipeline re-imaging noted:
- calibrated/
- This directory contains one or more files called <uid_name>.ms.split.cal (one for each execution in the MOUS). These files have been split to contain the calibrated pipeline uv-data in the DATA column and only the science spectral window ids (spws). Importantly, the spws have been re-indexed to start at zero, i.e. they will not match the spws listed in the pipeline weblog or in other pipeline-produced products such as the science target flag template files (*.flagtargetstemplate.txt) or the continuum ranges (cont.dat). Though this type of file has been the starting point for manual ALMA imaging, ms.split.cal files CANNOT BE DIRECTLY USED IN THE EXAMPLES GIVEN IN THIS GUIDE.
- Provided that the restore is done with a SPACESAVING=1, within the calibrated directory there is a "working" directory which does contain the <uid_name>.ms (i.e. no split has been run on them) that is of the form expected as the starting point of the ALMA imaging pipeline. This directory also contains the *.flagtargetstemplate.txt for each execution which can be used to do science target specific flagging. This is the best location to do ALMA imaging pipeline reprocessing. (Older data sets may have a <uid_name>.calibration directory rather than "working".) Place your edited copy of the sample script below into the "working" directory to run it.
- calibration/
- This directory contains a continuum range file named "cont.dat", with the frequency ranges identified by the pipeline as being likely to only contain continuum emission. If the cont.dat is present in the "calibrated/working" directory where pipeline imaging tasks are run, it will be used.
- This directory also contains the *.flagtargetstemplate.txt for each execution which can be used to do science target specific flagging.
- product/
- Contains the original image products.
- qa/
- Contains the original weblog and the QA2 and QA0 reports. The QA reports contain summaries of the scheduling block (SB), and calibration and imaging results.
- raw/
- Contains the raw asdm(s).
- script/
- Contains the file scriptForPI.py (named member.<uid_name>.scriptForPI.py) which internally runs member.<uid_name>.hifa_calimage.casa_piperestorescript.py and other necessary tasks to restore the data.
- Also contains member.<uid_name>.hifa_calimage.casa_pipescript.py, a full CASA pipeline script that reproduces all pipeline products and <mous_name>.hifa_calimage.casa_commands.log which contains all the equivalent casa commands run during the course of the pipeline processing, in particular the tclean commands to make the image products.
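The re-indexing of spws in the <uid_name>.ms.split.cal files noted above can be pictured as a simple lookup. The sketch below is illustrative only: the function name and the example spw ids are hypothetical, and it assumes the split keeps the science spws in ascending order of their original ids, which should be verified for your own data.

```python
def reindexed_spw_map(science_spws):
    """Map original science spw ids to the zero-based ids after the split.

    Assumes the split keeps the spws in ascending order of their original
    ids; this is an assumption of the sketch, not a documented guarantee.
    """
    return {orig: new for new, orig in enumerate(sorted(science_spws))}


# Hypothetical science spws, as they would appear in the weblog and cont.dat.
spw_map = reindexed_spw_map([17, 19, 21, 23])
```

With these example ids, spw 17 in the weblog would correspond to spw 0 in the split file, which is why weblog, flag template and cont.dat spw numbers cannot be applied to ms.split.cal files directly.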
STEP 2: Change to the directory that contains the calibrated data suitable for running pipeline imaging tasks (i.e. *.ms), called "calibrated/working" after the pipeline restore, and start CASA 6.6.1.
casa --pipeline
STEP 3: Run the following commands in CASA to copy the cont.dat file that contains the frequency ranges used to create the continuum images and the continuum subtraction, and the flag target template file (*.flagtargetstemplate.txt) for each execution which can be used to do science target specific flagging, to the directory you will be working in.
os.system('cp ../../calibration/cont.dat ./cont.dat')
os.system('cp ../../calibration/*.flagtargetstemplate.txt ./')
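The same copy can be done without shelling out, which makes it easier to see which files were actually found. This is a minimal sketch using only the Python standard library; the function name copy_reimaging_inputs is hypothetical, and the relative path in the usage note assumes the directory layout described in STEP 1.

```python
import glob
import os
import shutil


def copy_reimaging_inputs(calibration_dir, working_dir="."):
    """Copy cont.dat and every *.flagtargetstemplate.txt into working_dir.

    Returns the sorted list of file names that were copied.
    """
    patterns = ["cont.dat", "*.flagtargetstemplate.txt"]
    copied = []
    for pattern in patterns:
        for source in glob.glob(os.path.join(calibration_dir, pattern)):
            shutil.copy(source, working_dir)
            copied.append(os.path.basename(source))
    return sorted(copied)
```

From "calibrated/working", copy_reimaging_inputs('../../calibration') would return the copied names; an empty list suggests the calibration directory is not where the sketch expects it.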
Alternative to scriptForPI: Restore calibrated data in latest CASA (Cycle 5 and later data)
After retrieving and untarring the archive tarballs, one finds the following directories
- calibration/
- Contains the original pipeline calibration and flagging information.
- raw/
- Contains the raw asdm(s), with "asdm.sdm" appended.
Create a set of three directories anywhere, with these contents:
- products/
- this entire directory can be a symbolic link to your calibration/ directory retrieved from the archive, or you can copy the contents from that directory to this one. It will only be read, not written to. You will also need to copy the *pipeline_manifest.xml file from the script/ directory to this location.
- rawdata/
- this needs to contain links to all of the ASDMs in your raw/ directory retrieved from the archive, *with the "asdm.sdm" removed* e.g.
ln -s ..../raw/uid___A002_Xe3da01_X18fa.asdm.sdm uid___A002_Xe3da01_X18fa
- working/
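When there are many executions, the links in rawdata/ can be created in a loop rather than one ln -s at a time. The following is a sketch under the assumption that every ASDM in raw/ ends in ".asdm.sdm" as described above; the function name link_asdms is hypothetical.

```python
import glob
import os

SUFFIX = ".asdm.sdm"


def link_asdms(raw_dir, rawdata_dir):
    """Create one symlink per ASDM in rawdata_dir, dropping the '.asdm.sdm' suffix.

    Existing links are left untouched. Returns the list of link names.
    """
    os.makedirs(rawdata_dir, exist_ok=True)
    link_names = []
    for source in sorted(glob.glob(os.path.join(raw_dir, "*" + SUFFIX))):
        link_name = os.path.basename(source)[: -len(SUFFIX)]
        link_path = os.path.join(rawdata_dir, link_name)
        if not os.path.lexists(link_path):
            # Use an absolute target so the link works from any directory.
            os.symlink(os.path.abspath(source), link_path)
        link_names.append(link_name)
    return link_names
```

The returned names are the values to expect in the vis list of the restore script.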
In working/, start casa
casa --pipeline
Find script/member.[MOUS UID].hifa_[recipe].casa_piperestorescript.py in the materials from the archive. It should look like this
__rethrow_casa_exceptions = True
h_init()
try:
    hifa_restoredata(vis=['uid___A002_Xe3da01_X18fa'], session=['session_1'], ocorr_mode='ca')
finally:
    h_save()
If the script contains additional commands, e.g. "fixsyscaltimes" or "fixplanets", note that the import statements for these tasks have changed from 'from recipes.almahelpers import fixsyscaltimes' and 'from tasks import fixplanets' to 'from casarecipes.almahelpers import fixsyscaltimes' and 'from casatasks import fixplanets', respectively, for CASA versions 6.1 and above. You can run that script in working/:
execfile("member.uid___A001_X1465_X2182.hifa_calimage.casa_piperestorescript.py")
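If an older restore script still carries the pre-6.1 imports, they can be updated with a small text rewrite before the script is executed. The sketch below only covers the two import forms quoted above; the function name update_imports and the mapping are illustrative and not part of CASA.

```python
# Old import prefix -> replacement for CASA 6.1 and above (from the text above).
IMPORT_UPDATES = {
    "from recipes.almahelpers import": "from casarecipes.almahelpers import",
    "from tasks import": "from casatasks import",
}


def update_imports(script_text):
    """Return script_text with the old-style import lines rewritten."""
    updated_lines = []
    for line in script_text.splitlines():
        stripped = line.lstrip()
        for old, new in IMPORT_UPDATES.items():
            if stripped.startswith(old):
                # Preserve the original indentation of the line.
                indent = line[: len(line) - len(stripped)]
                line = indent + new + stripped[len(old):]
                break
        updated_lines.append(line)
    return "\n".join(updated_lines)
```

Lines that already use the new imports are left unchanged, so applying the function twice gives the same result as applying it once.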
CASA Imaging Pipeline Script with Automatically Chosen Parameters by Pipeline
The following script runs all necessary pipeline tasks to reproduce the imaging results produced by pipeline.
This example script will produce three types of images: multifrequency synthesis (specmode='mfs') image for each spw without continuum subtraction, continuum (specmode='cont') image for aggregate spws excluding line channels identified by hif_findcont task and linecube (specmode='cube') image for each spw. Depending on the data volume, the channel binning, image size, cell size, number of sources, and spectral windows for imaging can be mitigated by hif_checkproductsize.
# Be sure to edit mymss!
import os
import sys
__rethrow_casa_exceptions = True
mymss = ['uid___A002_Xbe2ed7_Xb524.ms']
for myms in mymss:
    if not os.path.exists(myms+'.flagversions'):
        print('Not found: '+myms+'.flagversions')
        sys.exit('ERROR: you must provide the flagversions files for all input MSs.')
try:
    # initialize the pipeline
    h_init()
    # load the data
    hifa_importdata(vis=mymss,dbservice=False)
    # if you do not have a check source, comment out these two stages:
    hif_makeimlist(intent='CHECK')
    hif_makeimages()
    # imageprecheck selects the robust parameter for tclean
    hifa_imageprecheck()
    # you can change these parameters (or comment out the step entirely)
    # based on your computing resources, and this
    # stage will mitigate the imaging parameters accordingly
    hif_checkproductsize(maxcubesize=40.0, maxcubelimit=100.0, maxproductsize=500.0)
    # splits out the target data
    hif_mstransform()
    # flag the target data
    hifa_flagtargets()
    # make a list of expected targets to be cleaned in mfs mode (used for
    # continuum subtraction)
    hif_makeimlist(specmode='mfs')
    # find continuum frequency ranges
    hif_findcont()
    # fit and subtract the continuum
    hif_uvcontsub()
    # make clean mfs images for the selected targets
    hif_makeimages()
    # make a list of expected targets to be cleaned in cont
    # (aggregate over all spws) mode, and make the images
    hif_makeimlist(specmode='cont')
    hif_makeimages()
    # make a list of expected targets to be cleaned in continuum subtracted
    # cube mode and make the images (comment out if you have cont-only data)
    hif_makeimlist(specmode='cube')
    hif_makeimages()
    # Selfcal: skip these stages if you know your target cannot be self-calibrated,
    # otherwise the pipeline will decide whether selfcal improves the data, and
    # remake the mfs, cont, and cube images if successful
    hif_selfcal()
    hif_makeimlist(specmode='mfs', datatype='selfcal')
    hif_makeimages()
    hif_makeimlist(specmode='cont', datatype='selfcal')
    hif_makeimages()
    hif_makeimlist(specmode='cube', datatype='selfcal')
    hif_makeimages()
    # export the images to fits files (only needed if one wants fits files)
    hifa_exportdata()
finally:
    h_save()
The relevant tasks for imaging pipeline reprocessing described in this CASA guide are hifa_importdata, hif_mstransform, hifa_flagtargets, hifa_imageprecheck, hif_checkproducts, hif_checkproductsize, hif_uvcontsub, hif_makeimlist, hif_makeimages.
Note 1: One of the important features of the ALMA pipeline is that it checks the final imaging product size and makes the necessary adjustments to the channel binning, cell size, image size and possibly the number of fields to be imaged. These are modified to avoid creating large images and cubes that take up significant computing resources and are not necessary for the user's science goals. The hif_checkproductsize task does this job, and it is included in all of the imaging example scripts below. We recommend that users copy the hif_checkproductsize call from the provided casa_pipescript.py without changing the parameters maxcubelimit, maxproductsize and maxcubesize. However, users can comment it out if they do not want this size mitigation, or they can explicitly specify the nbins, hm_imsize and hm_cell parameters in the hif_makeimlist task.
Note 2: hifa_imageprecheck calculates the synthesized beam and estimates the sensitivity for the aggregate bandwidth and representative bandwidth, for three values of the robust parameter. The best robust value is then chosen based on heuristics and used for subsequent imaging. Therefore, if a user wants to use a robust value of their own choice, hifa_imageprecheck should not be run.
Note 3: hifa_importdata will display QA notifications warning that the Flux catalog service is not being used and that the measurement set may already be processed (see screenshot). This is expected - in operations the best flux catalog values must be used, but since you are not going to use the flux densities for anything here, and the amplitude scale of the data has already been calibrated and all calibrations were applied, you can ignore these messages. You should however delete the flux.csv that gets created, if you are going to run any pipeline calibration steps later, and instead use the flux.csv that was delivered with your dataset.
Note 4: If the user is not satisfied with the clean mask from the automasking algorithm, it is possible to change the automasking parameters in hif_makeimages by setting pipelinemode='interactive'. For detailed instructions and guides for using automasking, please consult the link below.
For reference, the description of pipeline tasks for interferometric and single dish data reduction can be found in the Pipeline Reference Manual
Common Re-imaging Examples
Next, choose the example below that best fits your use case. Due to the need to preserve the indentation of the python commands, the examples will work best if you copy the entire block of python commands (orange-shaded regions) for a particular example into its own python script, check that the indentation is preserved, edit the USER SET INPUTS section, and then execute the file.
Restore Pipeline Continuum Subtraction and Manually Make Image Products
Re-determine and Apply Pipeline Continuum Subtraction using Pipeline Tasks
The following script splits off the calibrated science target data for all spws and fields for each execution, applies any flagging commands found in the <uid_name>_flagtargetstemplate.txt file(s) (one for each execution), uses the existing cont.dat file to fit and subtract the continuum emission, leaving the result in the CORRECTED column. Before running this script, you can manually modify both the <uid_name>_flagtargetstemplate.txt file(s) and cont.dat file to add flag commands or change the cont.dat frequency ranges. Once you're happy with the script, you can run it in a CASA session (that was started with the --pipeline option) using execfile(script_name).
## Edit the USER SET INPUTS section below and then execute
## this script (note it must be in the 'calibrated/working' directory).
import glob
import os
__rethrow_casa_exceptions = True
pipelinemode='automatic'
context = h_init()
###########################################################
## USER SET INPUTS
## Select a title for the weblog
context.project_summary.proposal_code='Restore Continuum Subtraction'
## Delete uid*_targets.ms and flagversions if it exists
os.system('rm -rf uid*_targets.ms')
os.system('rm -rf uid*_targets.ms.flagversions')
os.system('rm -rf uid*_targets_line.ms')
os.system('rm -rf uid*_targets_line.ms.flagversions')
############################################################
## Make a list of all uv-datasets appended with *.ms
MyVis=glob.glob('*.ms')
try:
    ## Load the *.ms files into the pipeline
    hifa_importdata(vis=MyVis,dbservice=False,pipelinemode=pipelinemode)
    ## Split off the science target data into its own ms (called
    ## *targets.ms) and apply science target specific flags
    hif_mstransform(pipelinemode=pipelinemode)
    hifa_flagtargets(pipelinemode=pipelinemode)
    ## Fit and subtract the continuum using the cont.dat for all spws all fields
    hif_uvcontsub(pipelinemode=pipelinemode)
finally:
    h_save()
Make Images Manually
At this point you will have created a *targets.ms for each execution of your SB. Each of these measurement sets contains the original calibrated continuum + line data in the DATA column and the calibrated continuum subtracted data in the CORRECTED column. The CASA imaging task tclean (which is used by the ALMA Pipeline) allows the user to select which column to use for imaging. tclean also accepts a list for the vis parameter, so it is not necessary to concat the data before imaging.
NOTE: If you think you might want to self-calibrate your data using either the continuum or line emission it is ESSENTIAL that you first split off the column that you want to operate on before imaging. Otherwise, the CORRECTED column containing the continuum subtracted data will be overwritten when applycal is run during the self-calibration process.
To manually clean your data at this stage, there are two options:
- Use modified versions of the relevant tclean commands from the "logs/<MOUS_name>.hifa_calimage.casa_commands.log". These are the exact commands originally run by the imaging pipeline to produce your imaging products.
- They will contain within them the frequency ranges (from the cont.dat) used for making the various images.
- There will be two tclean commands per image product, the first with an image name containing iter0 only makes a dirty image, while the second with iter1 makes a cleaned image.
- For example to make the aggregate continuum image but with interactive clean masking, simply copy the corresponding iter1 command (it will contain all of the spw numbers in its name), but set interactive=True, calcres=True, calcpsf=True, restart=False. Additionally set mask=. If you are using the *.targets.ms file(s) you can keep datacolumn='DATA'.
- Note that if you are trying to save the model, i.e. for self-calibration, you must also set savemodel='modelcolumn' (or virtual). Also be aware that exiting from interactive clean using the Red X symbol in the interactive viewer does not save the model in 4.7.0 tclean. To fill the model after stopping this way, rerun the same clean command (being careful not to remove existing files) except set restart=True, calcpsf=False, calcres=False, niter=0, interactive=False. This re-run only takes a couple of minutes with these settings.
- If you have split off the data of interest for self-calibration (as recommended above), you will first need to image the datacolumn='DATA'. After applying a self-calibration table, you will want to image the datacolumn='CORRECTED'. This should happen by default in typical data reduction use cases since TCLEAN defaults to using the CORRECTED column (when it exists) for imaging, and automatically falls back to the DATA column (if it does not exist).
- Use examples on the casaguide page TCLEAN_and_ALMA to formulate your own special purpose commands.
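The parameter changes listed in the first option can be kept together as small override dictionaries, so that the command copied from casa_commands.log is not edited by hand each time. This is only a sketch: the helper with_overrides and the stand-in iter1_kwargs (including its image name and niter) are hypothetical, and the real values should be taken from your own casa_commands.log. The override values themselves are the ones quoted in the list above.

```python
# Overrides quoted from the list above; all keys are tclean parameter names.
INTERACTIVE_OVERRIDES = dict(interactive=True, calcres=True, calcpsf=True,
                             restart=False)
MODEL_FILL_OVERRIDES = dict(restart=True, calcpsf=False, calcres=False,
                            niter=0, interactive=False)


def with_overrides(pipeline_kwargs, overrides, save_model=False):
    """Return a copy of the pipeline tclean arguments with overrides applied."""
    kwargs = dict(pipeline_kwargs)
    kwargs.update(overrides)
    if save_model:
        # Needed when the model is to be used for self-calibration.
        kwargs["savemodel"] = "modelcolumn"
    return kwargs


# Hypothetical stand-in for an iter1 command copied from casa_commands.log.
iter1_kwargs = dict(vis=["uid___A002_Xbe2ed7_Xb524_targets.ms"],
                    imagename="example.cont.I.iter1",
                    specmode="mfs", niter=1000, interactive=False)

interactive_kwargs = with_overrides(iter1_kwargs, INTERACTIVE_OVERRIDES,
                                    save_model=True)
refill_kwargs = with_overrides(interactive_kwargs, MODEL_FILL_OVERRIDES)
# Inside CASA one would then call tclean(**interactive_kwargs) and, if the
# viewer was closed with the red X, tclean(**refill_kwargs) to fill the model.
```

Building the second call from the first keeps the image name and data selection identical, which is what the model-filling re-run relies on.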
Revise the Continuum Ranges (cont.dat) Before Pipeline Continuum Subtraction and Remake Pipeline Images
This example uses the pipeline imaging tasks to remake the pipeline imaging products for one spw (17 in the example) after manually editing the cont.dat file.
## Edit the cont.dat file(s) for the spw(s) you want
## to change the continuum subtraction for. In this example
## spw 17 was changed.
## Edit the USER SET INPUTS section below and then execute
## this script (note it must be in the 'calibrated/working' directory).
import glob
import os
__rethrow_casa_exceptions = True
pipelinemode='automatic'
context = h_init()
###########################################################
## USER SET INPUTS
## Select a title for the weblog
context.project_summary.proposal_code = 'NEW CONTSUB'
## Delete uid*_targets.ms and flagversions if it exists
os.system('rm -rf uid*_targets.ms')
os.system('rm -rf uid*_targets.ms.flagversions')
os.system('rm -rf uid*_targets_line.ms')
os.system('rm -rf uid*_targets_line.ms.flagversions')
## Select spw(s) that have new cont.dat parameters
## If all spws have changed use MySpw=''
MySpw='17'
############################################################
## Make a list of all uv-datasets appended with *.ms
MyVis=glob.glob('*.ms')
try:
    ## Load the *.ms files into the pipeline
    hifa_importdata(vis=MyVis,dbservice=False,pipelinemode=pipelinemode)
    ## Split off the science target data into its own ms (called
    ## *targets.ms) and apply science target specific flags
    hif_mstransform(pipelinemode=pipelinemode)
    hifa_flagtargets(pipelinemode=pipelinemode)
    ## Fit and subtract the continuum using revised cont.dat for all spws
    hif_makeimlist(specmode='mfs',spw=MySpw)
    hif_uvcontsub(pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## calculate the synthesized beam and estimate the sensitivity
    ## for the aggregate bandwidth and representative bandwidth
    ## for three values of the robust parameter.
    hifa_imageprecheck(pipelinemode=pipelinemode)
    ## check the imaging product size and adjust the relevant
    ## imaging parameters (channel binning, cell size and image size)
    ## User can comment this out if they don't want size mitigation.
    hif_checkproductsize(maxproductsize=350.0, maxcubesize=40.0, maxcubelimit=60.0)
    ## Make new aggregate cont
    hif_makeimlist(specmode='cont',pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## Make new continuum subtracted cube for revised spw(s)
    hif_makeimlist(specmode='cube',spw=MySpw,pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## Export new images to fits format if desired.
    hifa_exportdata(pipelinemode=pipelinemode)
finally:
    h_save()
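Before rerunning with an edited cont.dat, it can help to print the ranges the file actually holds for the revised spw. The reader below is a rough sketch, not a pipeline utility: it assumes a layout of "Field:" and "SpectralWindow:" header lines followed by one frequency range per line (written with a "~"), which should be checked against your own cont.dat. The function name and the sample values are hypothetical.

```python
def parse_cont_dat(text):
    """Parse cont.dat text into {field: {spw_id: [range lines]}}.

    Assumes 'Field:' and 'SpectralWindow:' header lines followed by one
    frequency range per line; any other line is ignored.
    """
    ranges = {}
    field = None
    spw = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Field:"):
            field = line.split(":", 1)[1].strip()
            ranges[field] = {}
            spw = None
        elif line.startswith("SpectralWindow:"):
            # Keep only the first token in case a spw name follows the id.
            tokens = line.split(":", 1)[1].split()
            spw = tokens[0] if tokens else None
            if field is not None and spw is not None:
                ranges[field][spw] = []
        elif "~" in line and field is not None and spw is not None:
            ranges[field][spw].append(line)
    return ranges


# Hypothetical file content, for illustration only.
SAMPLE = """Field: CoolSource1

SpectralWindow: 17
224.8350~225.4000GHz LSRK
225.9000~226.1000GHz LSRK

SpectralWindow: 19
230.1000~231.0000GHz LSRK
"""

parsed = parse_cont_dat(SAMPLE)
```

Here parsed['CoolSource1']['17'] holds the two range lines for spw 17, which can be compared against the intended edit before the pipeline tasks are run.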
Restore Pipeline Continuum Subtraction for Subset of SPWs and Fields and Use Channel Binning for Pipeline Imaging of Cubes
Using Pipeline Tasks
This example uses the pipeline imaging tasks to remake the cubes for a subset of spws and fields with channel binning and a more naturally-weighted Briggs robust parameter.
## Edit the USER SET INPUTS section below and then execute
## this script (note it must be in the 'calibrated/working' directory).
import glob
import os
__rethrow_casa_exceptions = True
pipelinemode='automatic'
context = h_init()
###########################################################
## USER SET INPUTS
## Select a title for the weblog
context.project_summary.proposal_code = 'SUBSET CUBE IMAGING'
## Delete uid*_targets.ms and flagversions if it exists
os.system('rm -rf uid*_targets.ms')
os.system('rm -rf uid*_targets.ms.flagversions')
os.system('rm -rf uid*_targets_line.ms')
os.system('rm -rf uid*_targets_line.ms.flagversions')
## Select spw(s) to image and channel binning for each specified
## MySpw. All spws listed in MySpw must have a corresponding MyNbins
## entry, even if it is 1 for no binning.
MySpw='17,23'
MyNbins='17:8,23:2'
## Select subset of fields to image. MUST be field name, index will not work.
## To select all fields, set MyFields=''
MyFields='CoolSource1,CoolSource2'
## Select Briggs Robust factor for data weighting (affects angular
## resolution of images)
MyRobust=1.5
############################################################
## Make a list of all uv-datasets appended with *.ms
MyVis=glob.glob('*.ms')
try:
    ## Load the *.ms files into the pipeline
    hifa_importdata(vis=MyVis, dbservice=False, pipelinemode=pipelinemode)
    ## Split off the science target data into its own ms (called
    ## *targets.ms) and apply science target specific flags
    ## In this example we split off all science targets and science
    ## spws, however hif_mstransform could also contain the spw and field
    ## selections
    hif_mstransform(pipelinemode=pipelinemode)
    hifa_flagtargets(pipelinemode=pipelinemode)
    ## Fit and subtract the continuum using existing cont.dat
    ## for selected spws and fields only.
    hif_makeimlist(specmode='mfs')
    hif_uvcontsub(spw=MySpw,field=MyFields,pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## calculate the synthesized beam and estimate the sensitivity
    ## for the aggregate bandwidth and representative bandwidth
    ## for three values of the robust parameter.
    ## Don't need to run this task if you will use a different robust value anyway.
    ## hifa_imageprecheck(pipelinemode=pipelinemode)
    ## check the imaging product size and adjust the relevant
    ## imaging parameters (channel binning, cell size and image size)
    ## User can comment this out if they don't want size mitigation.
    hif_checkproductsize(maxproductsize=350.0, maxcubesize=40.0, maxcubelimit=60.0)
    ## Make new continuum subtracted cube for selected spw(s) and fields
    hif_makeimlist(specmode='cube',spw=MySpw,nbins=MyNbins,field=MyFields,robust=MyRobust, pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## Export new images to fits format if desired.
    hifa_exportdata(pipelinemode=pipelinemode)
finally:
    h_save()
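Because every spw listed in MySpw needs a matching entry in MyNbins, it can be safer to build the binning string from a dictionary and fail early when an entry is missing. The helper below is a hypothetical convenience, not a pipeline function; it only produces the 'spw:nbin' string format used in the script above.

```python
def build_nbins(spw_selection, bins_per_spw):
    """Build the 'spw:nbin,...' string for the spws listed in spw_selection.

    Raises ValueError if any selected spw has no binning entry.
    """
    spws = [spw.strip() for spw in spw_selection.split(",") if spw.strip()]
    missing = [spw for spw in spws if spw not in bins_per_spw]
    if missing:
        raise ValueError("No nbins entry for spw(s): " + ", ".join(missing))
    return ",".join(f"{spw}:{bins_per_spw[spw]}" for spw in spws)


MySpw = "17,23"
# Use 1 for any spw that should not be binned.
MyNbins = build_nbins(MySpw, {"17": 8, "23": 2})
```

The resulting MyNbins string matches the hand-written value in the USER SET INPUTS section and can be passed to hif_makeimlist in the same way.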
Remake images with uvtaper
This example uses the pipeline imaging tasks to remake the pipeline imaging products with uvtaper.
## Edit the USER SET INPUTS section below and then execute
## this script (note it must be in the 'calibrated/working' directory).
import glob
import os
__rethrow_casa_exceptions = True
pipelinemode='automatic'
context = h_init()
###########################################################
## USER SET INPUTS
## Select a title for the weblog
context.project_summary.proposal_code = 'NEW IMAGE WITH UVTAPER'
## Delete uid*_targets.ms and flagversions if it exists
os.system('rm -rf uid*_targets.ms')
os.system('rm -rf uid*_targets.ms.flagversions')
os.system('rm -rf uid*_targets_line.ms')
os.system('rm -rf uid*_targets_line.ms.flagversions')
## uvtaper is most useful with robust = +2.0, which corresponds to natural weighting
MyRobust=2.0
############################################################
## Make a list of all uv-datasets appended with *.ms
MyVis=glob.glob('*.ms')
try:
    ## Load the *.ms files into the pipeline
    hifa_importdata(vis=MyVis,dbservice=False,pipelinemode=pipelinemode)
    ## Split off the science target data into its own ms (called
    ## *targets.ms) and apply science target specific flags
    hif_mstransform(pipelinemode=pipelinemode)
    hifa_flagtargets(pipelinemode=pipelinemode)
    ## Fit and subtract the continuum using revised cont.dat for all spws
    hif_uvcontsub(pipelinemode=pipelinemode)
    ## calculate the synthesized beam and estimate the sensitivity
    ## for the aggregate bandwidth and representative bandwidth
    ## for three values of the robust parameter.
    ## Don't need to run this task if you will use a different robust value anyway.
    ## hifa_imageprecheck(pipelinemode=pipelinemode)
    ## check the imaging product size and adjust the relevant
    ## imaging parameters (channel binning, cell size and image size)
    ## User can comment this out if they don't want size mitigation.
    hif_checkproductsize(maxproductsize=350.0, maxcubesize=40.0, maxcubelimit=60.0)
    ## Make new aggregate cont
    hif_makeimlist(specmode='cont',robust=MyRobust, uvtaper=['1arcsec'], pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## Make new continuum subtracted cube
    hif_makeimlist(specmode='cube', robust=MyRobust, uvtaper=['1arcsec'], pipelinemode=pipelinemode)
    hif_makeimages(pipelinemode=pipelinemode)
    ## Export new images to fits format if desired.
    hifa_exportdata(pipelinemode=pipelinemode)
finally:
    h_save()