ALMA Imaging Pipeline Reprocessing Tool
Cycle Compatibility
In Cycle 9, the ALMA pipeline adopted a new naming convention for measurement sets: uid*targets.ms for the continuum + line (non-continuum-subtracted) target-only data, and uid*targets_line.ms for the continuum-subtracted data. Data restored with a scriptForPI.py from before Cycle 9 use the incompatible uid*target.ms naming and must be renamed to uid*targets.ms to work with the scripts provided here.
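The rename described above can be scripted. The following is a minimal sketch, assuming that renaming the measurement set directory is all that is required (the text above does not say whether anything else must change). The helper name rename_legacy_targets is hypothetical and is not part of the delivered scripts.

```python
import os


def rename_legacy_targets(directory="."):
    """Rename uid*target.ms directories to the Cycle 9 uid*targets.ms form.

    Hypothetical helper. Returns a list of (old_path, new_path) pairs, one
    for every rename that was performed.
    """
    suffix = "target.ms"
    renamed = []
    for name in sorted(os.listdir(directory)):
        # Only the legacy pattern matches: uid*targets.ms does not end in "target.ms".
        if not (name.startswith("uid") and name.endswith(suffix)):
            continue
        old_path = os.path.join(directory, name)
        new_path = os.path.join(directory, name[: -len(suffix)] + "targets.ms")
        if os.path.exists(new_path):
            # Never overwrite an existing measurement set.
            continue
        os.rename(old_path, new_path)
        renamed.append((old_path, new_path))
    return renamed
```

Calling rename_legacy_targets("working") would rename every matching measurement set in working/ and leave any that already follow the Cycle 9 convention untouched.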
Preparing Data
Starting in Cycle 9, the North American ARC began providing restored calibrated data as a value-added product in a calibrated_final/ directory structure. In value-added deliveries from previous cycles, the continuum + line data were split and concatenated into the delivered calibrated_final.ms. That measurement set could no longer be used with pipeline tasks, and its spectral window numbering was lost, which made it difficult to compare against the pipeline weblog.
In the new delivery structure, detailed below, all uid*targets.ms files are held within calibrated_final/measurement_sets, allowing the use of pipeline tasks and making for easier comparison with the delivered ALMA calibration + imaging pipeline weblog. An equivalent calibrated_final.ms can be created via 'concat' in CASA if this is desired.
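As an illustration of the 'concat' route, the sketch below gathers the uid*targets.ms files and hands them to the CASA concat task through its vis and concatvis parameters. The helper name concat_targets is hypothetical, and the task is passed in as an argument so the function does not assume how CASA was started (inside a CASA session it would be concat; with modular CASA it is assumed to be casatasks.concat).

```python
import glob
import os


def concat_targets(ms_dir, concat_task, output_name="calibrated_final.ms"):
    """Concatenate every uid*targets.ms in ms_dir into one measurement set.

    Hypothetical helper. concat_task is the CASA concat task, supplied by
    the caller. Returns the path of the concatenated measurement set.
    """
    # uid*targets.ms does not match uid*targets_line.ms, so only the
    # non-continuum-subtracted data are selected.
    vis = sorted(glob.glob(os.path.join(ms_dir, "uid*targets.ms")))
    if not vis:
        raise FileNotFoundError("no uid*targets.ms found in " + ms_dir)
    concatvis = os.path.join(ms_dir, output_name)
    if os.path.exists(concatvis):
        raise FileExistsError(concatvis + " already exists; will not overwrite")
    concat_task(vis=vis, concatvis=concatvis)
    return concatvis
```

From a CASA prompt started in calibrated_final/, this could be called as concat_targets("measurement_sets", concat).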
calibrated_final/                      # downloaded as calibrated_final.tgz
    caltables/                         # holds relevant calibration information for continuum subtraction
        - cont.dat                     # contains the identified continuum ranges from 'findcont' in the ALMA pipeline
        - uid*SOURCE.uvcont.tbl        # the tables for uv continuum subtraction, generated from 'uvcontfit'
    measurement_sets/                  # holds all measurement sets
        - uid*targets.ms               # the non-continuum-subtracted measurement sets, per execution block
    - scriptForReprocessing.py         # the tool described on this page
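Before running the tool, it can help to confirm that a directory matches the structure listed above. The following is a minimal sketch of such a check; the function name check_layout is hypothetical, and treating all four items as required is an assumption, since not every option of the tool necessarily needs every file.

```python
import glob
import os


def check_layout(root="calibrated_final"):
    """Return a list of problems found; an empty list means the layout looks complete.

    Hypothetical helper, not part of the delivered scripts.
    """
    problems = []
    caltables = os.path.join(root, "caltables")
    ms_dir = os.path.join(root, "measurement_sets")
    if not os.path.isfile(os.path.join(caltables, "cont.dat")):
        problems.append("missing caltables/cont.dat")
    if not glob.glob(os.path.join(caltables, "uid*uvcont.tbl")):
        problems.append("no uid*uvcont.tbl in caltables/")
    if not glob.glob(os.path.join(ms_dir, "uid*targets.ms")):
        problems.append("no uid*targets.ms in measurement_sets/")
    if not os.path.isfile(os.path.join(root, "scriptForReprocessing.py")):
        problems.append("missing scriptForReprocessing.py")
    return problems
```

Printing the result of check_layout() from the directory that contains calibrated_final/ lists anything that still needs to be put in place.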
Data downloaded from the ALMA Science Archive must first be restored using scriptForPI.py and then placed into a compatible directory structure to work with the scriptForReprocessing.py imaging tool. Run the script reprocessing_prep.py below to do this.
# run this script within the working/ directory to create a calibrated_final/ directory, mirroring the NA added value delivery structure.
# once calibrated_final/ is created, place scriptForReprocessing.py in calibrated_final/ and follow the scriptForReprocessing.py instructions
import glob
import os
import sys
# Check if calibrated_final/ already exists:
if glob.glob("calibrated_final"):
    print("ERROR: calibrated_final/ already exists; will not overwrite")
    sys.exit(1)
else:
    os.mkdir("calibrated_final")
# Fill the caltables
os.mkdir("calibrated_final/caltables")
os.system("cp -rf cont.dat calibrated_final/caltables")
os.system("cp -rf *uvcont.tbl calibrated_final/caltables")
# Fill the measurement_sets
os.mkdir("calibrated_final/measurement_sets")
# First try just uid*targets.ms
os.system("cp -rf uid*targets.ms calibrated_final/measurement_sets/")
# Then try uid*targets_line.ms
os.system("cp -rf uid*targets_line.ms calibrated_final/measurement_sets/")
print("Generated calibrated_final/ and filled caltables/ and measurement_sets/. Please place scriptForReprocessing.py in calibrated_final/ and follow README instructions.")
About This Tool
This guide presents examples for refining the interferometric imaging products from the ALMA Cycle 9 Pipeline. If your data were manually imaged by ALMA, you should instead consult the scriptForImaging.py delivered with your data.
The section "Restore Pipeline Calibration and Prepare for Re-imaging" describes the first steps. The sections that follow are self-contained, although they typically assume that the "Restore" step has been performed. This guide also illustrates how to re-run the pipeline from beginning to end in order to reproduce the pipeline run done at your ARC.
Additional documentation on the Cycle 9 pipeline is available in the Pipeline User's Guide, which can be found at the ALMA Science Portal. The User's Guide describes how to obtain the ALMA Pipeline, how to use it to calibrate and image ALMA interferometric (IF) and single-dish (SD) data, and the Pipeline WebLog.
Note that the scripts described in this guide have only been tested on Linux.
About This Tool - scriptForReprocessing.py
scriptForReprocessing.py is intended to be a convenient wrapper for many of the ALMA pipeline functions that users may wish to use on their NA-delivered value-added products. See the ALMA Pipeline User's Guide and Reference Manual for a full description of the ALMA pipeline: https://almascience.nrao.edu/processing/science-pipeline
The script can be run with any version of CASA that includes the ALMA pipeline; see the link above for the mapping between ALMA Cycle, CASA version, and Pipeline version. Launch it as:
$ casa --pipeline -c scriptForReprocessing.py [options]
optional arguments:
  -h, --help            show this help message and exit
  --contsub             Fit and subtract continuum using the channel ranges from the
                        local cont.dat file. Generates new *uvcont.tbl tables in
                        working_reprocess/ directory and uid*targets_line.ms in
                        measurement_sets/
  --contsub_fast        Continuum subtract data via uvcontsub and the local
                        *uvcont.tbl files, but only using the CASA commands rather
                        than pipeline calls (faster). Generates uid*targets_line.ms
                        in measurement_sets/
  --image [IMAGE]       Run the imaging pipeline and place images in the specified
                        directory (default='images'). NOTE: unless cont.dat or the
                        imaging options in this script are modified, the images
                        produced will be identical to those on the ALMA Science
                        Archive
  --cleanup             Remove working_reprocess/ directory and log files after any
                        other options are executed. WARNING: removes weblogs inside
                        of working_reprocess/
  --weblog [WEBLOG]     Launches a browser to view weblog after other tasks are run.
                        By default ('latest'), displays the latest weblog generated
                        locally. Other options are to use the specific pipeline
                        folder name (e.g. 'pipeline-20221010T192458')
  --calibrated_final    Concatenate uid*targets.ms to produce calibrated_final.ms in
                        measurement_sets/
  --calibrated_final_line
                        Concatenate uid*targets_line.ms (if they exist) to produce
                        calibrated_final_line.ms in measurement_sets/
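For scripted or batch use, the command line can be assembled programmatically. The sketch below only builds the argument list from the options in the help text above; build_command is a hypothetical helper, and the order in which the script executes combined options is decided by the script itself, not by this list.

```python
def build_command(contsub=False, contsub_fast=False, image=None, cleanup=False,
                  weblog=None, calibrated_final=False, calibrated_final_line=False,
                  script="scriptForReprocessing.py"):
    """Return the argument list for a scriptForReprocessing.py run.

    Hypothetical helper. For image and weblog: None skips the option, True
    passes the bare flag (so the script's own default applies), and a string
    is passed as the option value.
    """
    cmd = ["casa", "--pipeline", "-c", script]
    switches = [
        ("--contsub", contsub),
        ("--contsub_fast", contsub_fast),
        ("--calibrated_final", calibrated_final),
        ("--calibrated_final_line", calibrated_final_line),
        ("--cleanup", cleanup),
    ]
    cmd += [flag for flag, enabled in switches if enabled]
    for flag, value in (("--image", image), ("--weblog", weblog)):
        if value is None or value is False:
            continue
        cmd.append(flag)
        if value is not True:
            cmd.append(str(value))
    return cmd
```

For example, build_command(contsub=True, image="my_images") yields the list for a continuum subtraction followed by imaging into my_images/ (a hypothetical directory name). The list can be passed to subprocess.run(cmd, check=True) from within calibrated_final/.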
Suggested Workflows
A number of workflows are supported in the new delivery format:
- You can proceed with your scientific analysis starting with the uid*targets.ms files and supply them to CASA tasks such as tclean, uvcontsub, or gaincal as a list (vis=['MS1.ms', 'MS2.ms', ...]). The CASA commands shown for each stage of the delivered ALMA calibration + imaging weblog give examples of this (e.g. you can get the tclean command for any image that was made by clicking within the relevant hif_makeimages() stage).
- You can use scriptForReprocessing.py to restore the continuum subtracted data, re-image the data in the ALMA pipeline using new imaging parameters, or view the weblog (see below for usage). Here you can also easily modify cont.dat and rerun the continuum subtraction and/or imaging with a different continuum selection.
- You can generate the old-style calibrated_final.ms either using scriptForReprocessing.py or by hand via concat(). If you use scriptForReprocessing.py, there is also an option to generate an analogous calibrated_final_line.ms.
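For the first workflow, the list passed as vis can be built from the contents of measurement_sets/ rather than typed by hand. The following is a minimal sketch; select_vis is a hypothetical helper, and the choice between continuum + line and continuum-subtracted data is exposed as a simple flag.

```python
import glob
import os


def select_vis(ms_dir="measurement_sets", line=False):
    """Return the sorted list of measurement sets to pass as vis=[...].

    Hypothetical helper. line=False selects uid*targets.ms (continuum +
    line); line=True selects uid*targets_line.ms (continuum-subtracted).
    """
    pattern = "uid*targets_line.ms" if line else "uid*targets.ms"
    vis = sorted(glob.glob(os.path.join(ms_dir, pattern)))
    if not vis:
        raise FileNotFoundError("no " + pattern + " found in " + ms_dir)
    return vis
```

Inside CASA, the result can be supplied directly to a task, for example as the vis argument of a tclean command copied from the weblog.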