About This Guide
Most recently updated for CASA Version 6.6.1 using Python 3.8
This guide describes some examples for creating and perfecting the Total Power (TP) imaging products using the ALMA Cycle 11 Pipeline, for both pipeline-calibrated and manually calibrated data.
If your data were manually imaged by ALMA, you should instead consult the scriptForImaging.py delivered with your data.
The section Restore Pipeline Calibration and Prepare for Re-imaging describes the first steps to take. After that, the individual sections are self-contained (and they typically assume the "Restore" has been performed). This guide also illustrates how to completely re-run the pipeline from beginning to end in order to reproduce the pipeline run done at your ARC.
Additional documentation on the Cycle 11 pipeline can be found in the Pipeline User's Guide, available at the ALMA Science Portal. The User's Guide describes how to obtain the ALMA Pipeline, how to use it to calibrate and image ALMA interferometric (IF) and single-dish (SD) data, and the contents of the Pipeline WebLog.
Note that the scripts described in this guide have only been tested under Linux (RedHat 8) with Python 3.8. Pipeline scripts for CASA versions earlier than 5.6.x were not written in Python 3 and may not work properly.
Getting and Starting CASA
If you do not already have CASA installed on your machine, you will have to download and install it.
Download and installation instructions are available here:
http://casa.nrao.edu/casa_obtaining.shtml
CASA 6.6.1.17 is required to reprocess ALMA Cycle 11 data using the scripts in this guide (pipeline-2024.1.0.8).
NOTE: To use pipeline tasks, you must start CASA with
casa --pipeline
Restore Pipeline Calibration and Prepare for Re-imaging
STEP 1: Follow the instructions in your QA2 report for restoring pipeline-calibrated data using *scriptForPI.py. In general, scriptForPI.py is only compatible with CASA versions similar to the one used for its creation; see the table at https://almascience.org/processing/science-pipeline for details. To run *scriptForPI.py, change to the "script" folder containing the script files and run the following command (modify <uid_name> accordingly):
execfile('member.<uid_name>.scriptForPI.py')
If you simply want to reproduce the delivered products without any changes, just run scriptForPI.py. If the script folder contains member.<uid_name>.casa_piperestorescript.py, scriptForPI.py stops at the calibration stage, without atmospheric correction or baseline subtraction; follow the procedures in the ALMA Pipeline User's Guide to obtain the final datasets with atmospheric correction and baseline subtraction applied.
If the script folder does not contain member.<uid_name>.casa_piperestorescript.py but does contain member.<uid_name>.casa_pipescript.py, scriptForPI.py uses the latter script. In that case, all of the stages included in casa_pipescript.py are run, including atmospheric model correction and baseline subtraction.
In this document, we focus on modifying the pipeline products. We therefore recommend running scriptForPI.py without casa_piperestorescript.py so that all of the intermediate files are produced (<uid_name>.ms, <uid_name>.ms.atmcor.atmtypeX, and <uid_name>.ms.atmcor.atmtype1_bl; see the ALMA Pipeline User's Guide for the suffix conventions).
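One hedged way to do this, assuming the standard delivery layout, is to set the restore script aside so that scriptForPI.py falls back to casa_pipescript.py (the .bak name is an arbitrary choice):
# Run inside CASA from the script/ directory; <uid_name> as in your delivery.
os.system('mv member.<uid_name>.casa_piperestorescript.py member.<uid_name>.casa_piperestorescript.py.bak')
execfile('member.<uid_name>.scriptForPI.py')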
Once completed, the following files and directories will be present, with notes specific to pipeline re-imaging. More information on the structure can be found in the ALMA Archival Data Primer.
- calibrated/
- This directory contains a file called <uid_name>.ms.split.cal for each execution in the MOUS, along with products/, working/, and rawdata/ subdirectories.
- If you run the script without casa_piperestorescript.py, only the working subdirectory is created and all data is stored in this subdirectory.
- calibration/
- This directory contains auxproducts.tgz, auxcaltables.tgz, caltables.tgz, and flagversions.tgz. The text files (.txt) contain information about the applied commands.
- product/
- This directory contains the original image products (FITS format).
- qa/
- This directory contains the original WebLog and the QA0 and QA2 reports. The QA reports contain summaries of the scheduling block (SB) and of the calibration and imaging results.
- If you want to obtain the same results as in the QA2 report, check the CASA version listed in the reports and run scriptForPI.py with that same version.
- raw/
- This directory contains the raw asdm.sdm(s).
- script/
- This directory contains the file scriptForPI.py (named member.<uid_name>.scriptForPI.py), which internally runs member.<uid_name>.hsd_calimage.casa_piperestorescript.py and other necessary tasks to restore the data.
- The folder also contains member.<uid_name>.hsd_calimage.casa_pipescript.py, a full CASA pipeline script that reproduces all pipeline products.
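For orientation, the delivery tree sketched from the list above looks roughly like this (abbreviated; exact contents vary by delivery):
calibrated/
    working/        <uid_name>.ms and other intermediate files
calibration/        cont.dat, *.tgz, *.txt
product/            delivered FITS images
qa/                 WebLog, QA0 and QA2 reports
raw/                raw asdm.sdm data
script/             scriptForPI.py, casa_pipescript.py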
STEP 2: Change to the directory containing the calibrated data suitable for running pipeline imaging tasks (i.e., *.ms), called "calibrated/working" after the pipeline restore, and start CASA 6.6.1.
Copy member.<uid_name>.casa_pipescript.py into this directory, then start CASA:
casa --pipeline
STEP 3: Run the following commands in CASA to copy, into the directory you will be working in, the cont.dat file (the frequency ranges used to create the continuum images and perform the continuum subtraction) and the flag target template files (*.flagtargetstemplate.txt, one for each execution, which can be used for science-target-specific flagging):
os.system('cp ../../calibration/cont.dat ./cont.dat')
os.system('cp ../../calibration/*.flagtargetstemplate.txt ./')
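If you want to confirm the files arrived in your working directory, a quick check (nothing here is pipeline-specific) is:
import glob
print(glob.glob('cont.dat'))                   # expect one file
print(glob.glob('*.flagtargetstemplate.txt'))  # expect one file per execution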
Alternative to scriptForPI: Restore calibrated data in the latest CASA (Cycle 5 and later data): You can also use the casa_piperestorescript.py found in the script directory to rerun the pipeline in a newer version of CASA. Further instructions can be found in Section 5.3 of the ALMA Pipeline User's Guide.
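As a minimal sketch, assuming a member.<uid_name>.casa_piperestorescript.py was delivered and you have prepared the directories as described in Section 5.3 of that guide:
casa --pipeline
execfile('member.<uid_name>.casa_piperestorescript.py')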
To change the applied atmospheric model (hsd_atmcor stage)
The following script runs all of the pipeline tasks necessary to reproduce the imaging results produced by the pipeline.
This example script produces three types of images: a multifrequency synthesis (specmode='mfs') image for each spw without continuum subtraction, a continuum (specmode='cont') image for the aggregate spws excluding the line channels identified by the hif_findcont task, and a line cube (specmode='cube') image for each spw. Depending on the data volume, the channel binning, image size, cell size, number of sources, and spectral windows for imaging may be mitigated by hif_checkproductsize.
# Be sure to edit mymss!
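The delivered member.<uid_name>.hsd_calimage.casa_pipescript.py is the authoritative version of this script; what follows is only a minimal sketch of the flow described above, with the task order inferred from that description. The mymss list and the hif_checkproductsize values are placeholders you must edit to match your data and your delivered script:
# Minimal sketch; adapt from your delivered casa_pipescript.py.
mymss = ['uid___A002_Xxxxxxx_Xxxxx.ms']   # placeholder: list your calibrated MS(s)

context = h_init()
try:
    hifa_importdata(vis=mymss, dbservice=False)  # QA warnings expected; see Note 3
    hif_mstransform()                  # split off the calibrated science targets
    hifa_flagtargets()                 # apply *.flagtargetstemplate.txt flag commands
    hifa_imageprecheck()               # choose the robust value (skip to set your own; see Note 2)
    hif_checkproductsize(maxcubesize=40.0, maxcubelimit=60.0,
                         maxproductsize=500.0)   # placeholder values; copy from casa_pipescript.py
    hif_makeimlist(specmode='mfs')     # per-spw mfs images, no continuum subtraction
    hif_findcont()                     # identify line-free channels (updates cont.dat)
    hif_uvcontsub()                    # subtract the continuum using cont.dat ranges
    hif_makeimages()
    hif_makeimlist(specmode='cont')    # aggregate continuum image
    hif_makeimages()
    hif_makeimlist(specmode='cube')    # per-spw line cubes
    hif_makeimages()
finally:
    h_save()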
The relevant tasks for imaging pipeline reprocessing described in this CASA guide are hifa_importdata, hif_mstransform, hifa_flagtargets, hifa_imageprecheck, hif_checkproductsize, hif_findcont, hif_uvcontsub, hif_makeimlist, and hif_makeimages.
Note 1: One of the important features of the ALMA pipeline is that it checks the final imaging product size and makes any necessary adjustments to the channel binning, cell size, image size, and possibly the number of fields to be imaged. These are modified to avoid creating large images and cubes that consume significant computing resources and are not necessary for the user's science goals. The hif_checkproductsize task does this job, and we include it in all of the imaging example scripts below. We recommend that users copy the hif_checkproductsize call from the provided casa_pipescript.py without changing the parameters maxcubelimit, maxproductsize, and maxcubesize. However, users who do not want this size mitigation can comment the task out, or explicitly specify the nbins, hm_imsize, and hm_cell parameters in the hif_makeimlist task.
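If you do opt out of the mitigation, a hedged example of fixing these quantities by hand follows; the parameter names come from the note above, but the values (and the spw id in nbins) are purely illustrative:
# Comment out hif_checkproductsize, then specify binning and sizes explicitly:
hif_makeimlist(specmode='cube', nbins='17:2',   # 'spw:binfactor'; spw 17 is a placeholder
               hm_imsize=[256, 256], hm_cell=['0.5arcsec'])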
Note 2: hifa_imageprecheck calculates the synthesized beam and estimates the sensitivity for the aggregate bandwidth and the representative bandwidth, for three values of the robust parameter; the best robust value is then chosen heuristically for subsequent imaging. Therefore, if you want to use a robust value of your own choice, do not run hifa_imageprecheck.
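For example, assuming your pipeline version exposes the robust parameter in hif_makeimlist (check the Pipeline Reference Manual for your release), a sketch would be:
# Do NOT run hifa_imageprecheck; request your own weighting instead:
hif_makeimlist(specmode='cube', robust=0.5)   # robust=0.5 is an illustrative value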
Note 3: hifa_importdata will display QA notifications warning that the flux catalog service is not being used and that the measurement set may already be processed. This is expected: in operations the best flux catalog values must be used, but since you will not use the flux densities for anything here, and the amplitude scale of the data has already been calibrated and all calibrations applied, you can ignore these messages. You should, however, delete the flux.csv that gets created if you are going to run any pipeline calibration steps later, and instead use the flux.csv that was delivered with your dataset.
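For instance, after the imaging run you could remove the locally generated file (only necessary if you plan to rerun calibration steps later):
os.system('rm -f flux.csv')   # use the flux.csv delivered with your dataset instead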

Note 4: If you are not satisfied with the clean mask from the automasking algorithm, you can change the automasking parameters in hif_makeimages by setting pipelinemode='interactive'. For detailed instructions on using automasking, please consult the Automasking Guide.
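As a hedged illustration, the hm_* parameters follow the tclean auto-multithresh heuristics; the values shown are common defaults, not recommendations, and recent pipeline releases may accept these parameters directly without the pipelinemode argument:
hif_makeimages(hm_masking='auto', hm_sidelobethreshold=2.0,
               hm_noisethreshold=4.25, hm_lownoisethreshold=1.5,
               hm_minbeamfrac=0.3)   # illustrative values only; see the Automasking Guide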
For further reference, descriptions of the pipeline tasks for interferometric and single-dish data reduction can be found in the Pipeline Reference Manual.
Common Re-imaging Examples
Next, choose the example below that best fits your use case. Because the indentation of the Python commands must be preserved, the examples will work best if you copy the entire block of Python commands (orange-shaded regions) for a particular example into its own Python script, check that the indentation is preserved, edit the USER SET INPUTS section, and then execute the file.
Re-determine and Apply Pipeline Continuum Subtraction using Pipeline Tasks
The following script splits off the calibrated science target data for all spws and fields of each execution, applies any flagging commands found in the <uid_name>_flagtargetstemplate.txt file(s) (one for each execution), and uses the existing cont.dat file to fit and subtract the continuum emission, leaving the result in the CORRECTED column. Before running this script, you can manually modify both the <uid_name>_flagtargetstemplate.txt file(s) and the cont.dat file to add flag commands or change the cont.dat frequency ranges. Once you are happy with the script, run it in a CASA session (started with the --pipeline option) using execfile(script_name).
## Edit the USER SET INPUTS section below and then execute