[[Category:ALMA]]
== Cycle Compatibility and New Tool of Single-Dish Pipeline ==
From Cycle 11 (Pipeline 2024), hsd_imaging calls '''tsdimaging''' instead of the former sdimaging. The pipeline stage hsd_imaging grids and images total power and spectral data according to a specified gridding kernel.
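For reference, a direct tsdimaging call looks roughly like the following sketch (every parameter value here is a placeholder, not taken from this guide's data set); within the pipeline, hsd_imaging chooses these parameters for you:

<source lang="python">
# Hedged sketch of a direct tsdimaging call with placeholder values;
# hsd_imaging drives this task internally and sets the parameters itself.
tsdimaging(infiles=['<uid_name>.ms.atmcor.atmtype1_bl'],
           outfile='target_spw21.im',
           spw='21',
           mode='channel',
           gridfunction='SF',      # spheroidal gridding kernel
           imsize=[256, 256],
           cell='5arcsec')
</source>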


== About This Guide ==

Most recently updated for CASA Version 6.6.1 using Python 3.8.

This guide describes some examples for creating and perfecting the Total Power (TP) imaging products using the ALMA Cycle 11 Pipeline, for pipeline- and manually-calibrated data.
If your data were manually imaged by ALMA, you should instead consult the scriptForImaging.py delivered with your data.


The Section [[#Restore Pipeline Calibration and Prepare for Re-imaging (all Options)|Restore Pipeline Calibration and Prepare for Re-imaging]] describes the first steps. After that, the individual sections show how you can change the applied atmospheric model and the baseline subtraction in the pipeline tasks.


Additional documentation on the Cycle 11 pipeline can be found in the [https://almascience.org/processing/alma_pipeline_user_s_guide_for_release_2024-1.pdf Pipeline User's Guide], which is also available at [https://almascience.nrao.edu/processing/science-pipeline the ALMA Science Portal]. The User's Guide describes how to obtain the ALMA Pipeline, how to use it to calibrate and image ALMA interferometric (IF) and single-dish (SD) data, and the contents of the Pipeline WebLog.


Note that the scripts described in this guide have only been tested under Linux (RedHat 8) with Python 3.8. For data from before Cycle 7, whose scripts were written for CASA versions earlier than 5.6.x (Python 2.X), the pipeline scripts may not run properly with CASA 6.6.X (Python 3.X).


== Getting and Starting CASA ==


If you have not installed CASA on your machine, you will have to download and install it.
 
Download and installation instructions are available here:
http://casa.nrao.edu/casa_obtaining.shtml


In this guide, we process ALMA Cycle 8 data with CASA 6.6.1.17 and pipeline-2024.1.0.8.

To use pipeline tasks, you must start CASA with


<pre style="background-color: #fffacd;">
casa --pipeline
</pre>
== Restore Calibration and Prepare for Re-imaging ==


'''STEP 1:''' Follow the instructions in your QA2 report for restoring pipeline-calibrated data using '''<code>*scriptForPI.py</code>'''. In general, '''<code>scriptForPI.py</code>''' is only compatible with CASA versions similar to the one used for its creation. See the table at https://almascience.org/processing/science-pipeline for details.
To run '''<code>*scriptForPI.py</code>''', change to the "script" folder containing the script files and run the following command (modify <uid_name> accordingly):


<source lang="python">
execfile('member.<uid_name>.scriptForPI.py')
</source>


If you want to reproduce the delivered products without any changes, just run '''<code>scriptForPI.py</code>'''.
If the script folder contains '''<code>member.<uid_name>.casa_piperestorescript.py</code>''', '''<code>scriptForPI.py</code>''' stops after the calibration stage, without applying the atmospheric correction or the baseline subtraction.
Follow the procedures in the [https://almascience.nrao.edu/processing/science-pipeline ALMA Pipeline User's Guide] to obtain the final data sets with atmospheric correction and baseline subtraction.
If the script folder '''does not contain <code>member.<uid_name>.casa_piperestorescript.py</code>''' but '''contains <code>member.<uid_name>.casa_pipescript.py</code>''', '''<code>scriptForPI.py</code>''' uses the latter script.
In that case, all of the stages, including the atmospheric model correction and the baseline subtraction, are processed.
Running '''<code>scriptForPI.py</code>''' with '''<code>casa_pipescript.py</code>''' yields all of the intermediate files (<uid_name>.ms, <uid_name>.ms.atmcor.atmtypeX, and <uid_name>.ms.atmcor.atmtypeX_bl; see the [https://almascience.nrao.edu/processing/science-pipeline ALMA Pipeline User's Guide] for the meaning of each suffix).
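After the run, you can quickly check which intermediate measurement sets were produced; this is only a minimal sketch (run it from the "calibrated/working" directory, and note that the exact names depend on your data set):

<source lang="python">
# Minimal sketch: list the intermediate measurement sets produced by the
# full casa_pipescript.py run; names follow <uid_name>.ms.atmcor.atmtypeX[_bl].
import glob
for pattern in ['*.ms', '*.ms.atmcor.atmtype*']:
    print(pattern, '->', sorted(glob.glob(pattern)))
</source>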


Once completed, the following files and directories will be present. More information on the directory structure can be found in the [https://almascience.nrao.edu/documents-and-tools/cycle11/archive-primer ALMA Archival Data Primer].
* calibrated/
** This directory contains file(s) called <uid_name>.ms.split.cal (one for each execution in the MOUS) and the products/, working/, and rawdata/ subdirectories. The restored <uid_name>.ms measurement sets are placed in the working/ subdirectory. For further data processing, see the [https://almascience.nrao.edu/processing/science-pipeline ALMA Pipeline User's Guide].
** If you run the script without <code>casa_piperestorescript.py</code>, only the working/ subdirectory is created and all data are stored there.
* calibration/
** This directory contains auxproducts.tgz, auxcaltables.tgz, caltables.tgz, and flagversions.tgz. The text files (.txt) record the applied commands.
* product/
** This directory contains the original image products (fits format).
* qa/
** This directory contains the original weblog and the QA2 and QA0 reports. The QA reports contain summaries of the scheduling block (SB) and of the calibration and imaging results.
** If you want to obtain the same results as in the QA2 report, check the CASA version in the reports and run the '''<code>scriptForPI.py</code>''' script with the same version.
* raw/
** This directory contains the raw asdm.sdm(s).
* script/
** This directory contains the file scriptForPI.py (named '''<code>member.<uid_name>.scriptForPI.py</code>'''), which internally runs '''<code>member.<uid_name>.hsd_calimage.casa_piperestorescript.py</code>''' and other necessary tasks to restore the data.
** The folder also contains '''<code>member.<uid_name>.hsd_calimage.casa_pipescript.py</code>''', a full CASA pipeline script that reproduces all pipeline products.
 
 
Here, we run '''<code>scriptForPI.py</code>''' without '''<code>casa_piperestorescript.py</code>''' to obtain the final images and evaluate the pipeline procedures.
To do that, run the following command in the script folder before you run '''<code>member.<uid_name>.scriptForPI.py</code>'''.


<pre style="background-color: #fffacd;">
#In the script folder
mv *casa_piperestorescript.py ../
</pre>


You can find the results of each pipeline task in the WebLog.
The WebLog is created under the subdirectory "calibrated/working/pipeline-*/html/".
The following shows an example of the contents of '''<code>casa_pipescript.py</code>''' after running '''<code>scriptForPI.py</code>''':
 


<source lang="python">
context = h_init()
context.set_state('ProjectSummary', 'proposal_code', '2021.1.00172.L')
context.set_state('ProjectSummary', 'proposal_title', 'unknown')
context.set_state('ProjectSummary', 'piname', 'unknown')
context.set_state('ProjectStructure', 'ous_entity_id', 'uid://A001/X1525/X290')
context.set_state('ProjectStructure', 'ous_part_id', 'X476396803')
context.set_state('ProjectStructure', 'ous_title', 'Undefined')
context.set_state('ProjectStructure', 'ps_entity_id', 'uid://A001/X1525/X294')
context.set_state('ProjectStructure', 'ousstatus_entity_id', 'uid://A001/X15a0/X18e')
context.set_state('ProjectStructure', 'ppr_file', '/opt/pipelinedriver/2023JUN/mnt/dataproc/2021.1.00172.L_2023_09_06T09_21_41.745/SOUS_uid___A001_X1590_X30a8/GOUS_uid___A001_X1590_X30a9/MOUS_uid___A001_X15a0_X18e/working/PPR_uid___A001_X15a0_X18f.xml')
context.set_state('ProjectStructure', 'recipe_name', 'hsd_calimage')
try:
    # import the EBs and assign sessions
    hsd_importdata(vis=['uid___A002_Xfc10af_X33f1', 'uid___A002_Xfc10af_X40b7', 'uid___A002_X10bf6e3_X59f', 'uid___A002_X10bf6e3_Xe1a', 'uid___A002_X10c0a33_X5a9c', 'uid___A002_X10c2033_X5b9', 'uid___A002_X10c2033_X6c55'], session=['session_3', 'session_3', 'session_4', 'session_4', 'session_5', 'session_6', 'session_7'])
    # deterministic flagging
    hsd_flagdata(pipelinemode="automatic")
    # Tsys calibration and flagging
    h_tsyscal(pipelinemode="automatic")
    hsd_tsysflag(pipelinemode="automatic")
    # sky calibration and Kelvin-to-Jansky conversion
    hsd_skycal(pipelinemode="automatic")
    hsd_k2jycal(dbservice=False)
    hsd_applycal(pipelinemode="automatic")
    # atmospheric correction
    hsd_atmcor(pipelinemode="automatic")
    # baseline subtraction and baseline-based flagging (run twice)
    hsd_baseline(pipelinemode="automatic")
    hsd_blflag(pipelinemode="automatic")
    hsd_baseline(pipelinemode="automatic")
    hsd_blflag(pipelinemode="automatic")
    # imaging
    hsd_imaging(pipelinemode="automatic")
finally:
    h_save()
</source>


'''STEP 2:''' Copy the files in the script folder to the "calibrated/working" directory, then change to that directory, which contains the calibrated data (*.ms) suitable for running the pipeline imaging tasks after the restore.
<pre style="background-color: #fffacd;">
# In the script directory
cp member.uid* ../calibrated/working/

# Move to the "calibrated/working/" subdirectory
cd ../calibrated/working/
</pre>


After running '''<code>scriptForPI.py</code>''', project information is filled into '''<code>casa_pipescript.py</code>''' and related files.
These files are necessary for the following steps.
In the sections below, edit '''<code>casa_pipescript.py</code>''' in the "calibrated/working" subdirectory.


== Change applied atmospheric model (hsd_atmcor stage) ==


You can find which atmospheric model was applied from the measurement set name: <uid>.ms.atmcor.atmtypeX, where X (1-4) indicates the applied model: atmType=1 (tropical), 2 (mid-latitude summer), 3 (mid-latitude winter), and 4 (subarctic summer).
Details about the atmospheric models are summarized in [https://iopscience.iop.org/article/10.1088/1538-3873/abe0ab/pdf Sawada et al. (2021)].
The pipeline applies the optimal model for each EB. You can check the other models in the ATM Heuristics Plots in the WebLog (see the attached figure).


[[File:hsd_atmcor.png|thumb|<caption>hsd_atmcor stage results in the WebLog.</caption>]]


If you want to change the applied atmospheric model, you can select it by giving either a single integer (applied to all EBs) or a list of integers (one model per EB) to the atmtype parameter.
You can also constrain the correction to, for example, specific antennas or spectral windows. See the [https://almascience.nao.ac.jp/processing/reference-manual-2024.pdf Pipeline Reference Manual] for details.


You can edit the pipeline hsd_atmcor command in '''<code>casa_pipescript.py</code>''' as below:
<source lang="python">
hsd_atmcor(spw='25', atmtype=2)
</source>
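If your MOUS contains multiple EBs, you can instead give a list with one model per EB; a minimal sketch, assuming the seven EBs of the example '''<code>casa_pipescript.py</code>''' above (the chosen values are illustrative only):

<source lang="python">
# Hypothetical example: apply atmType=2 to the second and fifth EBs and
# atmType=1 to the rest (one entry per EB, in the hsd_importdata order).
hsd_atmcor(atmtype=[1, 2, 1, 1, 2, 1, 1])
</source>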


In most cases, the automatically selected models work well. Strong ozone lines, however, sometimes affect this stage.


== Change baseline subtraction (hsd_baseline stage) ==


At the hsd_baseline stage in the WebLog, you can see the applied baseline (red lines) overlaid on the spectra before and after baseline subtraction (blue lines).
The light-blue ranges indicate the ranges where the pipeline judges that lines are detected.
Unless told otherwise, the pipeline fits the baseline with a cubic spline function.
Broad lines are sometimes partly subtracted as baseline by the cubic spline fit.
The attached figure shows an example of this case (spw 21).


[[File:Spectral_plot_before_subtraction_uid_A002_Xfc10af_X33f1.ms.atmcor_Sgr_A_star_ant0_spw21_pol0.png|thumb|<caption>Example of a failure in baseline subtraction: the broad line is subtracted.</caption>]]


For spw 21, the best baseline is likely a 1st-order polynomial.
If you want to change the baseline fit in the pipeline task, edit '''<code>casa_pipescript.py</code>''' as follows:


<source lang="python">
hsd_baseline(linewindow={21: [500, 1900]}, fitfunc='poly', fitorder=1)
</source>
The stage hsd_baseline runs twice in casa_pipescript.py, so you need to edit both calls.
The linewindow parameter gives the channel range(s) where the line(s) are detected, corresponding to the light-blue range(s) in the WebLog.
The example above declares a line between channels 500 and 1900 of spw 21 and fits the baseline with a 1st-order polynomial.
Regarding how to set the parameters, see the [https://almascience.nao.ac.jp/processing/reference-manual-2024.pdf Pipeline Reference Manual].
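The same dictionary syntax extends to several spectral windows, and to several lines within one spectral window; a hedged sketch (the spw ids and channel ranges below are illustrative only):

<source lang="python">
# Hypothetical example: one line window in spw 21 and two in spw 23,
# each given as [start_channel, end_channel].
hsd_baseline(linewindow={21: [500, 1900], 23: [[100, 300], [1500, 1700]]},
             fitfunc='poly', fitorder=1)
</source>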


After changing the parameters in <code>casa_pipescript.py</code>, run '''<code>member.<uid_name>.casa_pipescript.py</code>''' in CASA.
 
<source lang="python">
# In the "calibrated/working" subdirectory
execfile('member.<uid_name>.casa_pipescript.py')
</source>


You will obtain the new calibrated measurement sets with baseline subtraction (suffix _bl).

The attached figure shows the result of the manual baseline fit using hsd_baseline(linewindow={21:[500,1900]}, fitfunc='poly', fitorder=1).


[[File:Maual_BL_guide2.png|thumb|<caption>Applied manual baseline fit. The 1st-order polynomial is applied.</caption>]]


'''NOTE: The current single-dish pipeline applies the same fit function to all spectral windows. Thus, if you set fitfunc='poly', the polynomial fit is applied to all of the spectral windows, even those for which the cubic spline function worked well in the original results. The original results are overwritten, so use this method with care.'''


If you want to keep the original results for all but a few spectral windows, copy the results to another directory first, and then run '''<code>member.<uid_name>.casa_pipescript.py</code>''' in the "calibrated/working/" directory.
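A minimal sketch of such a backup (the directory name original_results/ is arbitrary):

<source lang="python">
# Hypothetical backup: keep the current WebLog and baseline-subtracted
# products before re-running the pipescript with new fit parameters.
import os
os.system('mkdir -p ../original_results')
os.system('cp -r pipeline-* *_bl ../original_results/')
</source>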


The ability to set the fitting function individually per spectral window will be developed in future pipeline versions.


Currently, the CASA task [https://casadocs.readthedocs.io/en/stable/api/tt/casatasks.single.sdbaseline.html sdbaseline] is recommended for this purpose, as mentioned in the [https://almascience.nrao.edu/processing/science-pipeline ALMA Pipeline User's Guide].
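As a hedged illustration (the file name, spw id, channel count, and fit ranges below are placeholders for your own data), a per-spw 1st-order polynomial fit with sdbaseline might look like:

<source lang="python">
# Minimal sketch: subtract a 1st-order polynomial baseline from spw 21 only,
# fitting the channels outside the line window (assumes 2048 channels per spw).
sdbaseline(infile='<uid_name>.ms.atmcor.atmtype1',
           datacolumn='data',
           spw='21:0~499;1900~2047',   # channel ranges used for the fit
           blfunc='poly',
           order=1,
           outfile='<uid_name>.ms.atmcor.atmtype1_spw21_bl',
           overwrite=True)
</source>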
