This page describes the hands-on imaging portion of the ALMA data reduction workshop. In this session you will use ALMA Science Verification data to explore CASA's imaging capabilities. This page describes the data and introduces the scripts. It is not a full-fledged CASA guide, only background for the hands on portion of the workshop. For the self-calibration session in the afternoon go here: WorkshopSelfcal
Currently five ALMA Science Verification data sets have been released. The calibrated u-v data for all five projects should appear in your cluster account in the directory "imaging/calib/". Briefly, the projects are:
- NGC3256_Band3 Single-field observations of the nearby starburst galaxy NGC 3256 in Band 3. The data were taken in "TDM" mode and so have low frequency resolution compared to the other data sets here (about 15 MHz per channel). The spectral windows cover the CO J=1-0 line, a CN doublet, and the 3mm continuum. All three types of emission are visible and extended in the data. There is a full CASA Guide describing the reduction and imaging of these data here: NGC3256Band3.
- TWHYA_BAND7 Single-field observations of the debris disk around the nearby pre main sequence star TW Hya in Band 7. These data have high frequency resolution (they are "FDM" mode) and cover the 850 micron continuum, the CO J=3-2 line, and HCO+ J=4-3 line. There is a full CASA Guide describing the reduction and imaging of these data here: TWHydraBand7.
- TWHYA_BAND6 Single-field observations of TW Hya in Band 6. These data have high frequency resolution and cover 1mm continuum, the CO J=2-1 line, and the DCN J=3-2 line. These data do not have an associated CASA guide because the approach closely resembles that used for TW Hya Band 7. They were released with an annotated script, which is included in the same directory as the calibrated data (can be downloaded from the ALMA Science Portal: https://almascience.nrao.edu/alma-data/science-verification).
- TWHYA_BAND3 Single-field observations of TW Hya in Band 3. These data have high frequency resolution and cover 3mm continuum and the HCO+ J=1-0 line. These data do not have an associated CASA guide because the approach closely resembles that used for TW Hya Band 7. They were released with an annotated script, which is included in the same directory as the calibrated data (can be downloaded from the ALMA Science Portal: https://almascience.nrao.edu/alma-data/science-verification).
- Antennae_Band7 Two mosaics targeting the nearby merging galaxies NGC 4038 and NGC 4039 in Band 7. These data have high frequency resolution (though they have been somewhat averaged) and cover the 850 micron continuum and the CO J=3-2 line. These data contain two separate mosaics (collections of pointings), one targeting the northern component of the Antennae galaxies and the other targeting the southern component. The mosaics do not overlap but are close enough that they can be combined into a single image if desired. There is a full CASA Guide describing the reduction and imaging of these data here: AntennaeBand7.
In each case, you should see a single measurement set (".ms" file) inside the directory associated with the data. For these data sets, the bandpass, phase, and amplitude has already been calibrated following the procedures described in the first day of the workshop. The science targets have been separated out from the rest of the data (using the split task), yielding the calibrated measurement sets in your directories. We will use these to explore imaging in CASA.
Imaging in CASA revolves around the clean task. clean has a wide array of modes and options, many of which have been discussed in the morning lecture. For the imaging hands-on session, we provide scripts for imaging the science verification data. These scripts all center on calls to clean. Given the relatively small size of the data sets, and the power of the cluster, you should have no problem running them repeatedly while changing key parameters to explore their effects on the output image. The scripts cover:
- Basic continuum imaging In the first script, you will take any of the data sets and produce a "continuum" image by averaging together data at all frequencies into a single image. This will introduce you to the basic setup of the clean task and using the viewer for interactive cleaning and data inspection. You have the opportunity to explore the effects of different stopping conditions, u-v weighting, and gridding parameters.
- Line imaging Imaging spectral lines introduces two new complexities. The spectral axis needs to be specified via the "velocity", "frequency", or "channel" modes of clean and line and continuum emission need to be separated from one another. We will see how to separate line and continuum emission in the u-v domain using uvcontsub and then build data cubes using clean.
- Mosaic imaging clean can build mosaics out of data sets with multiple pointing centers. We will experiment with this mode on the Antennae data sets.
- Multifrequency synthesis with spectral indices For high signal-to-noise data sets with large fractional bandwidth it can be important to account for the frequency dependence ("spectral index") of continuum emission when constructing an image. CASA's multifrequency synthesis ("mfs") mode allows modeling of sources with frequency-dependent intensity. Using this script you can explore the application of this algorithm to the TW Hydra data sets.
We suggest working through in order of increasing complexity. Each call should take no longer than a minute to run, so there should be plenty of time to experiment with each type of imaging during the two hour session.
Setup: You should already have all of the relevant data in your account in the directory "~/imaging/calib/". The scripts that we provide will copy this data (usually with some averaging) to whatever working directory you set them up in. They assume that you have made some working directory along the lines of "~/imaging/working/basic_imaging/" so that "../../calib/" points to the calibrated Science Verification u-v data. Set up a directory structure that you feel comfortable with (we suggest making "working/basic_imaging", "working/line_imaging", "working/mfs_imaging", and "working/mosaic_imaging") and then download the relevant scripts from this page in to the appropriate directories.
Once you are set up, we recommend that you open each script in turn in your favorite editor and start up a "casapy" session in another window. Most of the scripts have a few preparatory calls, often averaging or copying the data to make a working copy that can be quickly processed. Step through these, reading the comments and making sure that you understand what is going on, then focus on the clean call.
You may also want to read through all of the commentary in the relevant section of this guide as you go. We note a few subtleties and helpful tips here that will be useful to keep in mind as you run the script.
Syntax: In the scripts, tasks other than clean are called using inline parameters, e.g., "split(vis='suchandsuch.ms')", and can be directly pasted in to the command line. This approach is generally favored for scripting because of the ease of copying and ready reproducibility (the task only knows about the parameters that you can see in the call). For the clean calls in these examples we will use the more instructive but potentially messier "AIPS-style" approach. We always initialize clean with its default parameters using:
# In CASA default('clean')
and then we define the parameters of interest to us as global python variables with the appropriate names. At any time, you can check the current inputs to clean by typing
# In CASA inp
CASA will indicate invalid selections with red color-coding.
Once you are happy with the parameters that you have set you can fire off clean by typing
# In CASA go
# In CASA clean
If you get stuck: If you get stuck the first thing that you want to do is type "help taskname" where taskname is the task that you are working on. If you are looking for a task, try tasklist. It's good to get familiar with using the online documentation, so do try this first. Of course during the hands on session, we are here to help so don't hesitate to flag one of us down to ask for help or more information.
Script 1: Basic Continuum Imaging
This script provides your basic introduction to imaging. You need to run it in a directory so that "../../calib/" points to the provided science verification data, or alternatively set the "data" parameter to point to your data file, appropriately.
You can use this script to image the continuum from any and all of the single-field science verification data (just move around the comments and tweak the input parameters). A few comments:
- "help" and "inp" are your friends. Use these to explore any tasks that you are unfamiliar with.
- listobs is always a good command to start with, to get your bearings. Remember that you can direct the output to a file via the "listfile" parameter.
- "#" is the python comment symbol. Lines beginning with this will not be processed by the casapy shell. The script includes some suggestions for alternative settings commented out and you can edit it (it's your copy) to change between these.
- split is used here to copy and average the data in frequency. If you compare the output of listobs among the various data you will see that the NGC 3256 data have much coarser channels to begin with, so we suggest to average these by only a factor of 4. To make the TW Hya data run quickly, averaging by a factor of ~ 100 is more appropriate.
- We use mode "mfs" with "nterms=1". This tells clean to account for the effects of frequency on the position of u-v data, but not to incorporate a spectral index term into the deconvolution.
- We set the script to use clean in interactive mode, meaning that the user will determine the mask and decide when to stop cleaning. This is done via the CASA viewer. See a demo of the CASA viewer here: http://casa.nrao.edu/CasaViewerDemo/casaViewerDemo.html (as described this morning). Key things to note for this application:
- Define a Clean Mask: You will want to define a clean mask using the square or polygon tools. These tools are along the top row of buttons in the viewer. Click the tool icon with the mouse button you wish to assign, then click out a roughly minimal mask that encompasses the emission from the source, then double click inside. The mask should turn white. This mask is saved during cleaning with a ".mask" extension. Do this as a first step before any cleaning. Then later if you see a feature that you believe is real emerge after cleaning you can add a new mask component around it.
- Executing Clean: You will want to use the "green circle" button to tell clean to proceed only with the specified number of iterations and then show you the residuals again to allow you to make a judgment call on whether to clean deeper. If you instead click the "blue arrow" clean will proceed non-interactively until it meets the stopping criteria specified by the number of iterations or the threshold maximum residual. The buttons and windows off to the left let you set the amount of cleaning done before clean rebuilds the image and returns to the viewer.
- Finishing: The viewer is showing you the "residual image", the emission left after removing the current model of the source from the data. Once you are satisfied that the residuals left are indistinguishable from the surrounding noise (or at least that further cleaning won't help) you can stop cleaning by hitting the big red X button.
- You can reuse a mask by specifying the file name to clean in the "mask" parameter (though note that in the script there is an "os.system('rm -rf '+image_out+'.*')" call that removes output files, including any mask files stored in this directory. You can avoid removing the mask files, for example by moving them to a different directory.).
- Note that if the specified output file already exists on disk, by default clean will continue operating on that file rather than creating a new output file. This is why, in the script, any existing output files are removed prior to cleaning. So if you want to restart imaging and deconvolution from the beginning, you need to manually wipe the previous output files from disk. As noted, an unfortunate side effect of removing the output files is that you also remove the mask. You can change this by either specifying specific files you want to delete, or by renaming the mask files prior to removing the other output files.
- Best practices are to select the "cellsize" parameter such that there are ~ 5 pixels across the synthesized beam. You may need to make a test image first or iterate your imaging to ensure that this condition holds.
- The weighting of u-v data is a free parameter that can have a substantial impact on the output image. Try alternating between "natural" and "robust" weighting and manipulating the robust parameter.
Script 2: Line Imaging
Imaging a spectral line data set adds a couple of extra complications to the process. First, the third dimension needs to be specified. This can be done by setting "mode" to velocity, frequency, or channels. You can specify which channels to image by setting a starting value, step, and number of planes to image. CASA will attempt to average appropriately to meet your request, though obviously it cannot do better than the underlying data allow. In addition, CASA does not know a priori what line you are imaging. You need to specify the rest frequency of the line. A few notes along these lines:
- The provided script incorporates a priori knowledge of the mapping between lines and spectral windows. You would know this for your data from having written the observing script or by reading the proposal (or by combining listobs, a resource like Splatalogue, and your knowledge of astrophysics).
- We also provide rest frequencies for the relevant lines within the script. Again, at the moment you need to know these a priori (though of course you can simply image the whole cube, particularly for the NGC 3256 dataset, and work backwards from the frequencies at which you see lines to guess the molecular transition).
- As you run the interactive clean for three dimensional data sets, you will notice that the clean mask is also now three dimensional. The path of least resistance here is to specify a two dimensional clean mask that applies to all planes by flipping the "This Channel" button in the highlighted green section of the viewer to "All Channels" before you create the clean box(es). Note that if you do not do this, your clean mask will only apply to the image plane you are currently viewing. Alternatively, you can carefully construct a channel-by-channel mask. If you do so, be sure to make a copy because you probably don't want to do it twice!
Second, as we have already seen, each of the science verification data sets contains a mixture of continuum and line emission. In most cases we would like to separate out the line emission for analysis. This can be done using standard image processing techniques after forming the cube (e.g., see imcontsub), but the more robust and standard way to approach this is to carry out continuum subtraction in the original u-v data set. This is done using the uvcontsub task, which will produce a new, continuum-subtracted copy of the measurement set with the extension ".contsub". To apply this successfully, we need to tell uvcontsub what region of the spectrum we think is free of line emission, and what order polynomial we would like it to fit. The script includes a call to plotms that heavily averages to produce a single spectrum useful for identifying line-free channels.
- The CASA guides mix calls to uvcontsub and uvcontsub2. This is a historical artifact. Since CASA version 3.3, uvcontsub is the version of the task to use. We are using CASA version 4.0.1.
- Notice that uvcontsub is spinning out a new measurement set to hold the continuum-subtracted version of the data. You want to image this ".contsub" version of the data. Note that you don't want to image the continuum produced in this way. It is far better to just image the line-free channels of the data.
Other considerations for line imaging:
- In practice please notice that we are hand feeding you a lot of hard numbers here. The practical case of line imaging is likely to have a fair amount of trial and error as you find the right parameters to build your cube.
Script 3: Mosaic Imaging
The next dimension that we will add will be mosaicking. We turn on CASA's mosaicking capability by switching the clean parameter "imagermode" to "mosaic". Though not unique to this imaging mode, several additional considerations arise for this case:
- You can manually specify the center about which CASA will image using the "phasecenter" parameter. This can be set to either a field name (useful for mosaics and used here) or a specific RA, Dec coordinate pair.
- You can tell clean to mask the output image so that only parts of the sky covered by the primary beam are imaged. This is done via the "minpb" parameter, which specifies the minimum primary beam response necessary to be included in the output image (e.g., "minpb=0.5" would image only inside the FWHM of the primary beam; we use the default 0.2).
- You can also request that the final image be corrected for the primary beam via the "pbcor" parameter. Note that this is not done by default.
Note that mosaicking can be combined with spectral line imaging (see the case of the AntennaeBand7 CASA guide) and, eventually, with higher nterms values for "mfs" (though this latter capability is still under development). Both applications can be very time consuming, making them less than ideal for a hands-on tutorial. The syntax and decisions extrapolate directly from the examples that we have stepped through.
Script 4: Multifrequency Synthesis with Spectral Indices
For both ALMA and the EVLA, it can easily be the case that the intensity of the source varies substantially across the observed bandpass (spectral indices can reach 4 for ALMA and bandpasses for both telescopes can be very wide). The multifrequency synthesis mode used for continuum images can handle such emission by adding a (resolved) spectral index to the model. This mode is activated by specifying a value higher than 1 for the "nterms" expandable parameter for mode "mfs" (e.g., here we will use "nterms=2").
This script demonstrates the application of this capability and its output. Please note the important caveat that the internal amplitude calibration of the TW Hya data are not ideal, though the calibration between bands should be fine. Thus an internal application, e.g., to Band 7 alone should be viewed as instructive rather than yielding science quality results. You can use plotms to see that the internal calibration may be problematic (e.g., for the Tw Hydra Band 7 data, the individual spectral windows jump up and down slightly in amplitude).
A few considerations to note:
- clean accepts a list of input files. Here you can chose to combine 1, 2, or all 3 of the TW Hydra data sets. Clean will image them together. Note that this is not a uniquely multifrequency application, it can also be used to combine, e.g., data from different days or observing runs.
- The output of mfs with nterms >= 2 is a bit different than for nterms = 1. You now have both an intensity image (labeled '.tt0') and a spectral index file (labeled '.alpha'). The intensity image occurs at the "reference" frequency, which can be specified in clean via the "reffreq" parameter and does not have to be a frequency that you observed (e.g., it may lie at an intermediate frequency between bandpasses). The spectral index image describes the local alpha where intensity \propto frequency^alpha (so that overall, I(nu) = I(nu_ref) * (nu / nu_ref)^(alpha) where nu_ref is the reference frequency). You can view both the intensity at the reference frequency and the spectral index with the viewer.
- You can specify nterms higher than 2 to account for spectral curvature beyond a mere power law. This isn't likely to be the limiting factor in the immediate future.
- Notice that MFS is willing to combine data from different bands. This is potentially dangerous if the u-v coverage among bands is very different and the spectral index as a function of position has complex behavior, but with matched u-v coverage between bands this can give you a very powerful lever arm on the resolved (or unresolved) spectral index of your source. Experiment with combining two or all three bands and have a look at the results.
- You may need to be a bit careful with stopping thresholds here. These operate on the residual at your reference frequency and the spectral index can be pretty steep. A perfectly reasonable stopping threshold at Band 7 may not work when you combine all three bands.