CASA Guides:VLBA Basic Phase-referencing Calibration and Imaging
This CASA Guide is for Version 6.4.0 of CASA. If you are using a later version of CASA and this is the most recent available guide, then you should be able to use most, if not all, of this tutorial.
Overview
This CASA Guide describes the procedure for calibrating a phase-referenced VLBA observation of the radio galaxy J1203+6031 (IVS 1200+608, ICRF3 J120303.5+603119). The data were taken specifically for this tutorial. The observation made use of the DDC observing personality, using dual polarization with 4 spectral windows per polarization. Each spectral window has a bandwidth of 64 MHz and is divided into 128 spectral channels.
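As a quick arithmetic check on the frequency setup described above, the following sketch (plain Python, runnable outside CASA; the variable names are illustrative only) derives the channel width and the total bandwidth per polarization.

```python
# Frequency setup quoted in the overview above.
n_spw = 4                 # spectral windows per polarization
spw_bandwidth_mhz = 64.0  # bandwidth of each spectral window
n_chan = 128              # spectral channels per spectral window

channel_width_khz = spw_bandwidth_mhz / n_chan * 1000.0
total_bandwidth_mhz = n_spw * spw_bandwidth_mhz

print(f"Channel width: {channel_width_khz:.0f} kHz")
print(f"Total bandwidth per polarization: {total_bandwidth_mhz:.0f} MHz")
```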
This tutorial will focus on calibrating the data and creating continuum (Stokes I) images.
How to Use This CASA Guide
There are a number of possible ways to run CASA, described in more detail in Getting Started in CASA. In brief, there are at least three different ways to use CASA:
- Interactively examining task inputs. In this mode, one types taskname to load the task, inp to examine the inputs, and go once those inputs have been set to your satisfaction. Allowed inputs are shown in blue and bad inputs are colored red. The input parameters themselves are changed one by one, e.g., selectdata=True. Screenshots of the inputs to various tasks used in the data reduction below are provided, to illustrate which parameters need to be set. More detailed help can be obtained on any task by typing help taskname. Once a task is run, the set of inputs is stored and can be retrieved via tget taskname; subsequent runs will overwrite the previous tget file.
- Pseudo-interactively via task function calls. In this case, all of the desired inputs to a task are provided at once on the CASA command line. This tutorial is made up of such calls, which were developed by looking at the inputs for each task and deciding what needed to be changed from default values. For task function calls, only parameters that you want to be different from their defaults need to be set.
- Non-interactively via a script. A series of task function calls can be combined together into a script, and run from within CASA via execfile('scriptname.py'). This and other CASA Tutorial Guides have been designed to be extracted into a script via the script extractor by using the method described at the Extracting_scripts_from_these_tutorials page. Should you use the script generated by the script extractor for this CASA Guide, be aware that it will require some small amount of interaction related to the plotting, occasionally suggesting that you close the graphics window and hit return in the terminal to proceed. It is in fact unnecessary to close the graphics windows (it is suggested that you do so purely to keep your desktop uncluttered).
If you are a relative novice or just new to CASA, it is strongly recommended to work through this tutorial by cutting and pasting the task function calls provided below after you have read all the associated explanations. Work at your own pace, look at the inputs to the tasks to see what other options exist, and read the help files. Later, when you are more comfortable, you might try to extract the script, modify it for your purposes, and begin to reduce other data.
Obtaining the Data
This Guide is intended to cover the entire process one would follow for calibrating their own VLBA observation. Therefore, we will start with a FITS-IDI file rather than a Measurement Set. The FITS-IDI file for this Guide is: PROVIDE LINK TO IDIFITS FILE HERE (file size 3.8 GB).
If users prefer, they may download the data from the NRAO Archive. In the archive search inputs, enter the project code "TL016" and look for the file VLBA_TL016B_tl016b_BIN0_SRC0_1_220307T210851.idifits. Once you have downloaded the FITS-IDI file, it may be useful to change the filename to "TL016B.idifits".
The Observation
Before diving into the calibration, it is always a good idea to look over the observing log to check for notes from the operators that can inform us about potential issues with the data (missing stations, bad weather, etc.). These logs are always emailed to the PIs of an observation, but you can also access them later from the NRAO's vlbiobs fileserver. To locate an observing log, first find the directory for the observing month and the last two digits of the year (for the observation used in this Guide, that directory is feb22 for February 2022). Once inside the proper month+year directory, look for the project code (in this case, tl016b). Look for a file named <project code>log.vlba (tl016blog.vlba).
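The naming pattern described above can be sketched in a few lines of plain Python. The function name observing_log_location is hypothetical (it is not part of CASA or any NRAO tool), and the sketch only builds the directory and file names; it does not contact the fileserver.

```python
from datetime import date

# Lowercase three-letter month abbreviations, as in the 'feb22' example above.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")

def observing_log_location(project_code, obs_date):
    """Return (directory, filename) following the pattern described in the text."""
    directory = f"{MONTHS[obs_date.month - 1]}{obs_date.year % 100:02d}"
    filename = f"{project_code.lower()}log.vlba"
    return directory, filename

# Project code and observing date for the observation used in this Guide.
print(observing_log_location("TL016B", date(2022, 2, 21)))
```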
The observing log for this particular observation looked like this:
VERY LONG BASELINE ARRAY OBSERVING LOG
--------------------------------------
Project: TL016B
Observer: Linford, J.
Project type: VLBA
Obs filename: t;016b.vex
Date/Day: 2022FEB21/052
Ants Scheduled: SC HN NL FD LA PT KP OV BR MK

=UT-Time===Comment===============================================MF#===%AD==AMD=
Operator is Betty Ragan
0559 Begin
0559 %SC raining
0559 %KP windy
1327 End.

Downtime Summary:
Total downtime : 0 min
Percentage downtime of observing: 0.0%
Average downtime per hour : 0.0 min
Total scheduled observing time (# Antennas): 4480 min (10)

Notes:
* = Entries where data was affected.
% = Entries where data may have been affected.
& = Entries where the site tech was called out.
WEA = Weather entries.
MF# = Maintenance form or major downtime category associated with a problem.
%AD = The percentage of an antenna affected by a problem.
AMD = Total antenna-minutes downtime for a problem.
Tsys = System Temperature (TP/SP x Tcal/2)
ACU = Antenna Control Unit
FRM = Focus/Rotation Mount
RFI = Radio Frequency Interference
VME = Site control computer
CB = Circuit Breaker
vclock = Program that compares site clock time to a standard.
There are no major issues reported by the operators for this observation and all ten antennas participated for the entire time (no downtime). However, there was rain at Saint Croix (SC) and it was windy at Kitt Peak (KP). We'll need to keep that in mind as we proceed with the calibration. We should pay special attention to SC and KP as we inspect the data and the calibration solutions we generate.
NOTE: This is an exceptionally good observing log! Most observations will have at least a small amount of downtime for various reasons.
Creating the Measurement Set
Before beginning our data reduction, we must start CASA. If you have not used CASA before, some helpful tips are available on the Getting Started in CASA page. Remember to start CASA in the directory containing the data.
Once you have CASA up and running, it is time to get the data into a format that CASA can use. Unlike VLA data, a VLBA observation is only available as a FITS-IDI file and cannot be downloaded as a CASA Measurement Set. So, the first step in calibrating a VLBA observation with CASA is to create a Measurement Set from the FITS-IDI file. To do this, we will use the task importfitsidi:
# In CASA
importfitsidi(fitsidifile='TL016B.idifits', vis='tl016b.ms', constobsid=True, scanreindexgap_s=15)
The scanreindexgap_s parameter is used to reconstruct scan boundaries in those cases where sources do not change between scans. In general, it is good to set scanreindexgap_s to some non-zero number to help CASA properly organize the scan list. The recommended value is 15, but shorter values may work as well (although you probably don't want to go much shorter than about 5 seconds). If you find that the resulting MS contains too few scans, run importfitsidi again with scanreindexgap_s set to a smaller number. If your MS has too many scans, especially multiple scans on the same source when you think it should just be one scan, run importfitsidi again with scanreindexgap_s set to a larger number.
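To build some intuition for how the gap threshold changes the number of scans, here is a minimal sketch in plain Python. It is only an illustration of the idea described above (start a new scan whenever the pause between consecutive time stamps exceeds the threshold); it is not the code that importfitsidi runs, and the function name and time stamps are made up for the example.

```python
def split_into_scans(times_s, gap_s=15.0):
    """Group sorted time stamps (in seconds) into scans.

    A new scan starts whenever the gap to the previous time stamp
    is larger than gap_s.
    """
    scans = []
    for t in times_s:
        if scans and t - scans[-1][-1] <= gap_s:
            scans[-1].append(t)
        else:
            scans.append([t])
    return scans

# Hypothetical time stamps: 2 s integrations with a 10 s pause and a 30 s pause.
times = [0, 2, 4, 14, 16, 18, 48, 50]

n_scans_15 = len(split_into_scans(times, gap_s=15.0))
n_scans_5 = len(split_into_scans(times, gap_s=5.0))
print(n_scans_15, n_scans_5)
```

With the 15 second threshold the 10 second pause is bridged, so fewer scans are found; lowering the threshold to 5 seconds splits the data at both pauses. This matches the advice above: too few scans means the gap should be smaller, too many means it should be larger.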
Inspecting the Data
Now that we have a Measurement Set, it is time to look over the data, identify a good reference antenna, and find a good time range to use for calibrating the single band delay.
The Observation Summary
It will be useful later to have the basic information about the observation. The task listobs will return a list of all the scans, the sources observed, which stations were used, and the frequency setup. It is possible to run listobs in two ways: printing information in the CASA logger, or saving the information to a file.
To simply display the information in the CASA logger:
#In CASA
listobs(vis='tl016b.ms')
You should see the listobs output in the CASA logger window:
PASTE THE LISTOBS OUTPUT HERE
NOTE: You can also assign the listobs output to a python dictionary (e.g., "obs_dict") by typing "obs_dict = listobs(vis='tl016b.ms')".
It is usually useful to have a copy of the listobs output in a file that you can refer to later. To save the listobs output to a file named "tl016b_listobs.txt":
#In CASA
listobs(vis='tl016b.ms', listfile='tl016b_listobs.txt')
Identifying a Good Reference Antenna
#In CASA
plotms(vis='tl016b.ms', xaxis='frequency', yaxis='phase', field='4C39.25', scan='!!SCAN!!', correlation='ll', iteraxis='baseline', coloraxis='spw')
Identifying a Good Time Range for the Single Band Delay
#In CASA
plotms(vis='tl016b.ms', xaxis='frequency', yaxis='amp', field='4C39.25', antenna='!!REFANT!!', scan='!!SCAN!!', correlation='rr,ll', iteraxis='baseline', coloraxis='corr')
Flagging Data
Quack
VLBA observations often include a little bit of bad data at the beginnings of the scans, and sometimes at the ends of scans. An easy way to deal with this is to "quack" the data. Quacking is a completely optional step, and you should only do it if you see evidence for bad data at the beginnings or ends of scans.
In CASA, you can quack your data using the flagdata task and setting mode='quack'. The amount of data to be flagged is controlled by the quackinterval parameter, which sets the time interval in seconds.
To flag the first 4 seconds of each scan:
#In CASA
flagdata(vis='tl016b.ms', mode='quack', quackinterval=4.0, quackmode='beg', quackincrement=True)
To flag the last 4 seconds of each scan:
#In CASA
flagdata(vis='tl016b.ms', mode='quack', quackinterval=4.0, quackmode='endb', quackincrement=True)
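The following plain-Python sketch illustrates which samples of a single scan the two quack modes used above are meant to remove. It is a conceptual illustration only, not flagdata's implementation: the function name is hypothetical, and the handling of samples that fall exactly on the interval boundary is an assumption.

```python
def quack_flags(times_s, interval_s, mode="beg"):
    """Return a list of booleans (True = flagged) for one scan's time stamps."""
    start, end = min(times_s), max(times_s)
    if mode == "beg":
        # Flag samples within interval_s of the start of the scan.
        return [t < start + interval_s for t in times_s]
    if mode == "endb":
        # Flag samples within interval_s of the end of the scan.
        return [t > end - interval_s for t in times_s]
    raise ValueError(f"unsupported mode: {mode}")

# Hypothetical scan with 2 s integrations.
scan = [0, 2, 4, 6, 8, 10, 12]
print(quack_flags(scan, 4.0, "beg"))
print(quack_flags(scan, 4.0, "endb"))
```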
For those who are interested: There is some uncertainty about the origins of the term "quack". However, discussions with people who were working at NRAO in the late 1970s indicate it has nothing to do with waterfowl. Instead, "quack" refers to an unscrupulous/incompetent physician who treats the symptoms of a disease without treating the disease itself. The original QUACK routine was written for the VLA DEC-10 computers and flagged the beginnings of scans because they often contained bad data, but nobody could figure out what was causing the bad data.
Automated Flagging
Flagging "By Hand"
Calibrating the Data
Amplitude Corrections from Autocorrelations
Determine the amplitude corrections from the autocorrelations with accor.
#In CASA
accor(vis='tl016b.ms', caltable='tl016b.accor', solint='30s')
NOTE: This step is not required for EVN data, because the EVN correlator performs it during correlation.
Inspect the tl016b.accor solution table with plotms.
#In CASA
plotms(vis='tl016b.accor', xaxis='time', yaxis='amp', iteraxis='antenna')
Look for any large outliers.
It should be noted that the AIPS VLBA utility script VLBACCOR smooths the autocorrelation corrections by default (with a smoothing time of 30 minutes). It is possible to do this smoothing in CASA 6.3 and later using the smoothcal task.
#In CASA
smoothcal(vis='tl016b.ms', tablein='tl016b.accor', caltable='tl016b_smooth.accor', smoothtype='median', smoothtime=1800.0)
Remember to check the smoothed solutions with plotms to make sure the smoothing was an improvement.
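To see why median smoothing over a 30 minute window suppresses isolated outliers, consider the sketch below. It is a plain-Python illustration of a running median and makes its own assumptions (a window centered on each solution, spanning smoothtime/2 on either side); smoothcal's exact windowing may differ. The function name and the gain values are invented for the example.

```python
from statistics import median

def median_smooth(times_s, values, smoothtime_s=1800.0):
    """Replace each value by the median of all values within +/- smoothtime_s/2."""
    half = smoothtime_s / 2.0
    smoothed = []
    for t in times_s:
        window = [v for tt, v in zip(times_s, values) if abs(tt - t) <= half]
        smoothed.append(median(window))
    return smoothed

# Hypothetical solutions every 10 minutes with one outlier at t = 1200 s.
times = [0, 600, 1200, 1800, 2400]
gains = [1.00, 1.02, 1.50, 0.98, 1.01]

smoothed = median_smooth(times, gains)
print(smoothed)
```

The outlier at 1200 s is replaced by the median of its neighbours, while the other solutions change only slightly.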
A Priori Calibration
Unlike the VLA, the VLBA cannot rely on bootstrapping absolute flux density calibration from a well-modeled calibrator. Instead, the VLBA relies on a combination of the system temperature and the known gain curve of the antennas (how the gain of the antenna changes with elevation). Both the system temperatures and gain curves are included in the FITS-IDI file. To get this information into a form that CASA can use for calibration, we use the gencal task to generate calibration tables.
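As background for how system temperature and gain curve combine, the sketch below uses the standard relation SEFD = Tsys / (DPFU x g), where DPFU is the antenna sensitivity in K/Jy and g is the normalized gain curve, and scales a baseline by sqrt(SEFD_1 x SEFD_2). This is a simplified illustration, not what gencal or applycal do internally: the function names, the choice of a polynomial in elevation, and all numerical values are assumptions made for the example.

```python
import math

def sefd_jy(tsys_k, dpfu_k_per_jy, gain_poly, elevation_deg):
    """System equivalent flux density in Jy.

    gain_poly holds hypothetical polynomial coefficients (lowest order first)
    for the normalized gain as a function of elevation in degrees.
    """
    g = sum(c * elevation_deg**i for i, c in enumerate(gain_poly))
    return tsys_k / (dpfu_k_per_jy * g)

def baseline_scale_jy(sefd_1, sefd_2):
    """Factor converting a correlation coefficient to Jy on one baseline."""
    return math.sqrt(sefd_1 * sefd_2)

# Hypothetical antenna: Tsys = 30 K, DPFU = 0.1 K/Jy, flat gain curve.
sefd = sefd_jy(30.0, 0.1, [1.0], elevation_deg=45.0)
print(f"SEFD: {sefd:.0f} Jy")
print(f"Baseline scale: {baseline_scale_jy(sefd, sefd):.0f} Jy")
```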
System temperature:
#In CASA
gencal(vis='tl016b.ms', caltable='tl016b.tsys', caltype='tsys', uniform=False)
Check the system temperature table with plotms.
#In CASA
plotms(vis='tl016b.tsys', xaxis='time', yaxis='amp', iteraxis='antenna')
Make sure to also plot the solutions with xaxis='freq'.
Gain curve:
#In CASA
gencal(vis='tl016b.ms', caltable='tl016b.gcal', caltype='gc')
For this observation at 5 GHz, the gain curve is not very interesting, so we will not bother looking at the solutions now (although you can if you really want to). If your observation is at 12 GHz or higher, you will probably want to inspect the gain curve table with plotms.
Instrumental Delay Calibration
Solve for the instrumental delays by using fringefit on a bright source (the "fringe finder"). In our case, the fringe finder is 4C39.25. Set the timerange to the time span you identified while inspecting the data earlier.
#In CASA
fringefit(vis='tl016b.ms', caltable='tl016b.sbd', field='4C39.25', timerange='!!TIME RANGE!!', solint='inf', zerorates=True, refant='!!REFANT!!', minsnr=10, gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys'], interp=['nearest', 'nearest', 'nearest,nearest'], parang=True)
Ideally, the SNR for each station should be very high (>10). Watch the logger for any reports of low SNR and failures to converge on a solution.
Apply the instrumental delay corrections using applycal.
#In CASA
applycal(vis='tl016b.ms', gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest'], parang=True)
NOTE: In applycal, the order in which you specify the list for gaintable sets the order for both interp and spwmap. If you change the order of gaintable, be sure to also change the order of interp and spwmap! Also, if you smoothed any of the solutions, be sure to use the appropriate filename for the smoothed table.
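One way to keep gaintable and interp in step is to store each table together with its interpolation mode and derive the two lists from that single source. The sketch below is plain Python; build_cal_lists is a hypothetical helper, not a CASA function, and the applycal call is left as a comment because it only runs inside CASA.

```python
def build_cal_lists(tables):
    """Split an ordered list of (caltable, interp) pairs into parallel lists."""
    gaintable = [name for name, _ in tables]
    interp = [mode for _, mode in tables]
    return gaintable, interp

# Tables and interpolation modes from the applycal call above, in order.
cal_tables = [
    ("tl016b_smooth.accor", "nearest"),
    ("tl016b.gcal", "nearest"),
    ("tl016b.tsys", "nearest,nearest"),
    ("tl016b.sbd", "nearest"),
]

gaintable, interp = build_cal_lists(cal_tables)
print(gaintable)
print(interp)

# Inside CASA, the lists could then be passed on directly:
# applycal(vis='tl016b.ms', gaintable=gaintable, interp=interp, parang=True)
```

Reordering or adding a table then only requires editing cal_tables, and the two lists cannot drift apart.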
Check that applying the solutions resulted in improvements:
#In CASA
plotms(vis='tl016b.ms', field='4C39.25', xaxis='frequency', yaxis='phase', ydatacolumn='corrected', timerange='!!SAME AS SBD STEP!!', correlation='rr,ll', antenna='*&*', iteraxis='antenna', coloraxis='antenna2')
Global Fringe Fitting
Now, we will solve for the time and frequency-dependent effects in phase using fringefit, also known as "global fringe fitting" (or sometimes "multi-band delay", if you combine all of the bands). We will need to pick a solution interval that is appropriate for our data. It should be at least 10 seconds, and no longer than the scan length on the phase reference calibrator. For this observation, we will use XX seconds.
For refant, enter a list of antennas to try as the reference antenna. The preferred refant should be listed first, followed by the second choice, then third choice, and so on. It is not recommended to include Mauna Kea (MK) or Saint Croix (SC) in the list, unless the phase reference calibrator is very bright on the longest baselines. Set 'field' to the fringe finder and phase reference calibrator.
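The two rules of thumb above (a prioritized refant list without MK and SC, and a solution interval between 10 seconds and the calibrator scan length) can be sketched in plain Python. The helper names are hypothetical, the antenna ordering is only an example and should be replaced by the ordering you chose while inspecting the data, and the 120 second scan length is an invented value.

```python
def build_refant(preferred, exclude=("MK", "SC")):
    """Join a prioritized antenna list into a refant string, dropping excluded stations."""
    return ",".join(ant for ant in preferred if ant not in exclude)

def solint_ok(solint_s, scan_length_s, minimum_s=10.0):
    """True if the solution interval lies between minimum_s and the scan length."""
    return minimum_s <= solint_s <= scan_length_s

# Example ordering only; put your preferred reference antenna first.
refant = build_refant(["FD", "PT", "MK", "LA", "KP", "OV", "NL", "BR", "HN", "SC"])
print(refant)

# Hypothetical 120 s scans on the phase reference calibrator.
print(solint_ok(30.0, scan_length_s=120.0))
print(solint_ok(5.0, scan_length_s=120.0))
```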
#In CASA
fringefit(vis='tl016b.ms', caltable='tl016b.mbd', field='4C39.25,J1154+6022', solint='!!SOLINT!!', minsnr=5, zerorates=False, refant='FD,PT,LA,KP,OV,NL,BR,HN !!CHECK REFANT LIST!!', gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest'], parang=True)
This step may take quite a while, so this is a good opportunity to go get a tasty beverage.
When the fringefit task is done, check the logger for the solution statistics ("expected/attempted/succeeded"). You want all three numbers to be the same, or at least very similar. Also, look for instances when the SNR was below your threshold (minsnr) and times when it took many iterations to get a good fit. You may need to try longer solution intervals to get the global fringe fit to work optimally.
After fringefit has successfully completed and you are satisfied that the number of solutions is appropriate, take a look at the solutions with plotms.
#In CASA
plotms(vis='tl016b.mbd', xaxis='time', yaxis='phase', iteraxis='antenna')
Use the GUI to switch the yaxis between 'delay', 'phase', and 'delayrate'. The delay and phase solutions should both vary smoothly with time. If they do not, you may need to smooth the table before applying it. The rates should be centered on zero with some scatter.
When you are confident that the global/multi-band solutions are good, apply them with applycal.
#In CASA
applycal(vis='tl016b.ms', gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd', 'tl016b.mbd'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest', 'linear'], parang=True)
It will probably take a while (about 5 minutes or longer) for applycal to run this time.
Take a look at the calibrated data with plotms to make sure the corrections are improving the phases.
#In CASA
plotms(vis='tl016b.ms', SOMETHING, SOMETHING ELSE, SOMETHING EVEN MORE)
Bandpass Calibration
Now we will correct for the shape of the bandpass using the bandpass task. This step requires a very bright source (preferably >1 Jy on all baselines), so we will use our fringe finder 4C39.25. However, unlike when we solved for the instrumental delay (above), we will use all of the scans on the source.
#In CASA
bandpass(vis='tl016b.ms', caltable='tl016b.bpass', field='4C39.25', solint='inf', refant='!!REFANT!!', solnorm=True, bandtype='B', gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd', 'tl016b.mbd'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest', 'linear'], parang=True)
Inspect the solutions with plotms.
#In CASA
plotms(vis='tl016b.bpass', xaxis='frequency', yaxis='amp', iteraxis='antenna')
While looking over the solutions, also make sure to check the phases (use the GUI to set yaxis='phase').
Apply the solutions with applycal.
#In CASA
applycal(vis='tl016b.ms', gaintable=['tl016b_smooth.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd', 'tl016b.mbd', 'tl016b.bpass'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest', 'linear', 'linear,linear'], parang=True)
It will probably take a while for applycal to run again, since you are also applying the solutions from the global fringe fitting.
After applying the bandpass solutions, take a look at the calibrated data with plotms.
#In CASA
plotms(vis='tl016b.ms', xaxis='frequency', yaxis='amp', ydatacolumn='corrected', field='4C39.25', antenna='*&*', correlation='rr,ll', iteraxis='antenna', coloraxis='spw')
The bandpass calibration often does not perfectly calibrate the channels at the beginning and end of each spectral window. It is often a good idea to get rid of the edge channels that are not well-calibrated. If you notice that the edges of the spectral windows look a bit nasty (much higher than the rest of the band), feel free to flag those channels. This flagging can be done in plotms, but it is often easier (and more reliable) to do it with flagdata.
For our data, we will flag the first and last 3 channels of each spw.
#In CASA
flagdata(vis='tl016b.ms', mode='manual', spw='*:0~2;125~127')
Many experienced VLBA observers will not have any second thoughts about cutting out 8 or 12 channels on each side of the spw for continuum observations. If you start with flagging 3 channels on each side and think the amplitude vs frequency plots still look pretty gross, feel free to flag some more.
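If you want to experiment with different numbers of edge channels, a small helper can build the channel selection string in CASA's spw:channel syntax, where '*' selects every spectral window. The sketch below is plain Python; edge_channel_selection is a hypothetical helper, not a CASA function.

```python
def edge_channel_selection(n_chan, n_edge):
    """Build an spw selection string covering n_edge channels at each end of every spw."""
    if n_edge < 1 or 2 * n_edge >= n_chan:
        raise ValueError("n_edge must be at least 1 and leave some channels unflagged")
    low = f"0~{n_edge - 1}"
    high = f"{n_chan - n_edge}~{n_chan - 1}"
    return f"*:{low};{high}"

# 128 channels per spectral window in this observation.
print(edge_channel_selection(128, 3))
print(edge_channel_selection(128, 8))

# Inside CASA, the string could be passed to flagdata, for example:
# flagdata(vis='tl016b.ms', mode='manual', spw=edge_channel_selection(128, 8))
```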
Final Amplitude Scaling and Flux Calibration
Any VLBA observation with wide bandwidths (>256 MHz), which is any observation done at a bit rate of 2 Gbps or more, will require one extra calibration step at this point. The flux density scale for wideband VLBA observations can be off by up to 30% (although it is usually only off by a few percent) if the calibration does not correctly account for the wide bandpasses. To do this, you need to run accor again after the bandpass calibration has been applied. The AIPS task that was developed to address this issue is called ACSCL. For more details on this topic, and how it was handled in AIPS, see VLBA Scientific Memo #37.
Prior to running accor this time, it is strongly recommended that users inspect the calibrated data and determine whether any channels will need to be excluded from imaging or other post-processing. Edge channels may need to be excluded if the bandpass calibration did not properly correct the band at the edges. From VLBA Scientific Memo #37, the standard recommendation is to use the inner ~75% of channels for PFB observations and the inner ~89% of channels for DDC observations. Any channels that are suspected to contain RFI should also be excluded. The actual channels used will depend on the individual observation and science goals.
Perform the final re-scaling of the autocorrelation amplitudes with accor. NOTE: DO NOT APPLY THE SYSTEM TEMPERATURE OR GAIN CURVE TABLES WHEN DERIVING THESE SOLUTIONS.
#In CASA
accor(vis='tl016b.ms', spw='*:7~121', caltable='tl016b.acscl', solint='2min', gaintable=['tl016b.accor', 'tl016b.sbd', 'tl016b.mbd', 'tl016b.bpass'], interp=['nearest', 'nearest', 'linear', 'linear,linear'])
NOTE: Use all the channels that will be used for imaging or other post-processing.
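A quick way to compare a channel selection against the recommended inner fraction is to compute the fraction of channels it retains. The plain-Python sketch below does this for the 7~121 range used in the accor call above; the helper name is hypothetical.

```python
def retained_fraction(n_chan, first, last):
    """Fraction of an spw's channels kept by the inclusive range first~last."""
    return (last - first + 1) / n_chan

# 128 channels per spw, keeping channels 7 through 121.
frac = retained_fraction(128, 7, 121)
print(f"Channels kept: {121 - 7 + 1} of 128 ({frac:.1%})")
```

For this selection, 115 of 128 channels are kept, close to the inner ~89% recommended for DDC observations.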
The AIPS VLBA utility script VLBAAMP smooths the autocorrelation corrections by default in exactly the same way as VLBACCOR. We will replicate the AIPS method by using the smoothcal task to smooth our .acscl table.
#In CASA
smoothcal(vis='tl016b.ms', tablein='tl016b.acscl', caltable='tl016b_smooth.acscl', smoothtype='median', smoothtime=1800.0)
Apply the final amplitude solutions with applycal. You should apply the system temperature and gain curve tables in this step.
#In CASA
applycal(vis='tl016b.ms', field='', gaintable=['tl016b.accor', 'tl016b.gcal', 'tl016b.tsys', 'tl016b.sbd', 'tl016b.mbd', 'tl016b.bpass', 'tl016b_smooth.acscl'], interp=['nearest', 'nearest', 'nearest,nearest', 'nearest', 'linear', 'linear,linear', 'nearest'], parang=True)
Congratulations! The data should be mostly calibrated at this point. At the very least, you should be able to make images of the calibrators.
Split Out Calibrated Data
It is generally recommended to split the measurement set after the initial calibration is complete. If you have multiple science targets, you should create a new MS for each science target + phase reference calibrator pair by setting the field parameter to the appropriate values. Even if your observation only involved a single target, it is a good idea to split the MS once the initial calibration is complete. This will preserve the initially-calibrated data in case you make a mistake in any of the next steps and need to start over. Think of it as a "save point" in a video game (do you really want to have to go back to the beginning of the game when you could just start from the beginning of the current level?).
Split the calibrated MS using split.
#In CASA
split(vis='tl016b.ms', outputvis='tl016b_cal1.ms', field='J1154+6022,J1203+6031', spw='*:7~121', antenna='*&*', datacolumn='corrected')
You can name the new MS whatever you want, but using a naming scheme that keeps track of where it was in the calibration process will make your life easier if you make mistakes down the line. For this tutorial, we will use "tl016b_cal1.ms" to indicate it is where we ended up after the first calibration steps. Setting antenna='*&*' will leave the autocorrelations out of our new MS (we don't need them anymore). We should not need 4C39.25 for any of the next steps, so we will not bother to keep that source in our new MS. Also, note that we have selected only the spectral channels that we used during the final amplitude scaling step.
Self-Calibration of Phase Reference Calibrator
Despite our best efforts with the initial calibration, VLBA observations will almost always require additional steps to improve the calibration. We will start by generating a model of the phase reference calibrator, and use that model to refine the calibration. This is known as "self-calibration".
Tracking Improvement
Imaging the Calibrator
#In CASA
tclean(vis='tl016b_cal1.ms', field='J1154+6022', imagename='J1154_sc1', imsize=[640], cell=['0.0003arcsec'], stokes='I', deconvolver='clark', weighting='natural', niter=1000, interactive=True, savemodel='modelcolumn')
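The imsize and cell values in the tclean call above can be sanity-checked with a little arithmetic. The plain-Python sketch below computes the image field of view and estimates how many pixels span the resolution element, using the rough approximation resolution = wavelength / baseline. The observing frequency of 5 GHz comes from this Guide; the maximum baseline of about 8600 km is an approximate figure assumed for illustration, and the actual synthesized beam depends on the uv coverage and weighting.

```python
import math

C_M_PER_S = 299_792_458.0
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def fov_arcsec(imsize, cell_arcsec):
    """Width of the image in arcseconds."""
    return imsize * cell_arcsec

def resolution_arcsec(freq_hz, baseline_m):
    """Rough angular resolution (wavelength / baseline) in arcseconds."""
    return (C_M_PER_S / freq_hz) / baseline_m * ARCSEC_PER_RAD

cell = 0.0003  # arcsec, from the tclean call
fov = fov_arcsec(640, cell)
res = resolution_arcsec(5.0e9, 8.6e6)  # 5 GHz, ~8600 km baseline (assumed)
pixels_per_beam = res / cell

print(f"Field of view: {fov * 1000:.0f} mas")
print(f"Approximate resolution: {res * 1000:.2f} mas")
print(f"Pixels across the resolution element: {pixels_per_beam:.1f}")
```

Under these assumptions there are between 4 and 5 pixels across the resolution element, which is in line with the common practice of sampling the beam with a few pixels.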
Phase Self-Calibration
First, we will refine the delays.
#In CASA
gaincal(vis='tl016b_cal1.ms', field='J1154+6022', caltable='tl016b_cal1.dcal', solint='inf', refant='PT', minblperant=3, gaintype='K', calmode='p', parang=False)
Next, we will refine the phases.
#In CASA
gaincal(vis='tl016b_cal1.ms', field='J1154+6022', caltable='tl016b_cal1.pcal', solint='20s', refant='PT', minblperant=3, gaintype='G', calmode='p', gaintable=['tl016b_cal1.dcal'], interp=['linear'], parang=False)
Apply the phase self-calibration solutions to the phase reference calibrator.
#In CASA
applycal(vis='tl016b_cal1.ms', field='J1154+6022', gaintable=['tl016b_cal1.dcal', 'tl016b_cal1.pcal'], interp=['linear', 'linear'], parang=False)
Make a new image after the improved phase calibration.
#In CASA
tclean(vis='tl016b_cal1.ms', field='J1154+6022', imagename='J1154_sc2', imsize=[640], cell=['0.0003arcsec'], stokes='I', deconvolver='clark', weighting='natural', niter=1000, interactive=True, savemodel='modelcolumn')
Amplitude Self-Calibration
#In CASA
gaincal(vis='tl016b_cal1.ms', field='J1154+6022', caltable='tl016b_cal1.apcal', solint='inf', refant='PT', minblperant=4, gaintype='G', calmode='ap', solnorm=True, gaintable=['tl016b_cal1.dcal','tl016b_cal1.pcal'], interp=['linear','linear'], parang=False)
Apply the amplitude solutions to the phase reference calibrator.
#In CASA
applycal(vis='tl016b_cal1.ms', field='J1154+6022', gaintable=['tl016b_cal1.dcal', 'tl016b_cal1.pcal', 'tl016b_cal1.apcal'], interp=['linear', 'linear', 'linear'], parang=False)
Apply Calibration to Science Target
#In CASA
applycal(vis='tl016b_cal1.ms', field='J1203+6031', gaintable=['tl016b_cal1.dcal', 'tl016b_cal1.pcal', 'tl016b_cal1.apcal'], interp=['linear', 'linear', 'linear'], applymode='calonly', parang=False)
Split Out Science Target
#In CASA
split(vis='tl016b_cal1.ms', outputvis='tl016b_cal2.ms', field='J1203+6031', datacolumn='corrected')
Image the Science Target
#In CASA
tclean(vis='tl016b_cal2.ms', field='J1203+6031', imagename='J1203_im1', imsize=[640], cell=['0.0003arcsec'], stokes='I', deconvolver='clark', weighting='natural', niter=1000, interactive=True, savemodel='modelcolumn')