VLA CASA Pipeline-CASA4.5.3
This guide is designed for CASA 4.5.3
Introduction
When VLA observations are complete, the raw data need to be calibrated for scientific applications. This is achieved through various steps, as explained in the VLA CASA tutorials. The different calibration procedures are also bundled in a general VLA calibration pipeline that is described on the VLA pipeline webpage. At NRAO, the pipeline is now executed on every science scheduling block (SB) that the VLA observes. At this time, scientific target imaging is not part of the VLA pipeline. Manual imaging steps, however, are explained in the VLA CASA tutorials.
There are currently two maintained versions of the VLA pipeline: a calibration pipeline integrated in CASA, and an external, scripted pipeline. This VLA pipeline CASA tutorial guides the reader through the version that is integrated in CASA. It is developed along with the ALMA pipeline and aims to use similar procedures and outputs (documentation for ALMA is available through the almascience.org portal, under Documents & Tools). Details on the scripted pipeline can be found on the VLA scripted pipeline webpage.
The VLA pipeline has been developed to work for data taken in all 1-50GHz VLA frequency bands and requires minimal manual intervention. All calibration steps are applied to all data; this implies that simplicity and robustness currently have priority over speed. Pipeline runs can take anywhere from a few hours to several days, with a typical VLA SB being processed within about a day.
The pipeline was introduced in May 2012 and usually works well for all data taken thereafter. It may not work without modifications for data taken earlier; for such observations we recommend adjusting the scripted pipeline or performing the calibration steps manually.
Pipeline Requirements
The VLA pipeline has been developed for Stokes I continuum data with a range of spectral windows (spws) that are typically 128MHz wide. Nevertheless, it usually also performs well on other setups, although it is currently not tailored for special needs. In the future, we will provide a reprocessing interface that allows the user to adapt the pipeline to the specifics of their observations, including spectral line and polarization. In the following we will also explain how to adapt the pipeline, e.g. for spectral line setups.
The pipeline relies on the correct setting of the scan intents. We therefore recommend that every observer ensures that the scan intents are correctly specified in the Observation Preparation Tool (OPT) during the preparation of the observing script (see OPT manual for details). In particular, the pipeline requires the intents CALIBRATE_FLUX to mark flux density calibration scans (toward one of the standard VLA calibrators 3C48, 3C138, 3C147, or 3C286), CALIBRATE_AMPLI and CALIBRATE_PHASE for the temporal complex gain/phase calibration, and CALIBRATE_BANDPASS for the scan that is used to obtain the bandpass calibration (the CALIBRATE_FLUX flux density calibrator will be used when CALIBRATE_BANDPASS is not specified). The pipeline also currently requires a signal-to-noise ratio of >~3 for each spectral window of a calibrator per integration (for each channel of the bandpass).
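The intent requirements above can be sketched as a small pre-flight check. This is only an illustration in plain Python, not pipeline code: the function name check_scan_intents and the example intent list are hypothetical, and the sketch assumes bare intent names without any suffixes that a real MS may attach to them.

```python
# Intent names are taken from the text above; everything else is illustrative.
REQUIRED_INTENTS = ("CALIBRATE_FLUX", "CALIBRATE_AMPLI", "CALIBRATE_PHASE")
OPTIONAL_INTENT = "CALIBRATE_BANDPASS"


def check_scan_intents(intents):
    """Return (missing, notes) for an iterable of scan intent names."""
    present = set(intents)
    # Required intents that do not appear anywhere in the observation
    missing = [name for name in REQUIRED_INTENTS if name not in present]
    notes = []
    # Without a bandpass intent, the flux density calibrator serves as fallback
    if OPTIONAL_INTENT not in present and "CALIBRATE_FLUX" in present:
        notes.append("no CALIBRATE_BANDPASS: the flux density calibrator "
                     "will be used for the bandpass")
    return missing, notes


# Hypothetical intent list, for illustration only
example = ["CALIBRATE_FLUX", "CALIBRATE_AMPLI", "CALIBRATE_PHASE",
           "OBSERVE_TARGET"]
missing, notes = check_scan_intents(example)
print(missing)
print(notes)
```

An empty list of missing intents together with a fallback note corresponds to the situation of the example dataset in this guide, which has no CALIBRATE_BANDPASS scan.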
Pipeline Execution
NRAO-Initiated Automatic Pipeline Runs
Every science scheduling block (SB) executed at the VLA will be batch processed by the pipeline. NRAO uses a pipeline version that is packaged with CASA and that is also available to outside users (see the CASA download page for the current and older pipeline versions). At NRAO, we will always execute the standard pipeline, which implies that it is optimized for Stokes I continuum processing independent of the actual observation setup. A user may therefore decide to run the pipeline again after making appropriate modifications.
Once an SB has been processed, the PI of the project will receive an email that pipeline calibrated data are available and can be requested via the NRAO helpdesk. For ~15 days, the calibrated MeasurementSet (MS) will be stored for download. After that period, we delete the calibrated MS but retain the calibration and flagging tables and the weblog. The user may then request these products and apply them to the raw MS (downloaded from the archive) to obtain calibrated visibilities (see the VLA pipeline webpage for more details).
The user should inspect the weblog and the calibrated data from the VLA pipeline results carefully. Usually, some additional flagging and reprocessing will be required for better results. Upon request through the NRAO helpdesk, an NRAO scientist can also perform a quality assessment of the pipeline results.
Starting the Pipeline Manually
Pipeline processing can be quite computationally intensive. On the CASA Hardware Requirements page, we provide some recommendations for a suitable computing infrastructure. If you would like to use the NRAO lustre/cluster system, please request an account through the NRAO helpdesk.
The VLA pipeline webpage provides details on how to execute the VLA pipeline. To start with, a CASA version with integrated pipeline heuristic code is required and can be downloaded from the CASA webpages.
We also recommend downloading the VLA raw data from the NRAO archive in the form of an SDM-BDF, the raw VLA archive data format (although MeasurementSets will also work).
If you want to run the pipeline on the data that are shown in this guide, search the NRAO archive for the file id: 13A-398.sb17165245.eb19476558.56374.213876608796
To include the pipeline heuristic tasks, start CASA with the --pipeline option:
# In a Terminal
casa --pipeline
At NRAO one can start the current default pipeline version via:
# In a Terminal
casa-pipe
Next, at the CASA prompt, import the VLA pipeline recipes like:
# In CASA
import pipeline.recipes.hifv as hifv
(other, specialized recipes may be available as well)
The actual execution of the pipeline on the SDM, in our example 13A-398.sb17165245.eb19476558.56374.213876608796, will be initiated like:
# In CASA
hifv.hifv(['13A-398.sb17165245.eb19476558.56374.213876608796'])
The pipeline will now start processing the data. Depending on the data size and structure, processing times range somewhere between a few hours and several days with an average of about a day.
Pipeline Output
VLA pipeline output includes:
- An MS with calibrated visibilities in the CORRECTED_DATA column that can be used for subsequent imaging.
- Calibrator images for all spws (files oussid* in the main directory; see the descriptions of the tasks hif_makeimlist and hif_makeimages below).
- All calibration tables and an updated MS.flagversions file that contains all flag backups.
- A weblog that is accessible via pipelineXXXX/html/index.html, where XXXX stands for the pipeline execution time stamp (multiple pipeline executions will result in multiple weblogs).
- The casapyXXX.log CASA logger messages in the same directory.
- casa_pipescript.py, the script with the actually executed pipeline heuristic sequence and parameters (see below).
- casa_commands.log, which contains the actual CASA commands that were generated by the pipeline heuristics (see below).
- The listobs output, available under pipelineXXXX/html/sessionSession_default/<MSname>.listobs.txt, which contains the characteristics of the observations (temporal, spatial, spectral setup, antenna positions, and general information).
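Since multiple pipeline executions leave multiple weblog directories behind, it can help to list them. The following is a minimal sketch in plain Python that assumes only the pipelineXXXX/html/index.html layout described above; the function name find_weblogs is hypothetical.

```python
import glob
import os


def find_weblogs(workdir="."):
    """Return the sorted paths of all pipeline*/html/index.html in workdir."""
    pattern = os.path.join(workdir, "pipeline*", "html", "index.html")
    return sorted(glob.glob(pattern))


# List the weblogs found in the current working directory (may be none)
for path in find_weblogs("."):
    print(path)
```

If the time stamp in the directory name sorts chronologically, the last entry would be the most recent run; this is an assumption about the naming and should be verified against the actual directory names.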
Calibration Tables
The final calibration tables of the pipeline are (where <SDM> is a placeholder for the SDM name):
- <SDM>.ms.hifv_priorcals.s5_3.gc.tbl : Gaincurve
- <SDM>.ms.hifv_priorcals.s5_4.opac.tbl : Opacity
- <SDM>.ms.hifv_priorcals.s5_5.rq.tbl : Requantizer gains
- <SDM>.ms.hifv_priorcals.s5_6.ants.tbl : Antenna position offsets
- <SDM>.ms.finaldelay.k : Delay
- <SDM>.ms.finalBPcal.b : Bandpass
- <SDM>.ms.averagephasegain.g : Temporal Phase offsets
- <SDM>.ms.finalampgaincal.g : Flux calibrated Temporal Gains
- <SDM>.ms.finalphasegaincal.g : Temporal Phases
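When the calibration tables are requested separately and are to be applied to a raw MS, it can be useful to verify that the full set is present. The sketch below builds the expected names from the listing above. The helper names are hypothetical, and the stage labels in the names (s5_3 and so on) are those of this example run; they may differ for other pipeline versions or recipes.

```python
import os

# Suffixes copied from the listing above (example run of this guide)
FINAL_TABLE_SUFFIXES = [
    ("hifv_priorcals.s5_3.gc.tbl", "Gaincurve"),
    ("hifv_priorcals.s5_4.opac.tbl", "Opacity"),
    ("hifv_priorcals.s5_5.rq.tbl", "Requantizer gains"),
    ("hifv_priorcals.s5_6.ants.tbl", "Antenna position offsets"),
    ("finaldelay.k", "Delay"),
    ("finalBPcal.b", "Bandpass"),
    ("averagephasegain.g", "Temporal Phase offsets"),
    ("finalampgaincal.g", "Flux calibrated Temporal Gains"),
    ("finalphasegaincal.g", "Temporal Phases"),
]


def final_caltables(sdm):
    """Return (table name, description) pairs for a given SDM name."""
    return [(sdm + ".ms." + suffix, desc)
            for suffix, desc in FINAL_TABLE_SUFFIXES]


def missing_caltables(sdm, workdir="."):
    """Return the expected table names that do not exist in workdir."""
    return [name for name, _ in final_caltables(sdm)
            if not os.path.exists(os.path.join(workdir, name))]


sdm = "13A-398.sb17165245.eb19476558.56374.213876608796"
for name, desc in final_caltables(sdm):
    print("%s : %s" % (name, desc))
```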
casa_pipescript.py
VLA pipeline heuristic tasks start with hifv for 'heuristics, interferometry, vla'. The sequence of the pipeline heuristic steps is listed in 'casa_pipescript.py', which is located in the 'pipelineXXXX/html' directory. For our example, 'casa_pipescript.py' has the following structure:
__rethrow_casa_exceptions = True
h_init()
try:
    hifv_importdata(ocorr_mode='co', vis=['13A-398.sb17165245.eb19476558.56374.213876608796'], createmms='automatic', asis='Receiver CalAtmosphere', overwrite=True)
    hifv_hanning(pipelinemode="automatic")
    hifv_flagdata(intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, *SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*', flagbackup=False, scan=True, baseband=True, clip=True, autocorr=True, hm_tbuff='1.5int', template=False, online=True, tbuff=0.0, fracspw=0.05, shadow=True, quack=True, edgespw=True)
    hifv_vlasetjy(fluxdensity=-1, scalebychan=True, reffreq='1GHz', spix=0)
    hifv_priorcals(pipelinemode="automatic")
    hifv_testBPdcals(pipelinemode="automatic")
    hifv_flagbaddef(pipelinemode="automatic")
    hifv_checkflag(pipelinemode="automatic")
    hifv_semiFinalBPdcals(pipelinemode="automatic")
    hifv_checkflag(checkflagmode='semi')
    hifv_semiFinalBPdcals(pipelinemode="automatic")
    hifv_solint(pipelinemode="automatic")
    hifv_fluxboot(pipelinemode="automatic")
    hifv_finalcals(pipelinemode="automatic")
    hifv_applycals(pipelinemode="automatic")
    hifv_targetflag(pipelinemode="automatic")
    hifv_statwt(pipelinemode="automatic")
    hifv_plotsummary(pipelinemode="automatic")
    hif_makeimlist(nchan=-1, calmaxpix=300, intent='PHASE,BANDPASS')
    hif_makeimages(masklimit=4, noise='1.0Jy', subcontms=False, target_list={}, parallel='automatic', maxncleans=1, weighting='briggs', tlimit=2.0, robust=-999.0, npixels=0)
finally:
    h_save()
This is in fact a standard casa_pipescript.py file that can be used for pipeline processing in general after inserting the correct filename in hifv_importdata.
The pipeline run can be modified by adapting this script, e.g. by commenting out individual steps or by providing different parameters (see the inline help for the parameters of each task). The script can then be (re-)executed via:
# In CASA
execfile('casa_pipescript.py')
We will use this method, e.g. to re-run the script after it has been adjusted for spectral line processing (see below).
General modifications to the script include setting __rethrow_casa_exceptions = False to suppress CASA error messages in the weblog and h_init(weblog=False) for speedy processing without any weblog or plotting.
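Commenting out a step can also be done programmatically. The following is a minimal sketch in plain Python, assuming that every task call in casa_pipescript.py sits on a single line of its own; the function name comment_out_task and the abbreviated script text are hypothetical and for illustration only.

```python
def comment_out_task(script_text, task_name):
    """Return script_text with every line that calls task_name commented out."""
    out = []
    for line in script_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(task_name + "("):
            # Keep the indentation so the try/finally structure stays intact
            indent = line[:len(line) - len(stripped)]
            line = indent + "# " + stripped
        out.append(line)
    return "\n".join(out) + "\n"


# Abbreviated, hypothetical script text; 'my.sdm' is a placeholder name
example_script = (
    "try:\n"
    "    hifv_importdata(vis=['my.sdm'])\n"
    "    hifv_hanning(pipelinemode=\"automatic\")\n"
    "finally:\n"
    "    h_save()\n"
)
print(comment_out_task(example_script, "hifv_hanning"))
```

At least one statement has to remain inside the try block for the script to stay valid Python, so this approach is suited to disabling individual steps only.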
casa_commands.log
'casa_commands.log' is a second useful file in 'pipelineXXXX/html', which lists all the CASA commands that the pipeline heuristics (hifv) tasks produced. Note that 'casa_commands.log' is not executable itself, but it contains all the CASA tasks and associated parameters to trace back the individual data reduction steps.
The Pipeline Weblog
The pipeline run can be inspected through a weblog that is launched by pointing a web browser to 'pipelineXXXX/html/index.html' (where XXXX is the timestamp of the execution).
The following discussion is based on a weblog that can be obtained through the following link:
https://casa.nrao.edu/Data/EVLA/Pipeline/VLApipe-guide-weblog.tar.gz (209 MB)
Extract the weblog via:
# In a Terminal
tar xzvf VLApipe-guide-weblog.tar.gz
and point your browser to html/index.html.
At the top of the landing page, the links Home (this page), By Topic, and By Task provide navigation through the pipeline results.
Home Screen
The Home page of the weblog (Fig. 1) contains essential information such as the project archive code, the PI name, and the start and end time of the observations. The CASA and pipeline versions that were used for the pipeline run are also listed on this page, as well as a table with the MS name, receiver bands, number of antennas, on source time, min/max baseline lengths, the atmospheric phase monitor rms, and the filesize.
Overview Screen
An overview of the observations (Fig. 2) can be obtained by clicking on the MS name.
This page provides additional information about the observation. That includes Spatial Setup (field names, target and calibrator names), Antenna Setup (min/max baseline lengths), Spectral Setup (band designations; science bands include most calibrators but exclude ancillary scans such as pointing scans), and Sky Setup (min/max elevation). The page also provides graphical overviews of the scan intent and field id observing sequence. A plot with weather information is also provided. Clicking the blue headers provides additional information on each topic.
The Spatial Setup page (Fig. 3) lists all sources and fields (where a source is a field with a given spectral setup). Names, IDs, positions, and scan intents are listed for each source/field.
The Antenna Setup page (Fig. 4) lists the locations of all antennas (antenna pad name and offset from array center) and contains a graphical location plot for the array configuration. On a second tab, baseline lengths are listed and the 'percentile' column provides a rough indication of their density.
The Spectral Setup page (Fig. 5) contains all spectral window descriptions, including start, center and end frequencies, the bandwidth of each spw, as well as the number of spectral channels and their widths in frequency and velocity units. For each spw the polarization products and the receiver bands are listed, too. Note that Science Windows contain all spws that are used for calibration. Setup and pointing scans are not part of science windows but they are available under All Windows together with their intents.
Clicking Sky Setup (Fig. 6) leads to Elevation vs. Azimuth and Elevation vs. Time plots for the entire observation. The plots are colorized by field and intent.
Scans (Fig. 7) provides a listing of all scans, including start and stop time stamps, durations, field names and intents, and the tuning (spw) setup for each. Again Science Scans and All Scans can be inspected in separate tabs.
Most of the above information can also be accessed via the 'LISTOBS OUTPUT' button. The link leads to the output of the listobs task, which lists the observational setup details of the MS (Fig. 8), including the scan characteristics with execution times, scan ids, field ids and names, associated spectral windows, integration times, and scan intents. Further down, the spectral window characteristics are provided through their ids, channel numbers, channel widths, start and central frequencies. Sources and antenna locations are part of the listobs output, too.
By Topic Screen
The top-level By Topic link leads to a page that provides basic pipeline summaries such as warnings, scores (the scores are not implemented as of CASA 4.5.3 and should be considered placeholders for future implementation) and flagging summaries as functions of field, antenna, and spw (Fig. 9).
By Task Screen: Overview of the Pipeline Heuristic Stages
The pipeline is divided into 20 individual pipeline heuristic stages with heuristic ('hifv') tasks that are listed under the By Task tab (Fig. 10). Each stage has an associated score for success, but note that the scores are not yet implemented as of the CASA 4.5.3 VLA pipeline (C3R4B). Warnings and errors in tasks are also indicated by 'exclamation mark' and 'cross' icons near the task names. In our example, the pipeline threw warnings in stages 1 and 20, and an error in stage 4.
To obtain more details on each stage, click on the individual task names. Task sub-pages contain task results such as plots or derived numbers. Common to all pages is information on the Pipeline QA ('Quality Assurance'; not implemented in the CASA VLA pipeline as packaged in 4.5.3), the heuristic task Input Parameters, Task Execution Statistics (benchmarks), and the CASA logs, which provide details on the heuristic output and the actual CASA tasks with all parameters that were issued. CASA logger outputs are in the same area.
The Individual Stages
1. hifv_importdata: Register VLA measurement sets with the pipeline
In the first stage, the data are imported from the SDM ('Science Data Model') archival data format to a CASA MeasurementSet (MS). Basic information on the MS is provided, such as the SchedBlock ID, the number of scans and fields, and the size of the MS. The MS is also checked for suitable scan intents and a baseline summary of the initial flags is calculated.
In our example (Fig. 11), a warning is issued that the data does not contain a CALIBRATE_BANDPASS scan intent. In such a situation, the pipeline will use the flux density calibrator scans for bandpass calibration.
CHECK for: any errors in the import stage. That includes missing scan intents as in our example. Warnings will also be issued if the data were previously processed; this is usually encountered when the pipeline is run on an MS.
2. hifv_hanning: VLA Hanning Smoothing
This stage Hanning-smoothes the MS. This procedure reduces the Gibbs phenomenon (ringing) that occurs when extremely bright and narrow spectral features spill over into adjacent spectral channels. Gibbs ringing is typically caused by strong RFI. As part of the process, Hanning smoothing will reduce the spectral resolution.
CHECK for: nothing except for completion of the task. FOR SPECTRAL LINE DATA: you may decide not to run this stage since spectral lines will be smoothed to a degraded spectral resolution (see also section on spectral line processing).
3. hifv_flagdata: VLA Deterministic flagging
This stage will apply flags that were generated by the VLA online system during the observations. They include antennas not on source (ANOS), scans with intents that are of no use for the pipeline such as pointing and focus scans, autocorrelations, the first and last 5% edge channels of each spw (with a minimum of 3), clipping absolute zero values, quacking (i.e. removing the initial integration per scan), and flagging of entire basebands if needed. The flags are reported as a fraction of the total data for the full dataset as well as broken up into the individual calibrator scans and target data. A plot is provided that displays the online antenna flags as a function of time.
In our example (Fig. 12), the target source starts with 3.12% flagged data. The deterministic flagging stage adds 6.05% for antenna not on source and 0.82% for other online flags (e.g. subreflector rotations or translations); edge channels amount to 6.4%, clipping of absolute zero values to 0.09%, and 1.40% of flags are due to baseband clipping. This combines to a total of 8.71% of flagged data for the scientific target. Other sources are also listed and the entire MS is flagged at an 8.84% level.
CHECK for: the percentage of the flags. If a very large portion (or even all) of the visibilities of the calibrators are flagged, try to find out the reason. Also have a quick look at the graph of the online flags to understand whether the system behaved normally or if there was an unusually high failure of some kind.
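The edge channel rule of this stage (5% at each end of an spw, with a minimum of 3 channels; compare fracspw=0.05 in casa_pipescript.py) can be sketched as follows. How the pipeline rounds fractional channel counts is not stated in this guide, so the rounding to the nearest integer used here is an assumption, as are the function name and the channel counts in the loop.

```python
def edge_channels(nchan, fraction=0.05, minimum=3):
    """Channels flagged at EACH end of an spw (rounding is an assumption)."""
    return max(minimum, int(round(fraction * nchan)))


# Illustrative channel counts; real setups may differ
for nchan in (16, 64, 128):
    n = edge_channels(nchan)
    print("%d channels: %d flagged per edge" % (nchan, n))
```

For narrow spws the minimum of 3 channels dominates, so the flagged fraction at the edges can be noticeably higher than the nominal 5%.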
4. hifv_setjy: Set calibrator model visibilities
Stage number 4 calculates and sets the calibrator spectral and spatial model for the standard VLA flux density calibrators (with a CALIBRATE_FLUX scan intent). The task page lists the calculated flux densities for each spw. It also shows plots of the amplitude vs. uv-distance for the models per spw that are calculated and used to specify the flux density calibrator characteristics.
In our case, hifv_setjy throws an error (QA Too many flux calibrator measurements for 13A-398.sb17165245.eb19476558.56374.213876608796.ms 66/64; Fig. 13), which is due to the inclusion of pointing scans that are later flagged and disregarded. This is nothing to worry about here, as the X-band pointing scans are not used for any scientific application. Pointing corrections are calculated and applied in the online system and are not used thereafter.
CHECK for: any unexpected flux densities or model shapes.
5. hifv_priorcals: Priorcals (gaincurves, opacities, antenna positions corrections and rq gains)
Next, the prior calibration tables are derived. They include gain-elevation dependencies, atmospheric opacity corrections, antenna offset corrections, and requantizer gains. They are independent of the calibrator observations themselves and can be derived from ancillary data.
In addition to the opacities themselves (calculated per spw; Fig. 14), a plot is attached that provides more information on the weather conditions during the observation. The antenna positions are usually updated a few days after an antenna was moved, and for our case corrections for four antennas are applied with offsets in the millimeter range.
CHECK for: extreme or unrealistic opacities. Also check that the antenna offsets are within reason. There should only be updates for a few antennas.
6. hifv_testBPdcals: Initial test calibrations
Now it is time to determine the delays, and the bandpass solution (gain and phase) for the first time.
The plot on the main page (Fig. 15) shows the flux density calibrator with the bandpass solution applied. The subpages show the delay, gain amplitude, gain phase, bandpass amplitude, and bandpass phase solutions for each antenna. Note that the phases will be close to zero for the reference antenna. When delays are more than +/-10ns it will be worth examining the data more closely. Some additional flagging may be needed.
The gain amplitude and phase solutions are derived per integration and they are used to correct for decorrelation before spectral bandpass solutions are obtained. The latter are determined over a full solution interval, usually for all bandpass scans together. Bandpasses should be smooth although they can vary substantially for wide frequency bands. The BP phases should capture the residuals after the delays are determined.
Example delays are shown in Fig. 16: The delay for ea16 varies but is within a narrow range of only a few ns. These are good solutions. The delays for ea21 are fine except for the 33-35GHz frequency range where they scatter substantially. The respective frequency range/spws should be flagged manually if the following pipeline steps will not take care of it. For ea22 the delays in the 35-37GHz range are excessive with a value of about -70ns. It is likely that the pipeline will be able to calibrate these values correctly but one may need to flag the respective spws if not.
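The +/-10ns rule of thumb for delays can be expressed as a simple screening step. This is a sketch in plain Python that assumes the delays have already been extracted into a dictionary by some other means; the function name is hypothetical, the ea22 value is the one quoted above, and the ea16 value is a made-up placeholder for "a few ns".

```python
DELAY_LIMIT_NS = 10.0  # threshold quoted in the text


def suspicious_delays(delays_ns, limit=DELAY_LIMIT_NS):
    """Return {antenna: delay} for antennas with |delay| above the limit."""
    return dict((ant, delay) for ant, delay in delays_ns.items()
                if abs(delay) > limit)


# ea22: value from the example above; ea16: illustrative placeholder
delays = {"ea16": 2.0, "ea22": -70.0}
print(suspicious_delays(delays))
```

An antenna that is returned here is not necessarily bad; as noted above, the pipeline may still calibrate such delays correctly, so the list is only a prompt for a closer look.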
In Fig. 17, we show some examples of gain amplitude plots. Ea03 shows perfect solutions, whereas ea04 has elevated values until 8:06. Those should be flagged (but the pipeline may be able to detect and flag them in one of the subsequent stages). Some of the baselines in ea18 also show low values, but they are constant in time. At this stage one can assume that they reflect the correct calibration values. It might still be worth making a note and checking whether the calibration was applied correctly downstream. The situation is different for ea25, which shows an extreme decrease of flux as a function of time. This is likely a pointing error where the bandpass calibrator sits near the edge of the HPBW, which varies with frequency. It is likely that the primary beam sensitivity thus creates the spread and the associated tracking error causes the flux changes. This antenna should be inspected carefully, as there could be a problem that makes it unusable. Although the bandpass solutions seem to be ok, the bandpass and flux calibrators coincide and it is likely that the absolute calibration is very unreliable for this antenna.
Since the gain amp/phase steps are only performed to reduce decorrelation, the phase plots are the most important in this context. In Fig. 18 we show a few solutions. All phases for the reference antenna ea09 are by definition zero. The phase variations as a function of time increase for higher frequencies and longer baselines. Therefore, both ea03 and ea21 have good solutions (ea03 is closer to the array center than ea21). There are no jumps in the phases; remember that -180 and +180 are identical phase values, so jumps between those values are only a plotting issue and not the actual phase behavior.
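The statement that -180 and +180 degrees are the same phase can be made concrete by wrapping phase differences before looking for jumps. A minimal sketch in plain Python follows; the function names and the phase samples are hypothetical.

```python
def wrap_deg(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def phase_steps(phases_deg):
    """Wrapped differences between consecutive phase samples."""
    return [wrap_deg(b - a)
            for a, b in zip(phases_deg[:-1], phases_deg[1:])]


# Hypothetical samples: the plotted jump from +179 to -179 degrees
# corresponds to a step of only 2 degrees once the wrap is accounted for
print(phase_steps([175.0, 179.0, -179.0, -176.0]))
```

Only steps that remain large after wrapping point to a genuine phase jump.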
Now let's have a look at the bandpasses themselves (Fig. 19). Ea17 shows very good bandpass solutions. Since the spws are small compared to the entire frequency range, the edges of each spw dominate the variations. This is even more extreme for the 37-39GHz range of ea18. Although this could be just the behavior of the antenna itself, it will be worth keeping an eye on this portion of the data as it may require flagging (and we will later see misbehavior for that antenna and frequency range in other plots). Some flagging was already performed for the 33-35GHz range of ea21. This range corresponds to the noisy delays that we saw earlier in Fig. 16b. Ea24 shows a few high values. They are usually fine, as they are also at the edges of the spws. In particular, if an spw edge coincides with a baseband edge, such spikes may occur. Keep an eye on those, although they are likely not a problem in the calibration. Finally, we show the bandpass of ea25. Although the Gain Amplitude showed decreasing values as a function of time (Fig. 17d), the bandpass itself does not look suspicious and can likely be used, based on this plot.
The BP phases (Fig. 20) show residual, channel by channel delays. Again the reference antenna ea09 only shows zero phases by definition. Ea11 is an example of perfect phases across a bandpass. Note that again the variations are dominated by residual phases at the edges of the spws. Some variations are larger than others, but they are all in a similar range. We already saw large scatter in the bandpass amplitude of ea18 at 37-39GHz and the pattern is repeated in the phases. So this portion may need extra care. Finally we show ea24 again and find that the edge spike in the amplitudes is also seen in the phases. At this level, the solution should be usable.
CHECK for: strong RFI and whether it was eliminated later or not (especially via a comparison with the output plots of task 14). Also check for jumps in phase and/or amplitude away from spw edges. If there are phase jumps for all but the reference antenna, a different choice for the reference antenna should perhaps be considered. Also watch out for extreme delays of tens of ns and for very noisy data.
7. hifv_flagbaddef: Flag bad deformatters
The data inside every VLA antenna undergo a formatting stage that converts the electronic signal to an optical signal before it is injected into the optical fiber link. At the correlator end, the signal is deformatted back to an electronic signal. Occasionally, the timing on the deformatter can be misaligned, which results in a signal similar to an abs(sin), or a 'bouncing' signal across a baseband for one polarization. The hifv_flagbaddef pipeline stage tries to identify such deformatter errors and flag the respective baseband for the affected combination of antenna and polarization. Similar deviations are identified for the phases of the signals, but for those cases it is sufficient to flag individual spws and not the entire basebands.
For our data, no deformatter issues were detected in the amplitudes, but the phases of a few spws are flagged (Fig. 21). It is always advisable to visually inspect the data, as deformatter problems are sometimes not identified by the algorithm. For example, wide 'bounces', or values that don't approach zero, may be missed. An example from a different dataset is provided in Fig. 22. The 'V' shape close to 5.3GHz with values close to zero is typical for a deformatter issue. If the pipeline does not detect and flag the affected baseband automatically, one should manually flag the entire baseband of the affected polarization of that antenna for all sources.
CHECK for: amplitude 'bounces', i.e. very strong variations of amplitude with low values close to zero and high values well above the average of the other polarization. The pattern can repeat a few times across a baseband but should be contained to a single baseband, antenna and polarization. All spws in a faulty baseband, however, are affected. Also check the phases that this step may have flagged.
8. hifv_checkflag: Flag possible RFI on BP calibrator using rflag
Rflag, as part of flagdata, is a threshold-based automatic flagging algorithm in CASA. In this step, rflag is run on the bandpass calibrator to remove relatively bright RFI and to obtain improved bandpass calibration tables later on.
CHECK for: nothing in particular on this page, but some cumbersome RFI may have been eliminated in preparation of the following steps.
9. hifv_semiFinalBPdcals: Semi-final delay and bandpass calibrations
Now that some RFI was flagged, stage 6 is being repeated here at stage 9, which results in better bandpass and delay solutions.
CHECK for: similar issues as in step 6.
10. hifv_checkflag: Flag possible RFI on BP calibrator using rflag
Once more rflag is executed. After the bright RFI has been removed in step 8 and a new bandpass solution has been applied in step 9, a new mean data threshold will account for weaker RFI, which will be removed in this step 10.
CHECK for: removal of RFI seen in the following steps.
11. hifv_semiFinalBPdcals: Semi-final delay and bandpass calibrations
Again, having removed more RFI, new delay and bandpass solutions are being obtained here.
CHECK for: similar to step 6.
12. hifv_solint: Determine solint and Test gain calibrations
For the final calibration, the pipeline determines the shortest and longest applicable solution interval (solint). Typically they refer to the length of an integration and a scan, respectively.
In our case (Fig. 23) the integration time is 3 seconds, which also corresponds to the shortest solution interval. The longest solution interval is likely based on the phase calibrator scans, which typically last for ~85s; after subtracting the drive time and 'quack' flagging, the longest solution interval comes to ~75s.
Using these solution intervals, a temporal gain and phase solution is calculated for each antenna, spw, and polarization. In Fig. 24 we show some examples for the gains. Ea03 shows perfect gain solutions with small variations over the time of the observations. Note that the last scan is the flux density calibrator and thus a different source. Ea04 shows increased values for the last few calibrator scans that may need to be flagged. Ea25 likely has a pointing error for the first half of the observations. The listobs output tells us that a pointing update was obtained around 6:40, at which point ea25 starts to show good solutions.
Although the phase solution plots are very crowded (Fig. 25), we can see that ea03 has very steady values over time. The pipeline will apply phase offsets determined from this solution, so later on, additional phase solutions will be close to zero. Ea04 shows larger variations, and ea09 is the phase reference. Although the phases should in principle all be zero for the phase reference, offsets are visible. They do not matter, since all phases are relative, whether to zero or to an arbitrary offset.
CHECK for: consistency with the data. The shortest solint should be close to the integration time and the longest to a calibration scan. Gains should be smooth with little variations in time (larger gain variations for higher frequencies), phases should not show any jumps and also be relatively smooth (larger phase variations for higher frequencies and longer baselines).
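The consistency check on the solution intervals can be sketched as below. The numbers are those of the example (3s integrations, ~85s scans, ~75s longest solint); the function name and the 20% tolerance are assumptions made for this illustration and are not pipeline parameters.

```python
def solints_plausible(int_time, scan_length, short_solint, long_solint,
                      tol=0.2):
    """Return (short_ok, long_ok) for the two solution intervals.

    tol is an assumed fractional tolerance, not a pipeline setting.
    """
    # Shortest solint should be close to the integration time
    short_ok = abs(short_solint - int_time) <= tol * int_time
    # Longest solint should not exceed a scan and should be close to it
    long_ok = (long_solint <= scan_length and
               scan_length - long_solint <= tol * scan_length)
    return short_ok, long_ok


# Values from the example above, in seconds
print(solints_plausible(3.0, 85.0, 3.0, 75.0))
```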
13. hifv_fluxboot: Gain table for flux density bootstrapping
Now, the fluxes are bootstrapped from the flux calibrator to the complex gain (gain and phase) calibrator. To do so, spectral indices are computed for the secondary calibrator and the absolute fluxes are determined for each channel. They are then set to the MODEL column via setjy and reported for each spw.
For our example, the pipeline derives fluxes between 0.61 and 0.68 Jy, depending on frequency. The spectral behavior is reported as a declining spectral index of around -0.5 (Fig. 26).
CHECK for: values that are close to the known fluxes of the calibrator. Check the VLA calibrator manual at https://science.nrao.edu/facilities/vla/observing/callist for consistency. Since most calibrator sources are time variable AGN, some differences from the VLA catalog are expected. In particular at higher frequencies they could be up to tens of per cent.
14. hifv_finalcals: Final Calibration Tables
The final calibration tables are now derived. These are the most important tables, as they are the ones applied to the data in the subsequent stage 15. The tables are (one for each antenna): Final delay plots, BP initial gain phase, BP Amp solution, BP Phase solution, Phase (short) gain solution, Final amp time cal, Final amp freq cal, and Final phase gain cal. We have already inspected and discussed similar plots for the bandpass and for the temporal gain/phase calibration earlier. But since these are the most important tables, let's have a closer look at a few more of them.
The gains vary quite a bit for this observation. Typically, the gains stay within 10% around a normalized value of 1. Here, a few spws show substantial deviations. Examples are (Fig. 27): Ea02 has a drop around 5:50 and should be checked. Possibly the entire time between the adjacent calibrator scans that are more in line with the rest should be flagged for this antenna. Ea04 shows the inverse behavior later, around 8:00. It appears that only a subset, e.g. a baseband, deviates from the rest. Ea07 is smoother, with some variations between the individual spws but overall a smooth temporal behavior. Likely this solution can be used with no further flagging. Note that the last scan is the flux calibrator. This data point is expected to have slightly different values than those for the complex gain calibrator. Next, we look at the almost perfect ea09 gains. This is also the reference antenna. The gains in ea18 are smooth with a large dip in the first half. This solution does in fact calibrate out some characteristics of the observations and could be left for the moment. As mentioned before, a pointing update was performed around 6:40, which seems to have rectified a possibly mis-pointed ea18. Ea23 requires a single spw at a single time to be flagged. Pointing errors are also obvious for ea25, which also shows a substantial spread across spws. We have previously seen the large decline in flux in the bandpass observations for ea25, with a different slope for each spw. This seems to be reflected here, too. It is advisable to check the calibrated gains for this source and flag data if the spw amplitude variations were not calibrated properly.
Now let's have a look at the gains as a function of frequency (Fig. 28). For ea02 we see that one line is below the rest. This is likely one specific time interval and indeed we have seen this in Fig. 22a. Ea04 has a very noisy time interval, which is also in agreement with what we have seen in the previous temporal gain plot. Ea08 shows a perfect calibration and ea20 repeats the extra noise in the 34-35GHz range that may need to be flagged. Ea25 now repeats the bandpass pattern that we have seen earlier and that explains the spread in Fig. 27.
Now let's have a look at the phases (Fig. 29). Ea02 clearly shows very erratic gain variations for one baseband or polarization. This is likely not recoverable. Ea04, in contrast, exhibits very smooth phase variations until close to the end of the observations. This has already been observed in the gains (Fig. 27b); it should be looked at and likely needs to be flagged. Ea09 is the reference antenna and by definition zero in phase. Ea13 shows smooth variations and is an example of a perfect calibration table. A spread between basebands or polarizations can be seen for ea15. The behavior is nevertheless smooth and the data should be calibrated nicely with this table. Ea17, however, has, in addition to different behaviors for the basebands, also relatively large and erratic jumps between the calibration scans. This clearly needs to be looked into further and may need flagging, although the antenna did not show any issues in previous plots. Finally, ea20 has a relatively smooth behavior until the pointing update was performed (although the variations are relatively large). After the pointing scan, however, phases vary by about +/-50 degrees between individual, consecutive calibrator scans, which is large enough to be unreliable and to be flagged.
CHECK for: issues similar to those described in stages 6 and 12. Note that carefully checking calibrator tables in this stage 14 is of particular importance as they are the final tables to be applied to the target source. Phase cal solutions should be inspected in their temporal variations to be smooth and consistent for each calibrator.
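The kind of scan-to-scan phase jump described for ea17 and ea20 can also be spotted numerically. The following is a minimal sketch that operates on a plain list of per-scan phases for one antenna; it does not read a CASA calibration table. The function names and the 50 degree default threshold are illustrative assumptions, not pipeline settings.

```python
def wrapped_phase_diff(phase_a_deg, phase_b_deg):
    """Smallest signed difference (b - a) in degrees, in the range [-180, 180)."""
    return (phase_b_deg - phase_a_deg + 180.0) % 360.0 - 180.0

def find_phase_jumps(scan_phases_deg, max_jump_deg=50.0):
    """Return the indices of scans whose phase differs from the previous
    scan by more than max_jump_deg, taking the 360 degree wrap into account."""
    jumps = []
    for i in range(1, len(scan_phases_deg)):
        diff = wrapped_phase_diff(scan_phases_deg[i - 1], scan_phases_deg[i])
        if abs(diff) > max_jump_deg:
            jumps.append(i)
    return jumps

# Hypothetical per-scan phases in degrees for one antenna, spw, and polarization.
smooth = [0.0, 10.0, 20.0, 30.0]
erratic = [10.0, 15.0, 20.0, 80.0, 25.0, 170.0, -175.0]
smooth_jumps = find_phase_jumps(smooth)
erratic_jumps = find_phase_jumps(erratic)
```

Note that the step from 170 to -175 degrees is only a 15 degree change once the wrap is taken into account, so it is not reported as a jump.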
15. hifv_applycals: Apply calibrations from context
The calibration itself now concludes with the application of the derived calibration tables to the entire dataset. That includes all calibrators as well as the target sources. Note that there is no system temperature weighting of the caltables for the VLA (calwt=F) since the switched power calibration is currently not being used.
In Fig. 30, we show the results of this step. The first table shows which tables are being applied, the fields, spws, and antennas that are being calibrated. The second table provides information on the flagging statistics. Failed calibration solutions result in flagged calibrator table entries and eventually the data will also be flagged as no calibration can be derived for such data. The following plots show the data of different calibrator sources and spws in different combinations of phase and amplitude against frequency and uv-distance. To start with, the amplitude and phase as a function of frequency are being plotted for the complex gain/phase calibrator for each baseband. Next, the amplitudes as a function of uv-distance are plotted for the flux calibrator for each spw. They are followed by amp/time plots for all sources. Finally the amp and phases against time and amplitude against frequency of the target sources are being plotted for each baseband.
In Fig. 31 we show a few examples. a) An spw that drops in the calibrated spectrum indicates that it is mis-calibrated. b) Although the phases are well calibrated, residual delays are still visible. The zig-zag pattern is due to a small mismatch in the delay measurement timing (also known as 'delay clunking'). This behavior is intrinsic to the VLA. Typically the effect is averaged out over time. c) Amplitude vs. uv-distance for the flux calibrator should show a flat behavior for each spw (uv radial dependencies are accounted for by the setjy model). The offset may indicate one antenna or one time that is offset from the others. This should be inspected further. d) Amplitude vs. frequency for a calibrator shows a baseline or time that should be investigated and may need additional flagging. For the target source (e) the data are usually noisy and systematic issues are difficult to identify.
CHECK for: a smooth amplitude vs. frequency plot. Jumps may indicate mis-calibrated bandpass fluxes for an spw. Also the shape of the individual spws should be largely removed. Check for deviations from a flat amplitude vs. uv-distance plot after the model was applied, as they could indicate badly calibrated times or antennas.
16. hifv_targetflag: Targetflag
After the calibration tables are applied, the automated flagging routine rflag is run one more time on all sources to remove RFI and other outliers from the data.
CHECK for: RFI removal in the target. Although flagging is performed for all fields, the calibration was done in a previous stage, so any additional flagging has no further influence on the calibration tables. Flagging may, however, improve all images being made, in particular for the target fields that are flagged at this stage for the first time. FOR SPECTRAL LINE DATA: do not run this step as spectral lines may be removed, too (see also below).
17. hifv_statwt: Reweight visibilities
Since the VLA pipeline is currently not using the switched power calibration, there can be some sensitivity variations of the data over time due to changes in opacity, elevation, temperature (gradients) of the antennas, etc. So it is usually advisable to weight the data according to the inverse of their noise. This is done via the CASA task statwt and will increase the signal-to-noise ratio of images. Note that features such as RFI spikes and spectral lines will be part of the rms calculations and usually result in down-weighting data that include such features.
CHECK for: improved signal-to-noise in the images. FOR SPECTRAL LINE DATA: do not run this step as spectral lines may be weighted down (see also below).
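To see why weighting by the inverse noise helps, and why a spectral line is penalized, consider the toy calculation below. It is a sketch of inverse-variance weighting in general, not of the statwt implementation; all function names and sample values are made up for illustration.

```python
import math

def scatter(samples):
    """Root-mean-square scatter of the samples about their mean."""
    mean = sum(samples) / float(len(samples))
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))

def inverse_variance_weight(samples):
    """Weight assigned to a data chunk: 1 / rms^2 of its samples."""
    return 1.0 / scatter(samples) ** 2

def sigma_of_mean(sigmas, weighted=True):
    """Noise of the average of independent chunks with the given noise levels."""
    if weighted:
        # inverse-variance weights w_i = 1 / sigma_i^2 give sigma^2 = 1 / sum(w_i)
        return math.sqrt(1.0 / sum(1.0 / s ** 2 for s in sigmas))
    # equal weights: sigma^2 = sum(sigma_i^2) / n^2
    return math.sqrt(sum(s ** 2 for s in sigmas)) / len(sigmas)

# A chunk containing a strong line has larger scatter and gets a lower weight.
line_free = [0.1, -0.1, 0.1, -0.1]
with_line = [0.1, -0.1, 2.0, -0.1]
w_free = inverse_variance_weight(line_free)
w_line = inverse_variance_weight(with_line)

# Combining a quiet chunk (sigma = 1) with a noisy one (sigma = 3):
noise_weighted = sigma_of_mean([1.0, 3.0], weighted=True)
noise_equal = sigma_of_mean([1.0, 3.0], weighted=False)
```

The weighted average is less noisy than the equally weighted one, which is the gain in image signal-to-noise that this stage aims for. The lower weight of the chunk with the line is the reason the step is discouraged for spectral line data.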
18. hifv_plotsummary: VLA Plot Summary
This task produces diagnostic plots of the final, calibrated data. This includes phase as a function of time for all sources (calibrators and target), as well as amplitude against uv-distance.
Fig. 32 shows that the calibration around 6:00 and 6:30 is still somewhat noisy and additional flagging of the calibrators may be required. Field 12 looks as expected. One may want to check why some values in field 0 are very low and others in field 11 are quite high. Those could correspond to individual antennas, spws, or polarizations. Again, some editing may be required before the pipeline is restarted.
CHECK for: outliers, jumps, offsets, and excessive noise.
19. hif_makeimlist: Compile a list of cleaned images to be calculated
Finally, diagnostic images are made for each spw of the phase calibrator. The images and basic parameters such as resolution (cell size) and image sizes are listed in this step. The images are available in the directory in which the pipeline was run (usually where the SDM is located). At this time, images are produced for each spw using the multi-frequency synthesis algorithm, i.e. in continuum mode, which does not retain any spectral dependencies and uses the improved uv-coverage due to the frequency range of each spw.
In Fig. 33 the images are listed with 0.31" cell/pixel size and 300 pixels on each side. Names and phase centers are given for each spw.
CHECK for: appropriate cell size for the images.
20. hif_makeimages: Calculate clean products
The images from the previous stage are shown in the final pipeline task.
Imaging parameters are provided for each image (Fig. 34). They contain beam characteristics, as well as image statistics and how they compare to the theoretical noise, i.e. noise values that are based on bandwidth and integration time for typical VLA array parameters (via the radiometer equation). As mentioned earlier, the quality score is not fully implemented in this version of the pipeline and should be ignored.
CHECK for: degraded images, strong ripples, calibrators that do not resemble the psf. Such images may indicate RFI or mis-calibrated sources. If the actual rms is far from the theoretical noise, this could indicate that deeper cleaning is required. But that may not be important for these images.
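For orientation, the theoretical noise mentioned above can be estimated with the standard radiometer equation for an interferometer, sigma = SEFD / (eta_c * sqrt(n_pol * N * (N - 1) * bandwidth * t_int)). The sketch below is not the pipeline's own calculation. The SEFD, the correlator efficiency, and the observing parameters are placeholder assumptions and should be replaced with values appropriate for the band and array in question.

```python
import math

def thermal_noise_jy(sefd_jy, n_ant, bandwidth_hz, t_int_s,
                     n_pol=2, corr_eff=0.93):
    """Point-source rms noise in Jy of a naturally weighted image.

    sefd_jy      : system equivalent flux density of one antenna (assumed value)
    n_ant        : number of antennas in the array
    bandwidth_hz : bandwidth of the imaged spw in Hz
    t_int_s      : on-source integration time in seconds
    n_pol        : number of polarization products combined (2 for Stokes I)
    corr_eff     : correlator efficiency (assumed value)
    """
    samples = n_pol * n_ant * (n_ant - 1) * bandwidth_hz * t_int_s
    return sefd_jy / (corr_eff * math.sqrt(samples))

# Hypothetical case: 27 antennas, one 128 MHz spw, 60 s on source, SEFD = 500 Jy.
noise = thermal_noise_jy(sefd_jy=500.0, n_ant=27,
                         bandwidth_hz=128e6, t_int_s=60.0)
# Four times the integration time halves the noise.
noise_4x = thermal_noise_jy(500.0, 27, 128e6, 240.0)
```

If the measured image rms is far above such an estimate, the causes listed in the CHECK above (RFI, mis-calibration, shallow cleaning) are the first things to look at.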
Re-Execution of the Pipeline after Flagging
Above we mention many cases where additional flagging might be required. After the additional flagging has been applied, the pipeline can be re-executed for improved solutions. We recommend turning off Hanning smoothing for all re-executions, given that the data were already Hanning-smoothed and that flags will be extended by smoothing an already flagged MS.
By default, the pipeline will always revert all flags back to their original state as saved in the MeasurementSet.flagversions file. It will thus ignore all modifications made.
To trick the pipeline, one should manually flag the MeasurementSet and place it in a new directory. Do NOT copy over the related MeasurementSet.flagversions file. Then run the pipeline with the flagged MS as input to hifv.hifv(['MeasurementSet']). In that case the pipeline will not be able to recover the original flags and will proceed with the manual flags that the user has applied.
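The copy step can be scripted. The sketch below copies only the MS directory itself into a fresh working directory, so that the sibling .flagversions directory is left behind. The function name and the example paths are hypothetical; only standard Python library calls are used.

```python
import os
import shutil

def copy_ms_without_flagversions(ms_path, new_dir):
    """Copy a MeasurementSet into new_dir and return the path of the copy.

    The '<ms>.flagversions' directory next to the original is deliberately
    not copied, so the pipeline cannot restore the original flags.
    """
    ms_path = os.path.normpath(ms_path)
    if not os.path.isdir(new_dir):
        os.makedirs(new_dir)
    target = os.path.join(new_dir, os.path.basename(ms_path))
    shutil.copytree(ms_path, target)
    if os.path.exists(target + '.flagversions'):
        # a stale flagversions directory in the new location would defeat the purpose
        raise RuntimeError('remove %s.flagversions before re-running' % target)
    return target

# Example with placeholder paths:
# new_ms = copy_ms_without_flagversions('/data/run1/my.ms', '/data/run2')
```

The returned path is the flagged MS to hand to the pipeline in the new directory.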
Re-Applying Pipeline Results
Pipeline calibration tables can be re-applied to raw data following the description given on the VLA pipeline webpage.
Spectral Line Data
The pipeline is not optimized for calibrating spectral line data. Some pipeline steps may be detrimental for spectral line setups and need to be turned off. The calibrators also require enough signal-to-noise to reliably derive bandpass, gains, phases, etc. for the likely narrower spectral line subbands. The pipeline will also flag edge channels for each spw. If the spectral line happens to be placed on spw edges, additional modifications to the script may be necessary.
As mentioned above, the following pipeline steps are not advisable for spectral line data:
Stage 2: hifv_hanning: Hanning smoothing dampens the Gibbs ringing from strong spectral features, usually strong, narrow RFI. Hanning smoothing, however, reduces the spectral resolution, which may not be desired for spectral line data. In addition, the typically narrow spectral line spws contain less RFI and Hanning smoothing may not be required. This step is therefore typically turned off for spectral line calibration.
Stage 16: hifv_targetflag: Previous flagging was only applied to the calibrator scans. But Stage 16 attempts to auto-flag all fields including target fields. The rflag mode in flagdata is designed to remove outliers from a mean level. (Strong) spectral lines can fulfill this criterion and be flagged. This step should therefore be turned off, too. We recommend manual flagging or a flagdata rflag run that excludes spectral ranges with lines. This step, however, needs to be performed manually after the spectral line pipeline execution.
Stage 17: hifv_statwt: A similar argument applies for the statwt step, where the visibilities are weighted by the inverse of their rms. Spectral lines will increase the rms and will therefore be down-weighted. As this is not desired, the step will be turned off and should be run manually afterwards, tuned to exclude the spectral line frequency range from the weight calculations.
Given the above, we recommend modifying the casa_pipescript.py as in the example below. The SDM name (here: '13A-398.sb17165245.eb19476558.56374.213876608796') and possibly other parameters will have to be adapted for the run:
__rethrow_casa_exceptions = True
h_init()
try:
    hifv_importdata(ocorr_mode='co',
                    vis=['13A-398.sb17165245.eb19476558.56374.213876608796'],
                    createmms='automatic', asis='Receiver CalAtmosphere',
                    overwrite=True)
    # hifv_hanning(pipelinemode="automatic")
    hifv_flagdata(intents='*POINTING*,*FOCUS*,*ATMOSPHERE*,*SIDEBAND_RATIO*, *UNKNOWN*, *SYSTEM_CONFIGURATION*, *UNSPECIFIED#UNSPECIFIED*',
                  flagbackup=False, scan=True, baseband=True, clip=True,
                  autocorr=True, hm_tbuff='1.5int', template=False,
                  online=True, tbuff=0.0, fracspw=0.05, shadow=True,
                  quack=True, edgespw=True)
    hifv_vlasetjy(fluxdensity=-1, scalebychan=True, reffreq='1GHz', spix=0)
    hifv_priorcals(pipelinemode="automatic")
    hifv_testBPdcals(pipelinemode="automatic")
    hifv_flagbaddef(pipelinemode="automatic")
    hifv_checkflag(pipelinemode="automatic")
    hifv_semiFinalBPdcals(pipelinemode="automatic")
    hifv_checkflag(checkflagmode='semi')
    hifv_semiFinalBPdcals(pipelinemode="automatic")
    hifv_solint(pipelinemode="automatic")
    hifv_fluxboot(pipelinemode="automatic")
    hifv_finalcals(pipelinemode="automatic")
    hifv_applycals(pipelinemode="automatic")
    # hifv_targetflag(pipelinemode="automatic")
    # hifv_statwt(pipelinemode="automatic")
    hifv_plotsummary(pipelinemode="automatic")
    hif_makeimlist(nchan=-1, calmaxpix=300, intent='PHASE,BANDPASS')
    hif_makeimages(masklimit=4, noise='1.0Jy', subcontms=False,
                   target_list={}, parallel='automatic', maxncleans=1,
                   weighting='briggs', tlimit=2.0, robust=-999.0, npixels=0)
finally:
    h_save()
We commented out the stages 2, 16 and 17. If a spectral line happens to be close to the edge channels, one can decide to turn off edge channel flagging by adding the parameter edgespw=False to all calls of hifv_flagdata.
Once all modifications are made, run the pipeline as:
# In CASA
execfile('casa_pipescript.py')
After the calibration has been obtained, we can now follow up with an rflag and statwt stage. To start with, we need to define a spw range that contains no lines. Let's assume spw='*:4~180;215~509' is such a range (a line would be in every subband in channels 181 to 214). The flagdata can be run as follows:
# In CASA
flagdata(vis='my.ms',mode='rflag',spw='*:4~180;215~509',action='apply')
With this spw selection, only the line-free part of the spectrum is flagged.
statwt could be run like:
# In CASA
statwt(vis='my.ms',fitspw='*:4~180;215~509')
where fitspw selects the portion of data that is used for the weight calculation. Again, we use the line-free part of the spectrum.
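The line-free selection string used in the flagdata and statwt calls above can be assembled from the channel numbers rather than typed by hand. This is a small convenience sketch; the function name and its arguments are hypothetical, and it only handles a single line that lies strictly inside the usable channel range.

```python
def line_free_spw(first_chan, last_chan, line_start, line_end, spw='*'):
    """Build a CASA spw selection covering the channels from first_chan to
    last_chan, excluding the line channels line_start..line_end (inclusive)."""
    if not (first_chan < line_start <= line_end < last_chan):
        raise ValueError('line must lie strictly inside the channel range')
    ranges = ['%d~%d' % (first_chan, line_start - 1),
              '%d~%d' % (line_end + 1, last_chan)]
    return '%s:%s' % (spw, ';'.join(ranges))

# Reproduces the selection used in this section (line in channels 181 to 214).
selection = line_free_spw(first_chan=4, last_chan=509,
                          line_start=181, line_end=214)
```

The resulting string can be passed as spw to flagdata and as fitspw to statwt.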
Mixed Correlator Setups
If data were obtained in mixed correlator modes, one can initially run the pipeline with the above spectral line modifications. Then use split in CASA to extract the continuum bands and run the pipeline in regular mode on the new MS.
Polarization Calibration
At this stage, the VLA pipeline does not derive and apply polarization calibration. The user may decide to derive and add polarization calibration steps after the pipeline was run, using the pipeline calibration tables as required.
Polarization calibration steps are provided in the respective section of the 3C391 tutorial (in particular the D-term and crosshand delay calibration will be required). See also the corresponding chapter in the CASA Reference Manual and Cookbook.
Weak Calibrators
The VLA pipeline requires a minimum signal-to-noise ratio of 3 for each spw (each channel for the bandpass) and target scan. If this criterion is not met, the pipeline will likely fail. We are currently implementing additional heuristics to deal with weak calibration sources. This code will be available in the upcoming version of the VLA pipeline.
Correcting Scan Intents
Scan intents should be set up correctly in the OPT before submitting the schedule block for observation.
When incorrect scan intents are identified after the observations, the SDM can be modified and updated with new scan intents. The SDM metadata are stored as XML files and can be edited manually. Great care, however, should be taken not to corrupt the structure of the SDM/XML.
To do so, cd into the SDM and edit the file 'Scan.xml'. We strongly recommend making a backup copy of the Scan.xml file in case the edits corrupt the metadata.
Scan.xml is divided into individual <row></row> blocks that identify each scan.
E.g. the first scan in our dataset is listed like:
<row>
  <scanNumber>1</scanNumber>
  <startTime>4870732142800000000</startTime>
  <endTime>4870732322300000256</endTime>
  <numIntent>1</numIntent>
  <numSubscan>1</numSubscan>
  <scanIntent>1 1 OBSERVE_TARGET</scanIntent>
  <calDataType>1 1 NONE</calDataType>
  <calibrationOnLine>1 1 false</calibrationOnLine>
  <sourceName>J1041+0610</sourceName>
  <flagRow>false</flagRow>
  <execBlockId>ExecBlock_0</execBlockId>
</row>
We can now change the scan intent, e.g. to CALIBRATE_AMPLI by simply updating the <scanIntent> tag:
<row>
  <scanNumber>1</scanNumber>
  <startTime>4870732142800000000</startTime>
  <endTime>4870732322300000256</endTime>
  <numIntent>1</numIntent>
  <numSubscan>1</numSubscan>
  <scanIntent>1 1 CALIBRATE_AMPLI</scanIntent>
  <calDataType>1 1 NONE</calDataType>
  <calibrationOnLine>1 1 false</calibrationOnLine>
  <sourceName>J1041+0610</sourceName>
  <flagRow>false</flagRow>
  <execBlockId>ExecBlock_0</execBlockId>
</row>
If we want to add a second intent, we will have to make additional changes. Let's add CALIBRATE_PHASE:
<row>
  <scanNumber>1</scanNumber>
  <startTime>4870732142800000000</startTime>
  <endTime>4870732322300000256</endTime>
  <numIntent>2</numIntent>
  <numSubscan>1</numSubscan>
  <scanIntent>1 2 CALIBRATE_AMPLI CALIBRATE_PHASE</scanIntent>
  <calDataType>1 2 NONE NONE</calDataType>
  <calibrationOnLine>1 2 false false</calibrationOnLine>
  <sourceName>J1041+0610</sourceName>
  <flagRow>false</flagRow>
  <execBlockId>ExecBlock_0</execBlockId>
</row>
Inside <scanIntent> we added the second intent, but also increased the second number from 1 to 2. In addition, we specified <numIntent> to be 2, and added a second entry to <calDataType> and <calibrationOnLine>. For the latter two, we also updated the second number from 1 to 2.
Analogously, if we now add a third intent, CALIBRATE_BANDPASS, to the same scan, the <row> will look like:
<row>
  <scanNumber>1</scanNumber>
  <startTime>4870732142800000000</startTime>
  <endTime>4870732322300000256</endTime>
  <numIntent>3</numIntent>
  <numSubscan>1</numSubscan>
  <scanIntent>1 3 CALIBRATE_AMPLI CALIBRATE_PHASE CALIBRATE_BANDPASS</scanIntent>
  <calDataType>1 3 NONE NONE NONE</calDataType>
  <calibrationOnLine>1 3 false false false</calibrationOnLine>
  <sourceName>J1041+0610</sourceName>
  <flagRow>false</flagRow>
  <execBlockId>ExecBlock_0</execBlockId>
</row>
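The bookkeeping in the examples above (the intent count appears in <numIntent> and as the second number in three other tags) is easy to get wrong by hand. The sketch below applies the same pattern with Python's standard xml.etree.ElementTree module. The function name and the minimal <ScanTable> wrapper are assumptions for illustration, and the NONE/false fill values simply follow the examples above. A real Scan.xml carries additional header elements and attributes, so work on a backup copy and compare the file before and after any automated rewrite.

```python
import xml.etree.ElementTree as ET

def set_scan_intents(root, scan_number, intents):
    """Set the intents of one scan <row> in place and return that row."""
    n = len(intents)
    for row in root.iter('row'):
        if (row.findtext('scanNumber') or '').strip() != str(scan_number):
            continue
        row.find('numIntent').text = str(n)
        # '1 n' prefix: one dimension with n entries, as in the examples above
        row.find('scanIntent').text = '1 %d %s' % (n, ' '.join(intents))
        row.find('calDataType').text = '1 %d %s' % (n, ' '.join(['NONE'] * n))
        row.find('calibrationOnLine').text = '1 %d %s' % (n, ' '.join(['false'] * n))
        return row
    raise ValueError('scan %s not found' % scan_number)

# Minimal stand-in for Scan.xml (the wrapper element is a placeholder).
example = """<ScanTable><row>
  <scanNumber>1</scanNumber>
  <numIntent>1</numIntent>
  <scanIntent>1 1 OBSERVE_TARGET</scanIntent>
  <calDataType>1 1 NONE</calDataType>
  <calibrationOnLine>1 1 false</calibrationOnLine>
</row></ScanTable>"""

root = ET.fromstring(example)
row = set_scan_intents(root, 1, ['CALIBRATE_AMPLI', 'CALIBRATE_PHASE',
                                 'CALIBRATE_BANDPASS'])
```

For a file on disk, ET.parse('Scan.xml') followed by tree.write(...) would be the corresponding read and write calls; whether the rewritten header is still accepted by importasdm or importevla should be verified on a copy first.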
Check with listobs on the imported MS (after executing importasdm or importevla) if the scan intents are now displayed correctly.
Valid intents are (see SDM definition document):
CALIBRATE_AMPLI : Amplitude calibration scan
CALIBRATE_ATMOSPHERE : Atmosphere calibration scan
CALIBRATE_BANDPASS : Bandpass calibration scan
CALIBRATE_DELAY : Delay calibration scan
CALIBRATE_FLUX : Flux measurement scan
CALIBRATE_FOCUS : Focus calibration scan; Z coordinate to be derived
CALIBRATE_FOCUS_X : Focus calibration scan; X focus coordinate to be derived
CALIBRATE_FOCUS_Y : Focus calibration scan; Y focus coordinate to be derived
CALIBRATE_PHASE : Phase calibration scan
CALIBRATE_POINTING : Pointing calibration scan
CALIBRATE_POLARIZATION : Polarization calibration scan
CALIBRATE_SIDEBAND_RATIO : Measure relative gains of sidebands
CALIBRATE_WVR : Data from the water vapor radiometers (and correlation data) are used to derive their calibration parameters
DO_SKYDIP : Skydip calibration scan
MAP_ANTENNA_SURFACE : Holography calibration scan
MAP_PRIMARY_BEAM : Data on a celestial calibration source are used to derive a map of the primary beam
OBSERVE_TARGET : Target source scan
CALIBRATE_POL_LEAKAGE :
CALIBRATE_POL_ANGLE :
TEST : Used for development
UNSPECIFIED : Unspecified scan intent
CALIBRATE_ANTENNA_POSITION : Requested by EVLA
CALIBRATE_ANTENNA_PHASE : Requested by EVLA
MEASURE_RFI : Requested by EVLA
CALIBRATE_ANTENNA_POINTING_MODEL : Requested by EVLA
SYSTEM_CONFIGURATION : Requested by EVLA
CALIBRATE_APPPHASE_ACTIVE : Calculate and apply phasing solutions; applicable at ALMA
CALIBRATE_APPPHASE_PASSIVE : Apply previously obtained phasing solutions; applicable at ALMA
OBSERVE_CHECK_SOURCE
Revert to the original Scan.xml if the above was not successful and contact NRAO through the NRAO helpdesk.
Scripted Pipeline
In addition to the pipeline that is delivered with CASA, one can also use the VLA scripted pipeline. More modifications are possible to the scripted pipeline and it can be altered to almost any circumstance. We refer to the VLA scripted Pipeline webpage for details.
Last checked on CASA Version 4.5.3