VLA CASA Pipeline-CASA6.1.2

Jott (talk | contribs)
Revision as of 23:23, 18 December 2020

This guide is designed for CASA 6.1.2.


Introduction

When VLA observations are complete, the raw data need to be calibrated for scientific applications. This is achieved through various steps, as explained in the VLA CASA tutorials. The different calibration procedures are also bundled in a general VLA calibration pipeline that is described on the VLA pipeline webpage. At NRAO, the pipeline is executed on every science scheduling block (SB) that the VLA observes successfully. At this time, scientific target imaging is not part of the VLA pipeline. Manual imaging steps, however, are explained in the VLA CASA tutorials. The VLA pipeline webpage describes how to run, modify, and re-execute the VLA pipeline. There are also instructions on how to restore archived pipeline products as well as a list of known issues. In the following material, we provide an example of a VLA pipeline weblog, explain the different pipeline stages, describe some of the diagnostic information and plots, and point out potential issues with the data or the pipeline results.

The Pipeline Weblog

The pipeline run can be inspected through a weblog that is launched by pointing a web browser to file:///<path to your working directory>/pipelineTIME/html/index.html . Note that we regularly test the weblog on Firefox but less so on other browsers. So if you don't use Firefox, there's a chance that not all items are displayed correctly.

The following discussion is based on a weblog that can be viewed through the following link:


Pipeline Weblog


Alternatively, the weblog can be downloaded from https://casa.nrao.edu/Data/EVLA/Pipeline/VLApipe-guide-weblog-CASA6.1.2.tar.gz (188 MB)

and extracted via:

# In a Terminal
tar xzvf VLApipe-guide-weblog-CASA6.1.2.tar.gz

then point your browser to html/index.html. As of CASA 6.1.2, Firefox seems to work best, although a security setting may need to be changed first; the weblog will prompt you with instructions if this is the case. Chrome may not show all items properly unless it is started like Chrome --args --allow-file-access-from-files /path/to/weblog/html/index.html.

At the top of the landing page one can find the items Home (the index.html landing page), By Topic and By Task that provide navigation through the pipeline results.

Home Screen

The Home page of the weblog (Fig. 1) contains essential information such as the project archive code, the PI name, and the start and end time of the observations. The CASA and pipeline versions that were used for the pipeline run are also listed on this page, as well as a table with the MS name, receiver bands, number of antennas, on source time, min/max baseline lengths, the atmospheric phase monitor rms, and the file size.

Fig. 1: The main page of the weblog

Overview Screen

An Overview of the observations (Fig. 2) can be obtained by clicking on the MS name.

Fig. 2: The weblog overview page.


This page provides additional information about the observation. It includes Observation Execution Time (date, time on source), Spatial Setup (science target and calibrator field names), Antenna Setup (min/max baseline lengths, number of antennas and baselines), Spectral Setup (band designations, including VLA baseband information; science bands include most calibrators, but exclude pointing and setup scans), and Sky Setup (min/max elevation). The page also provides graphical overviews of the scan intent and field ID observing sequence. A plot with weather information is also included. Clicking the blue headers provides additional information on each topic.


The Spatial Setup page (Fig. 3) lists all sources and fields (where a source is a field with additional information, e.g. it could describe flux variations). Names, IDs, positions, and scan intents are listed for each source/field.

Fig. 3: Spatial Setup page.


The Antenna Setup (Fig. 4) page lists the locations of all antennas (antenna pad name and offset from array center) and contains graphical location plots for the array configuration (one linear and one logarithmically scaled for better separation of close antenna labels). A third plot shows a representative uv-coverage. On a second tab, baseline lengths are listed and the 'percentile' column provides a rough indication of how many baselines are shorter than that in each row.

Fig. 4a: Antenna Setup page (Antennas).
Fig. 4b: Antenna Setup page (Baselines).


The Spectral Setup page (Fig. 5) contains all spectral window descriptions, including start, center and end frequencies, the bandwidth of each spectral window (spw), as well as the number of spectral channels and their widths in frequency and velocity units. For each spw, the polarization products and the receiver bands are also listed. The real id is the spw id of each SB; the virtual id is a renumbered identifier used when multiple SBs are combined (currently only an ALMA option).

Note that Science Windows contain all spws that are used for calibration. Setup and pointing scans are not part of science windows but they are available under All Windows together with their intents. (In our case, however, pointing scans are mistakenly identified as science scans. This is due to the peculiar data structure of this observation, which labeled a setup scan as a target scan.)

Fig. 5: Spectral Setup page.


Clicking the Sky Setup page (Fig. 6) leads to Elevation versus Azimuth and Elevation versus Time plots for the entire observation and, once more, a representative uv-coverage. The temporal plots are colorized by field id.

Fig. 6: Sky Setup page.


Scans (Fig. 7) provides a listing of all scans, including start and stop time stamps, durations, field names and intents, and the tuning (spw) setup for each. Again Science Scans and All Scans can be inspected in separate tabs.

Fig. 7: Scans page.

Most of the above information can also be accessed by the 'LISTOBS OUTPUT' button. The link leads to the output of the CASA listobs task, which summarizes the details of the observations (Fig. 8), including the scan characteristics, with observing times, scan ids, field ids and names, associated spectral windows, integration times, and scan intents. Further down, the spectral window characteristics are provided through their ids, channel numbers, channel widths, start and central frequencies. Sources and antenna locations are part of the listobs output, too.

Fig. 8: The listobs output.

By Topic Screen

The top-level By Topic link leads to a page that provides basic pipeline summaries such as warnings, the four lowest QA scores (see below), and flagging summaries as functions of field, antenna, and spectral window (spw; Fig. 9). Links are provided to jump directly to the pipeline step that issued the warning or low score.


Fig. 9: The By Topic page of the weblog.

By Task Screen: Overview of the Pipeline Heuristic Stages

The pipeline is divided into 19 (20 when including the exportdata stage) individual pipeline heuristic stages with heuristic ('hif' or 'hifv' for heuristics interferometric [vla]) tasks listed under the By Task tab (Fig. 10). Each stage has an associated score for success. If there are informational messages, warnings, or errors in tasks, they are indicated by '?', '!', and 'x' icons near the task names, respectively. In our example, an informational message and a few warnings are issued, mostly related to flagging. Note, however, that warnings use antenna ids and not antenna names. They are usually not identical and the mapping between them is shown on the Antenna Setup page.

Fig. 10: The By Task pipeline execution stages.

To obtain more details on each stage, click on the individual task name. Task sub-pages contain task results such as plots or derived numbers. Common to all pages is information on the Pipeline QA ('Quality Assurance'), the heuristic task Input Parameters, Task Execution Statistics (benchmarks), and the CASA logs. Those sections provide information on the triggered heuristics, as well as the actual CASA task execution commands and their return logger messages.

The QA scores have the following meaning:

  • 0.9-1.0 Standard/Good: green color - the stage appears to have completed successfully
  • 0.66-0.90 Below Standard: blue color - the stage has identified some issues, but they are not likely to affect the results substantially. It is still worth checking, though.
  • 0.33-0.66 Warning: yellow color - there are serious issues identified in this stage. The results should be inspected carefully. Intervention may be needed.
  • 0.00-0.33 Error: red color - there are severe problems with the data processing. It may or may not be possible to rescue the data.
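
The score ranges above can be expressed as a small lookup. The following is only a sketch: the function name is hypothetical, and the choice to include each range's lower edge in that range is an assumption, since the list does not say which bin a boundary value such as 0.90 belongs to.

```python
def qa_category(score):
    """Map a pipeline QA score (0..1) to the weblog label and color."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("QA scores lie between 0 and 1")
    # Assumption: each range includes its lower edge.
    if score >= 0.9:
        return ("Standard/Good", "green")
    if score >= 0.66:
        return ("Below Standard", "blue")
    if score >= 0.33:
        return ("Warning", "yellow")
    return ("Error", "red")


for s in (1.0, 0.5, 0.0):
    print(s, qa_category(s))
```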


The Individual Stages

Before we go through the stages step by step, it is worth mentioning that the lines in the calibration table plots connect data along the x-axis when they have otherwise the exact same properties (i.e. same spw, field, polarization, etc.). When data are flagged, the connector will not be plotted, so only consecutive, non-flagged data, with the same properties are connected and gaps between data with the same color indicate flagged data.

Stage 1. hifv_importdata: Register VLA measurement sets with the pipeline

In the first stage, the raw SDM-BDF is imported into the VLA pipeline. An MS is created and basic information on the MS is provided, such as SchedBlock ID, the number of scans and fields, and the size of the MS. The MS is also checked for suitable scan intents and a summary of the initial flags is calculated (check the "CASA logs" attached to the bottom of the page). "Flux densities" is used for ALMA and is not relevant to VLA data at this time.

Fig. 11: The Stage 1 hifv_importdata task page.
CHECK for: any errors in the import stage. Warnings will also be issued for missing, necessary scan intents or if the data had previously been processed. This is usually encountered when the pipeline is run on an MS rather than an SDM.  
QA: If the INTENT PHASE or FLUX are missing, the score will be set to 0. An existing processing history will set it to 0.5.

Stage 2. hifv_hanning: VLA Hanning Smoothing

This stage Hanning-smooths the MS. This procedure reduces the Gibbs phenomenon (ringing) when extremely bright and narrow spectral features are present and spill over into adjacent spectral channels. Gibbs ringing is typically caused by strong RFI or a strong maser line. As part of the process, Hanning smoothing will reduce the spectral resolution by a factor of 2 while maintaining the same number of channels. (Note: this means that data in adjacent channels will no longer be independent.) Hanning smoothing is turned off when any spectral window (spw) was frequency-averaged inside the WIDAR correlator. For such data, Hanning smoothing cannot correct for the Gibbs phenomenon anymore and would only add additional smearing.

CHECK for: nothing except for completion of the task. FOR SPECTRAL LINE DATA: you may decide not to run this stage since spectral lines will be smoothed to a degraded spectral resolution.
QA: N/A
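
To illustrate what Hanning smoothing does to a spectrum, here is a minimal sketch that applies the three-point Hanning kernel (weights 0.25, 0.5, 0.25) to a toy spectrum. The function name and the toy values are made up for illustration, and leaving the two edge channels unchanged is an assumption of this sketch, not a statement about how the pipeline treats them.

```python
def hanning_smooth(spectrum):
    """Apply the 3-point Hanning kernel (0.25, 0.5, 0.25) along a spectrum."""
    n = len(spectrum)
    out = list(spectrum)  # assumption: first and last channel stay unchanged
    for i in range(1, n - 1):
        out[i] = (0.25 * spectrum[i - 1]
                  + 0.5 * spectrum[i]
                  + 0.25 * spectrum[i + 1])
    return out


# Toy spectrum: a bright narrow feature with alternating-sign ringing around it
ringing = [0.0, -0.5, 1.0, -2.0, 10.0, -2.0, 1.0, -0.5, 0.0]
smoothed = hanning_smooth(ringing)
print(smoothed)
```

In this toy case the sidelobes next to the bright channel are strongly damped, while the peak itself is lowered and spread over its neighbors. The number of channels stays the same, but adjacent channels are no longer independent.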

Stage 3. hifv_flagdata: VLA Deterministic flagging

This stage applies flags that were generated by the VLA online system during the observations. The flags include antennas not on source (ANOS), shadowed antennas, scans with intents that are of no use for the pipeline (such as pointing and setup scans), autocorrelations, the first and last 5% edge channels of each spectral window (with a minimum of 1 channel), clipping absolute zero values that the correlator occasionally produces, quacking (i.e. flagging start or end integrations of scans; the pipeline will flag the first integration after a field change), and flagging the end 20MHz of the top and bottom spw of each baseband. The flags are reported as a fraction of the total data for the full dataset as well as broken up into the individual calibrator scans and target data. A plot is provided that displays the online antenna flags as a function of time.
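
As a rough illustration of the edge-channel rule (5% at each end of a spectral window, with a minimum of 1 channel), the sketch below lists the channel indices that would be flagged for a given number of channels. The function name is hypothetical, and rounding the 5% down before applying the minimum is an assumption; the pipeline may round differently.

```python
def edge_channels(nchan, fraction=0.05):
    """Return the channel indices flagged at both edges of one spw."""
    # Assumption: the fractional count is rounded down, then the
    # 1-channel minimum is applied.
    n_edge = max(1, int(nchan * fraction))
    lower = list(range(n_edge))
    upper = list(range(nchan - n_edge, nchan))
    return lower + upper


for nchan in (8, 64, 128):
    print(nchan, "channels:", edge_channels(nchan))
```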

A flagging template can also be provided to the pipeline which applies known flags to the data (see The VLA Pipeline Webpage).

In our example (Fig. 12), the target sources start with 3.12% flagged data. The deterministic flagging stage adds 6.05% for antennas not on source, 0.82% for other online flags (e.g., subreflector rotations or translations), 6.4% for edge channels, 0.09% for clipping of absolute zero values, 0% for quacking (removal of bad first integrations in scans), and 1.4% for baseband clipping. This combines to a total of 8.71% of flagged data for the scientific targets. Other sources are also listed and the entire MS is flagged at the 8.84% level. No flagging template was applied.

Fig. 12: The Stage 3 hifv_flagdata task page.
CHECK for: the percentage of the flags. If a very large portion (or even all) of the visibilities of the calibrators are flagged, try to find out the reason. Also have a quick look at the graph of the online flags to understand whether the system behaved normally or if there was an unusually high failure of some kind.
QA: Determined by the percentage of incremental flagging: scores between 0 and 1 correspond to incrementally flagged data fractions between 60% and 5%, respectively.
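
A minimal sketch of such a score, assuming a linear mapping between the two limits (the guide only gives the end points, so the linear shape and the function name are assumptions):

```python
def flagging_score(flagged_fraction, low=0.05, high=0.60):
    """QA score from the fraction (0..1) of incrementally flagged data."""
    if flagged_fraction <= low:
        return 1.0
    if flagged_fraction >= high:
        return 0.0
    # Assumption: linear interpolation between the two limits.
    return (high - flagged_fraction) / (high - low)


for fraction in (0.03, 0.325, 0.70):
    print(fraction, flagging_score(fraction))
```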

Stage 4. hifv_vlasetjy: Set calibrator model visibilities

Stage 4 calculates and sets the calibrator spectral and spatial model for the standard VLA flux density calibrators (3C48, 3C138, 3C147, or 3C286 with a CALIBRATE_FLUX scan intent). The task page (Fig. 13a) lists the calculated flux densities for each spectral window (spw). It also contains plots of the amplitude versus uv-distance for the models per spw that are calculated and used to specify the flux density calibrator characteristics (Fig. 13b). If the scan intent CALIBRATE_FLUX is absent or the calibrator is not a standard VLA flux density calibrator, the absolute flux density scale will be arbitrary.

Fig. 13a: The Stage 4 hifv_vlasetjy task page.
Fig. 13b: The Stage 4 hifv_vlasetjy task page, calibrator models.
CHECK for: any unexpected flux densities or model shapes.
QA: If the flux calibrator is not one of the VLA flux standards (3C48, 3C138, 3C147, 3C286), the score will be 0.5. 

Stage 5. hifv_priorcals: Priorcals (gaincurves, opacities, antenna positions corrections and rq gains)

Next, the prior calibration tables are derived. They include gain-elevation dependencies, atmospheric opacity corrections, antenna offset corrections, and requantizer (rq) gains. They are independent of the calibrator observations themselves and can be derived from ancillary data such as antenna offset tables, weather data, antenna elevation, and switched power measurements.

Opacities are calculated per spw and plotted together with additional information on the weather conditions during the observation (Fig. 14a).

Fig. 14a: The Stage 5 hifv_priorcals task page.
Fig. 14b: The Stage 5 hifv_priorcals task page, continued.

The antenna positions are usually updated within a few days after an antenna was repositioned during the cycle, and for our case corrections (on the order of a few millimeters) for four antennas are applied.

CHECK for: extreme or unrealistic opacities. Also check that the antenna offsets are within a reasonable range (reasonable values are usually less than +/- 0.0200 meters). There should only be updates for a few antennas.
QA: N/A; but a warning will be issued when more than 50% of antennas need position corrections, or when the weather station data are absent for observations at K-band frequencies and above.
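
The two antenna position checks can be sketched as follows. The function name, the data layout, and the numbers in the example are made up for illustration (they are not taken from the example dataset); only the 0.02 m limit and the 50% fraction come from the text above.

```python
def check_position_corrections(corrections_m, n_antennas,
                               max_offset_m=0.02, max_fraction=0.5):
    """corrections_m maps antenna name -> (dx, dy, dz) offsets in meters."""
    # Antennas with at least one offset component outside +/- max_offset_m
    suspicious = sorted(
        name for name, offsets in corrections_m.items()
        if any(abs(value) > max_offset_m for value in offsets)
    )
    # True if more than max_fraction of all antennas received corrections
    too_many = len(corrections_m) > max_fraction * n_antennas
    return suspicious, too_many


# Made-up corrections for illustration only
example = {
    "ea05": (0.0012, -0.0008, 0.0003),
    "ea12": (0.0000, 0.0021, -0.0015),
    "ea19": (0.0350, 0.0002, 0.0000),
}
suspicious, too_many = check_position_corrections(example, n_antennas=27)
print(suspicious, too_many)
```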

Stage 6. hifv_testBPdcals: Initial test calibrations

Now it is time to determine the delays and the bandpass solutions (gain and phase) for the first time. Applying the initial solution will make it easier to identify RFI that needs to be flagged. There will be a couple of similar iterations for the calibration tables in the following pipeline stages to eventually obtain the final set of calibration tables.

A warning is shown for antenna ea21 to point out that spectral windows 50 to 65 were completely flagged.


The plot on the main page (Fig. 15) shows the bandpass calibrator with the initial bandpass solutions applied. There are links to other plots showing delay, gain amplitude, gain phase, bandpass amplitude, and bandpass phase solutions for each antenna. Note that the pipeline will typically switch reference antennas so phase solutions of reference antennas may not be perfectly zero and show some steps (an example will be shown later). When delays are more than +/-10ns it will be worth examining the data more closely. Some additional flagging may be needed.

Fig. 15: The Stage 6 hifv_testBPdcals task page.

The gain amplitude and phase solutions are derived per integration and they are used to correct for decorrelation before any spectral bandpass solutions are calculated. The latter are determined over a full solution interval, usually for all bandpass scans together. Bandpasses should be smooth although they can vary substantially over wide frequency bands. The bandpass (BP) phase solutions are derived after systematic slopes were accounted for by the delay solutions.


Example delays are shown in Fig. 16: The delays for ea16 vary but are within a narrow range of only a few ns. These are good solutions. The delays for ea21 are fine except for the 33-35GHz frequency range where many solutions failed or scatter substantially. The respective frequency range/spectral window (spw) should be flagged manually (best through a flagging template) if the following pipeline steps will not take care of it. For ea22 the delays in the 35-37GHz range are excessive with a value of about -68ns. It is likely that the pipeline will be able to calibrate these values correctly but one may need to flag the respective spws if not.

Fig. 16a: Delays for ea16.
Fig. 16b: Delays for ea21.
Fig. 16c: Delays for ea22.


In Fig. 17, we show some gain amplitude plot examples. Antenna ea03 shows credible solutions (the colors represent different spectral windows and polarizations; an amplitude spread as a function of frequency is expected given the spectral index of the source), whereas ea04 has elevated values until 8:06. Those should be flagged (but the pipeline may be able to detect and flag them in one of the subsequent stages). Some of the baselines in ea18 show low values in the 2-3Jy range, but they are constant in time. At this stage one can assume that they reflect the correct calibration values. It might still be worth making a note and checking whether the calibration was applied correctly downstream. The situation is different for ea25, which shows an extreme decrease of amplitude as a function of time (ea18 also shows that in the last few integrations). This is likely a mechanical error of the antenna. This antenna should be inspected carefully, as there could be a problem that makes it unusable. Although the bandpass solutions seem to be ok, the bandpass and flux density calibrators coincide and it is likely that the absolute flux density calibration is very unreliable for this antenna.

Fig. 17a: Gain Amplitude for ea03.
Fig. 17b: Gain Amplitude for ea04.
Fig. 17c: Gain Amplitude for ea18.
Fig. 17d: Gain Amplitude for ea25.


Since the gain amp/phase steps per integration are only performed to reduce decorrelation, the phase plots are the most important diagnostics in this context. In Fig. 18 we show a few solutions. All phases for the reference antenna ea09 are by definition zero. The phase variations as a function of time increase for higher frequencies and longer baselines. Therefore, both ea03 and ea21 have good solutions, given that ea03 is closer to the reference ea09 than ea21 (cf. the Antenna Setup on the Overview page). There are no jumps in the phases. Remember that -180 and +180 are identical phase values and lines connecting those values are only a plotting issue, not the actual phase behavior.

Fig. 18a: Gain Phase for ea09.
Fig. 18b: Gain Phase for ea03.
Fig. 18c: Gain Phase for ea21.
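
Because -180 and +180 degrees are the same phase, an apparent jump in a plot has to be judged on the wrapped phase difference. The sketch below shows one way to do this; the function names and the 60 degree threshold are arbitrary choices for illustration, not pipeline values.

```python
def wrapped_phase_difference(phase1_deg, phase2_deg):
    """Smallest signed difference phase2 - phase1, in the range [-180, 180)."""
    return (phase2_deg - phase1_deg + 180.0) % 360.0 - 180.0


def find_phase_jumps(phases_deg, threshold_deg=60.0):
    """Indices where the phase changes by more than threshold_deg."""
    jumps = []
    for i in range(1, len(phases_deg)):
        step = wrapped_phase_difference(phases_deg[i - 1], phases_deg[i])
        if abs(step) > threshold_deg:
            jumps.append(i)
    return jumps


# The step from 178 to -176 is only a wrap (8 degrees), not a jump;
# the step from -170 to -60 is a real change of 110 degrees.
phases = [170.0, 178.0, -176.0, -170.0, -60.0]
print(find_phase_jumps(phases))
```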


Now let's have a look at the bandpasses themselves (Fig. 19). Antenna ea17 shows very good bandpass solutions. Since the spectral windows (spws) are small compared to the entire frequency range, the edges of each spw dominate the variations. The 37-39GHz range of ea18 varies considerably more. In fact, this antenna, polarization, and baseband shows a deformatter timing error, which needs to be flagged. Some flagging was already performed for the 33-35GHz range of ea21; this is mentioned by the warning at the top of the page (note that the warning shows antenna ids, not antenna names; the mapping is shown on the Antenna Setup page, and here antenna 20 is the id for ea21). This range corresponds to the failed and noisy delays that we saw earlier in Fig. 16b. Antenna ea24 shows a few high values. They usually are fine as they also correspond to the edges of the spws. In particular, if an spw edge coincides with a baseband edge, such spikes are usually more pronounced. Keep an eye on those although they are likely not a problem for the calibration. Finally, we show the bandpass of ea25, the antenna with the likely mechanical error. Although the Gain Amplitude showed decreasing values as a function of time (Fig. 17d), the bandpass itself does not look suspicious and can likely be used, based on this plot. The mechanical error, however, may also be present for other scans and since we identified it first for a flux density calibrator scan, that antenna should be flagged.

Fig. 19a: BP Gain for ea17.
Fig. 19b: BP Gain for ea18.
Fig. 19c: BP Gain for ea21.
Fig. 19d: BP Gain for ea24.
Fig. 19e: BP Gain for ea25.

The bandpass (BP) phases as a function of frequency/channel are shown in Fig. 20. Again the reference antenna ea09 only shows zero phases by definition. Antenna ea11 is an example of proper phase solutions across the bandpass. Note again that edges of the spectral windows are showing the largest deviations. Some variations are larger than others, but they are all in a similar range. We already saw the large scatter in the bandpass amplitude of ea18 at 37-39GHz due to a signal path (bad deformatter) problem and the pattern is apparent in the phases. Finally, we show ea24 again and find that the edge spike in the amplitudes is also seen in the phases. At this level, the solution should be usable.

Fig. 20a: BP Phase for ea09.
Fig. 20b: BP Phase for ea11.
Fig. 20c: BP Phase for ea18.
Fig. 20d: BP Phase for ea24.
CHECK for: strong RFI and whether it was eliminated in later flagging stages or not (especially via a comparison with the output plots of stage 14). Also check for jumps in phase and/or amplitude away from spectral window edges. If there are phase jumps for all but the reference antenna, maybe a different choice for the reference antenna should be considered. Also watch out for extreme delays of tens of ns and for very noisy data.
QA: checks are performed for the presence of delay and bandpass solutions for all science spws and antennas. For the fraction of failed bandpass solutions, scores between 0 and 1 correspond to failed fractions between 60% and 5%, respectively. The score is furthermore reduced by 0.1 for every antenna where delays exceed 200ns.
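
The delay-related checks can be sketched as below. The +/-10ns inspection limit and the 200ns penalty limit are taken from the text; the function name and data layout are hypothetical, and apart from the roughly -68ns value quoted for ea22 the delay values are made up.

```python
def delay_check(delays_ns, inspect_ns=10.0, penalty_ns=200.0):
    """delays_ns maps antenna name -> list of delay solutions in ns."""
    # Largest absolute delay per antenna
    worst = {ant: max(abs(d) for d in values)
             for ant, values in delays_ns.items()}
    to_inspect = sorted(ant for ant, d in worst.items() if d > inspect_ns)
    n_penalized = sum(1 for d in worst.values() if d > penalty_ns)
    # Score reduction of 0.1 per antenna above the penalty limit
    return to_inspect, 0.1 * n_penalized


delays = {
    "ea16": [1.5, -2.0, 0.8],   # a few ns (illustrative values)
    "ea22": [-68.0, 2.1, 1.0],  # about -68 ns in one frequency range
}
to_inspect, score_reduction = delay_check(delays)
print(to_inspect, score_reduction)
```

With these numbers ea22 is listed for closer inspection, but no score reduction is applied because no delay exceeds 200ns.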

Stage 7. hifv_flagbaddef: Flag bad deformatters

The digital transmission system (DTS) of each VLA antenna includes a formatting stage to convert the electronic to an optical signal before it is injected on the optical fiber link. On the correlator end the signal will be deformatted back to an electronic signal. Occasionally, the timing on the deformatter can be misaligned which results in very strong amplitude or phase slopes as a function of frequency. Sometimes the signal is similar to an abs(sin), or a 'bouncing' signal across a baseband for one polarization. The hifv_flagbaddef pipeline stage tries to identify such deformatter errors by checking for deviations more than 15% over the average bandpass. If more than 4 spws of a baseband are affected this way, the entire baseband will be flagged.


For our data, no deformatter issues were automatically detected. We did see, however, that ea18 has a DTS problem in the 37-39GHz baseband (Figs. 19b/22a and 20c). Since stage 7 did not detect and flag this range (which shows the limitations of the underlying code), manual flagging will be required for the affected antenna, polarization, and baseband for all sources. An example from a different dataset is provided in Fig. 22b. The 'V' shape close to 5.3 GHz, with some values reaching close to zero, is a sign of a deformatter problem.

Fig. 21: The Stage 7 hifv_flagbaddef task page.
Fig. 22a: Same as Fig. 19b: ea18, which shows a digital transmission issue that hifv_flagbaddef was not able to identify.
Fig. 22b: An example for a bad deformatter from a different dataset.
CHECK for: amplitude 'bounces', i.e. very strong variations of amplitude and/or phase well above the average of the other polarizations or basebands. The pattern can repeat a few times across a baseband but should be contained to a single baseband, antenna and polarization. Data for all sources in the spectral windows in a faulty baseband are affected. 
QA: linearly scale score from 1 to 0 for fraction of affected antennas between 0% and 30%. 
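
The decision rule described for this stage (deviations of more than 15% from the average bandpass, and flagging of a whole baseband once more than 4 of its spws are affected) can be sketched in a strongly simplified form. This is not the hifv_flagbaddef code: the function names and the data layout are hypothetical, one representative amplitude per spw is assumed, and the median is used as the "average" so that the bad spws do not bias the reference level. The second function follows the QA rule above.

```python
from statistics import median


def bad_basebands(spw_amplitudes, spw_to_baseband,
                  max_deviation=0.15, max_bad_spws=4):
    """Basebands to flag for one antenna and polarization.

    spw_amplitudes maps spw id -> representative bandpass amplitude.
    spw_to_baseband maps spw id -> baseband name.
    """
    # Assumption: the median serves as the robust average bandpass level.
    reference = median(spw_amplitudes.values())
    bad_count = {}
    for spw, amp in spw_amplitudes.items():
        if abs(amp - reference) / reference > max_deviation:
            baseband = spw_to_baseband[spw]
            bad_count[baseband] = bad_count.get(baseband, 0) + 1
    return sorted(bb for bb, n in bad_count.items() if n > max_bad_spws)


def flagbaddef_score(fraction_affected_antennas):
    """Linear QA score: 1 at 0% affected antennas, 0 at 30% or more."""
    return max(0.0, 1.0 - fraction_affected_antennas / 0.30)


# Toy setup: 16 spws in two basebands, five spws of baseband "B" are bad
amplitudes = {spw: 1.0 for spw in range(16)}
for spw in (8, 9, 10, 11, 12):
    amplitudes[spw] = 0.3
baseband_of = {spw: ("A" if spw < 8 else "B") for spw in range(16)}
print(bad_basebands(amplitudes, baseband_of), flagbaddef_score(0.15))
```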

Stage 8. hifv_checkflag: Flag possible RFI on BP calibrator using rflag

Rflag, as part of flagdata, is a threshold-based automatic flagging algorithm in CASA. In this step, rflag is run on the bandpass calibrator to remove relatively bright RFI and to obtain improved bandpass calibration tables later on.

CHECK for: nothing in particular on this page, but some cumbersome RFI may have been eliminated in preparation for the following steps.
QA: determined by the percentage of incremental flagging: scores between 0 and 1 correspond to incrementally flagged data fractions between 60% and 5%, respectively.

Stage 9. hifv_semiFinalBPdcals: Semi-final delay and bandpass calibrations

Now that some RFI has been flagged, stage 6 is repeated here, which results in better bandpass and delay solutions. The warning is the same as in stage 6.

CHECK for: strong RFI and whether it was eliminated in later flagging stages or not (especially via a comparison with the output plots of stage 14). Also check for jumps in phase and/or amplitude away from spectral window edges. If there are phase jumps for all but the reference antenna, maybe a different choice for the reference antenna should be considered. Also watch out for extreme delays of tens of ns and for very noisy data.
QA: checks are performed for the presence of delay and bandpass solutions for all science spws and antennas. For the fraction of failed bandpass solutions, scores between 0 and 1 correspond to failed fractions between 60% and 5%, respectively. The score is furthermore reduced by 0.1 for every antenna where delays exceed 200ns.

Stage 10. hifv_checkflag: Flag possible RFI on BP calibrator using rflag

Once more, rflag is executed. After the bright RFI has been removed in stage 8 and a new bandpass solution has been applied in stage 9, a new flagging threshold will account for weaker RFI, which will be removed in this stage 10.

CHECK for: RFI that disappears in the following steps.
QA: determined by the percentage of incremental flagging: scores between 0 and 1 correspond to incrementally flagged data fractions between 60% and 5%, respectively.
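
Why a second flagging pass finds weaker RFI can be shown with a generic threshold flagger. This is not the rflag algorithm itself, only a sketch of the idea that the threshold is derived from the unflagged data, so removing bright outliers lowers the threshold for the next pass. The function name, the 3-sigma limit, and the toy amplitudes are all made up.

```python
from statistics import mean, pstdev


def threshold_flag(values, flags, nsigma=3.0):
    """Flag values deviating more than nsigma * sigma from the mean.

    Mean and sigma are computed from the currently unflagged values only.
    """
    good = [v for v, f in zip(values, flags) if not f]
    if len(good) < 2:
        return list(flags)
    center = mean(good)
    sigma = pstdev(good)
    return [f or abs(v - center) > nsigma * sigma
            for v, f in zip(values, flags)]


# Toy data: 18 clean points, one bright and one weak RFI spike
amplitudes = [1.0] * 18 + [50.0, 3.0]
pass1 = threshold_flag(amplitudes, [False] * len(amplitudes))
pass2 = threshold_flag(amplitudes, pass1)
print(pass1.count(True), pass2.count(True))
```

In the first pass the bright spike inflates the scatter so much that only the spike itself is flagged. Once it is excluded, the threshold drops and the weaker spike is caught in the second pass.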

Stage 11. hifv_semiFinalBPdcals: Semi-final delay and bandpass calibrations

Again, having removed more RFI, new delay and bandpass solutions are obtained here. The warning from stage 6 reappears.

CHECK for: strong RFI and whether it was eliminated in later flagging stages or not (especially via a comparison with the output plots of stage 14). Also check for jumps in phase and/or amplitude away from spectral window edges. If there are phase jumps for all but the reference antenna, maybe a different choice for the reference antenna should be considered. Also watch out for extreme delays of tens of ns and for very noisy data. 
QA: checks are performed for the presence of delay and bandpass solutions for all science spws and antennas. For the fraction of failed bandpass solutions, scores between 0 and 1 correspond to failed fractions between 60% and 5%, respectively. The score is furthermore reduced by 0.1 for every antenna where delays exceed 200ns.

Stage 12. hifv_solint: Determine solint and Test gain calibrations

For the final calibration, the pipeline determines the shortest and longest applicable solution interval (solint). Typically they refer to the (longest) visibility integration time and the length of the longest gain calibration scan, respectively.

In our case (Fig. 23) the longest time per integration is 3 seconds, which therefore also corresponds to the shortest solution interval. The longest solution interval is based on the longest phase calibrator scan, which lasts for ~85s. After subtracting the slew time and allowing for 'quack' flagging, the longest solution interval comes to ~76s.

Fig. 23: The Stage 12 hifv_solint task page.

Temporal gain and phase solutions are calculated for each antenna, spectral window, and polarization using these time intervals. In Fig. 24 we show some examples for the gains. Antenna ea03 shows consistent gain solutions with small variations over the time of the observations. Note that the last scan is the flux density calibrator and thus a different source with a different gain amplitude. Antenna ea04 shows increased values for the last few calibrator scans that may need to be flagged. This could be due to a bad pointing solution (scan 54 is a pointing, cf. the listobs output). Antenna ea25 has likely a pointing error that deteriorates over the first half of the observations. The listobs output tells us that a pointing update was obtained around 6:40UT at which point ea25 indeed recovered and shows good solutions.

Unfortunately, the plotting algorithm produces a somewhat convoluted plot with lines that criss-cross. The algorithm always connects only points that have the same spw, antenna, antenna2, and correlation id, with interruptions for flagged points. This can result in connectors like those shown between the short integration intervals.

Fig. 24a: Gain versus Time for ea03.
Fig. 24b: Gain versus Time for ea04.
Fig. 24c: Gain versus Time for ea25.


Although the phase solution plots are very crowded (Fig. 25), we can see that ea03 has very steady values over time. The pipeline will apply phase corrections that are determined from this solution so that, later on, additional phase solutions will be close to zero. Antenna ea04 shows larger variations. Antenna ea09 is the initial phase reference antenna. The underlying gaincal command, however, was given a few possible reference antennas, ea09, ea14, ea13, ea03, in case a single reference is not usable for all times and spectral windows. Check the CASA log for stage 12 at the bottom for the actual command. Indeed, gaincal decided to choose different reference antennas for the solutions, as the CASA log reports. To keep the phase interpolation consistent, ea09 phases have to absorb the offsets introduced by the alternate reference antennas. This explains the plot that we see here, i.e., not a constant zero for all spectral window phases.

Fig. 25a: Phase versus Time for ea03.
Fig. 25b: Phase versus Time for ea04.
Fig. 25c: Phase versus Time for ea09.
CHECK for: consistency with the data. The shortest solution interval should be close to the (longest) visibility integration time and the longest solution interval should be close to the length of the longest gain calibration scan. Gains should be smooth with little variation in time (where larger gain variations are more likely to occur for higher frequencies), phases should not show any jumps and should be relatively smooth in time (where larger phase variations are likely to occur for higher frequencies and longer baselines).
QA: N/A; but a warning will be issued when the long and short solint values are the same +/- one integration.
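As a minimal sketch of the warning condition, assuming "the same +/- one integration" means that the two solution intervals differ by no more than one integration time (the inclusive comparison and the function name are our assumptions, not taken from the pipeline code):

```python
def solints_too_similar(short_solint_s, long_solint_s, integration_s):
    """Return True when the short and long solints agree to within one
    integration time, i.e. the case in which a warning would be expected."""
    return abs(long_solint_s - short_solint_s) <= integration_s


# Values from this example: 3 s integrations, ~76 s long solint.
example_warns = solints_too_similar(3.0, 76.0, 3.0)
```

For the values in our example the two intervals are far apart, so no warning is expected.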

Stage 12. hifv_fluxboot2: Gain table for flux density bootstrapping

Now, the fluxes are bootstrapped from the flux density calibrator to the complex gain (amplitude and phase) calibrator. To do so, polynomial functions are fitted for the secondary calibrators and the absolute flux densities are determined for each channel. They are then inserted in the MODEL column via setjy and reported for each spectral window.

For our example, the pipeline derives flux densities between 0.65 and 0.70 Jy for the phase calibrator, depending on frequency. The value at the reference frequency of 34.9GHz is 0.6834+/-0.0007Jy (note that the error is only for the fit and not an overall error including systematics). The algorithm decided to fit a first order polynomial and derives a spectral index of -0.32 (Fig. 26); a negative spectral index corresponds to a decline in flux density with increasing frequency. The residuals of the fit and the logarithmic fit to the data points themselves are shown in the last two plots at the top. The resulting flux gaintable ('fluxgaincal.g') is shown in the first plot and the second one contains the flux density calibrator model together with the bootstrapped, fitted gain calibrator models. If the plots show RFI or badly calibrated data, it is possible to edit the 'fluxgaincal.g' file using, e.g., plotms. The VLA pipeline webpage has instructions on how to insert an edited table into the calibration run.
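To make the role of the spectral index concrete, the following sketch evaluates a first-order (power-law) model, S(nu) = S_ref * (nu/nu_ref)^alpha, and recovers S_ref and alpha from a straight-line least-squares fit in log-log space. We assume here that the first order polynomial is fitted to log10(S) against log10(nu/nu_ref); the function names are hypothetical and the code is not taken from the pipeline.

```python
import math


def flux_density(freq_ghz, s_ref_jy, ref_freq_ghz, spectral_index):
    """Power-law flux density model S = S_ref * (nu / nu_ref) ** alpha."""
    return s_ref_jy * (freq_ghz / ref_freq_ghz) ** spectral_index


def fit_power_law(freqs_ghz, fluxes_jy, ref_freq_ghz):
    """Straight-line least-squares fit in log-log space.

    Returns (s_ref_jy, spectral_index).
    """
    x = [math.log10(f / ref_freq_ghz) for f in freqs_ghz]
    y = [math.log10(s) for s in fluxes_jy]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    intercept = mean_y - alpha * mean_x
    return 10.0 ** intercept, alpha


# Reference values quoted in the text above.
S_REF_JY, REF_FREQ_GHZ, ALPHA = 0.6834, 34.9, -0.32
```

With a negative spectral index such as the -0.32 found here, flux_density returns larger values below the reference frequency and smaller values above it.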

Our example has a small degradation in the maximum rms of the residuals and reports this through an information '?' icon.

Fig. 26: The Stage 13 hifv_fluxboot2 task page.
CHECK for: values that are close to the known fluxes of the calibrator. Check the VLA calibrator manual at https://science.nrao.edu/facilities/vla/observing/callist for consistency. Since most calibrator sources are time variable AGN, some differences from the VLA catalog are expected. In particular at higher frequencies they could be up to tens of percent.
QA: based on the S/N and maximum residual of the fit. A fraction of 0.01 is deducted from a max score of 1.0 for each residual that is more than 1 sigma away from the mean. This value is calculated per source and normalized over all sources. 
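The deduction rule can be sketched as follows. This covers only the residual-based deduction, not the S/N part of the QA. We assume that "sigma" is the standard deviation of the residuals of one source, that the score cannot drop below 0, and that "normalized over all sources" means averaging the per-source scores; all three are our assumptions for illustration.

```python
import statistics


def source_score(residuals):
    """Start at 1.0 and deduct 0.01 for each residual that lies more than
    one standard deviation away from the mean residual."""
    mean = statistics.fmean(residuals)
    sigma = statistics.pstdev(residuals)
    n_outliers = sum(1 for r in residuals if abs(r - mean) > sigma)
    return max(0.0, 1.0 - 0.01 * n_outliers)


def fluxboot_score(residuals_by_source):
    """Average the per-source scores (assumed meaning of 'normalized')."""
    scores = [source_score(r) for r in residuals_by_source.values()]
    return sum(scores) / len(scores)
```

Under this reading a few outlying residuals only lower the score slightly, which is consistent with the small degradation and informational '?' icon seen in our example.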

Stage 13. hifv_finalcals: Final Calibration Tables

The final calibration tables are now derived (with the same warning as in previous BPdcals steps). Those are the most important ones given that they are actually applied to the data in stage 15. The tables, which contain antenna based solutions, are: Final delay, bandpass (BP) initial gain phase, BP Amp solution, BP Phase solution, Phase (short) gain solution, Final amp time cal, Final amp freq cal, and Final phase gain cal. We have already inspected and discussed similar solutions for the bandpass and for the temporal gain/phase calibration earlier. We shall now investigate further, starting with the temporal gains.

The gains vary significantly for this observation. Typically, the gains stay within 10% around a normalized value of 1. Here, a few spectral windows (spws) show substantial deviations. Examples are (Fig. 27): Antenna ea02 has a drop around 5:50UT and should be checked (also ea01, which is not shown). Maybe the entire time between the adjacent, good calibrator scans should be flagged for this antenna. Antenna ea04 has an inverse behavior later, around 8:00UT. It appears that only a subset, e.g., a baseband, deviates from the rest. Antenna ea07 is smoother, with some variations between the individual spws but overall a consistent temporal behavior. Likely, this solution can be used with no further flagging. Note that the last scan is the flux density calibrator. This scan is expected to have different gains from those of the complex gain calibrator. Next, note that the ea09 gains are almost unity, which is expected. The gains in ea18 are smooth with a large dip in the first half. This, in fact, does calibrate some characteristics of the observations and can be left for the moment. As mentioned before, around 6:40UT, a pointing update was performed which seems to have rectified a possibly mis-pointed ea18. Antenna ea23 requires a single spectral window at a single time to be investigated and probably flagged. The mechanical error that we have identified for the bandpass/flux calibrator scan using ea25 has affected the phase solutions. That explains the amplitude spread of the spws. In addition, this antenna also has a pointing error for the first half of the observation. We again recommend flagging the entire antenna.

Fig. 27a: Temporal Gains for ea02.
Fig. 27b: Temporal Gains for ea04.
Fig. 27c: Temporal Gains for ea07.
Fig. 27d: Temporal Gains for ea09.
Fig. 27e: Temporal Gains for ea18.
Fig. 27f: Temporal Gains for ea23.
Fig. 27g: Temporal Gains for ea25.
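Flagging recommendations such as those above can be written down as flag command strings in the list-mode syntax of the CASA task flagdata. The helper below is a hypothetical convenience function that only assembles such strings; the time window for ea02 is a placeholder around 5:50UT and must be replaced by the actual scan boundaries from the listobs output.

```python
def manual_flag_command(antenna, timerange=None, spw=None,
                        reason="weblog_inspection"):
    """Build one flag command string (mode='manual') for a single antenna."""
    parts = ["mode='manual'", f"antenna='{antenna}'"]
    if spw is not None:
        parts.append(f"spw='{spw}'")
    if timerange is not None:
        parts.append(f"timerange='{timerange}'")
    parts.append(f"reason='{reason}'")
    return " ".join(parts)


commands = [
    # Entire antenna, as recommended for ea25.
    manual_flag_command("ea25", reason="mechanical_and_pointing"),
    # Placeholder time window (hypothetical); check listobs for real values.
    manual_flag_command("ea02", timerange="05:45:00~05:55:00",
                        reason="gain_drop"),
]
```

Strings like these can be collected in a text file, one command per line, and applied with flagdata in mode='list'. The VLA pipeline webpage describes how additional flags are included in a pipeline run.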


Now let's have a look at the gains as a function of frequency (Fig. 28). For ea02 we see that one line is below the rest. This is likely one specific time interval and indeed we have seen such a slip in Fig. 27a. Antenna ea04 has a very noisy time interval, which is also in agreement with what we have seen in the previous temporal gain plot. Antenna ea08 shows a consistent calibration and ea20 repeats the extra noise in the 34-35GHz range that may need to be flagged. Antenna ea25 now reflects the bandpass pattern that we have seen earlier and that explains the spread in Fig. 27.

Fig. 28a: Spectral Gains for ea02.
Fig. 28b: Spectral Gains for ea04.
Fig. 28c: Spectral Gains for ea08.
Fig. 28d: Spectral Gains for ea20.
Fig. 28e: Spectral Gains for ea25.


The phases versus time ("Final phase gain cal") plots are shown in Fig. 29. Antenna ea02 clearly shows very erratic phase variations for one baseband or polarization. This is likely not recoverable. The user may plot the solution in plotcal or plotms, locate the faulty spectral windows or polarizations, and flag them. Antenna ea04, in contrast, exhibits very smooth phase variations until near the end of the observations. This has already been observed in the amplitude gains (Fig. 27b), should be looked at, and likely needs to be flagged. Antenna ea09 is the reference antenna and therefore has phase solutions that are zero as a function of time. Antenna ea13 shows smooth variations and is an example of a credible calibration table. A spread between basebands or polarizations can be seen for ea15. The behavior is nevertheless smooth and the data should be calibrated nicely with this table. Antenna ea17, however, has, in addition to different behaviors for the basebands, also relatively large and erratic jumps between the calibration scans. This clearly needs to be looked into further and may require flagging, although the antenna did not show any issues in previous plots. Finally, ea20 has a relatively smooth behavior until the pointing update was performed (although the variations are relatively large). After the pointing scan, however, phases vary by about +/-50 degrees between individual, consecutive calibrator scans, which is large enough to be unreliable and to be flagged.

Fig. 29a: Temporal Phases for ea02.
Fig. 29b: Temporal Phases for ea04.
Fig. 29c: Temporal Phases for ea09.
Fig. 29d: Temporal Phases for ea13.
Fig. 29e: Temporal Phases for ea15.
Fig. 29f: Temporal Phases for ea17.
Fig. 29g: Temporal Phases for ea20.
CHECK for: strong RFI and whether it was eliminated in later flagging stages or not (especially via a comparison with the output plots of task 14). Also check for jumps in phase and/or amplitude away from spectral window edges. If there are phase jumps for all but the reference antenna, maybe a different choice for the reference antenna should be considered. Also watch out for extreme delays of tens of ns and for very noisy data. 

Note that carefully checking calibrator tables in this stage is of particular importance as they are the final tables that are applied to the target source. Phase (and gain) calibration solutions should be inspected in their temporal variations to be smooth and consistent for each calibrator.
QA: checks are performed for the presence of delay and bandpass solutions for all science spws and antennas. The score ranges from 0 to 1 as the fraction of failed bandpass solutions goes from 60% down to 5%. The score is further reduced by 0.1 for every antenna with delays exceeding 200 ns.
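A possible reading of this QA rule, sketched in Python: the score falls from 1 at a failed-solution fraction of 5% to 0 at 60%, and 0.1 is subtracted per antenna with large delays. The linear interpolation between the two limits, the clamping to the range 0 to 1, and the function names are assumptions made for this illustration.

```python
def linear_score(value, zero_at, one_at):
    """Map value linearly onto [0, 1]: 0 at zero_at, 1 at one_at (clamped)."""
    fraction = (value - zero_at) / (one_at - zero_at)
    return min(1.0, max(0.0, fraction))


def finalcals_score(failed_fraction, n_antennas_large_delay):
    """Bandpass-solution score minus 0.1 per antenna with delays > 200 ns."""
    base = linear_score(failed_fraction, zero_at=0.60, one_at=0.05)
    return max(0.0, base - 0.1 * n_antennas_large_delay)
```

For example, with no failed solutions but two antennas exceeding the delay limit, this sketch yields a score of about 0.8.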

Stage 14. hifv_applycals: Apply calibrations from context

The calibration itself now concludes with the application of the derived calibration tables to the entire dataset. That includes all calibrators as well as the target sources. Note that there is no system temperature weighting of the calibration tables for the VLA (and the pipeline sets calwt=False) since the switched power calibration is currently not used.

In Fig. 30, we show the results of this step. The first table lists the calibration tables that are applied, and the fields, spectral windows, and antennas that are calibrated (although note that spws 0 and 1 are only used for pointing scans and are not calibrated, despite being listed here). The table also shows the field and spw mapping that was used as well as the interpolation mode (see applycal for the interpretation). For convenience, the final, actually applied calibration table names are available through the links in the last column. The second table provides information on the flagging statistics. Failed calibration solutions result in flagged calibrator table entries and eventually the data will also be flagged as no calibration can be derived for such data.

Fig. 30: The Stage 15 hifv_applycals task page.


CHECK for: reasonable flagging statistics. If the flagging increased dramatically, some calibration tables should be examined for proper solutions.
QA: determined by the percentage of incremental flagging; the score ranges from 0 to 1 as the fraction of newly flagged data goes from 60% down to 5%.

Stage 15. hifv_targetflag: Targetflag

After the calibration tables are applied, the flagdata automated flagging routine rflag is run one more time on all sources to remove RFI and other outliers from the data.

CHECK for: RFI removal in the target data (use plotms). Although flagging is performed for all fields, the calibration is applied in a previous stage and any additional flags no longer influence the calibration tables. Flagging may, however, improve all images that are made in the following stages. In particular, the target fields are flagged here for the first time, which will benefit their image quality. FOR SPECTRAL LINE DATA: do not run this step unless a cont.dat file is provided (cf. the VLA Pipeline Webpage at http://go.nrao.edu/vla-pipe). Otherwise the spectral lines may be flagged, too.
QA: determined by the percentage of incremental flagging; the score ranges from 0 to 1 as the fraction of newly flagged data goes from 60% down to 5%.

Stage 16. hifv_statwt: Reweighting visibilities

Since the VLA pipeline is currently not using the switched power calibration, there can be some sensitivity variations of the data over time due to changes in opacity, elevation, temperature (gradients) of the antennas, etc. So it is usually advisable to weight the data according to the inverse of the square of their noise. This is done via the CASA task statwt and will increase the signal-to-noise ratio of images. Note that features such as RFI spikes and spectral lines will influence RMS calculations and usually result in down-weighting data that includes such features.
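The principle behind this reweighting can be illustrated with a small, self-contained sketch. It is not the statwt implementation; it only shows why inverse-variance weights (w = 1/rms^2) give a lower noise in the combined data than equal weights when the noise varies between samples. The function names are hypothetical.

```python
import math


def inverse_variance_weights(rms_values):
    """Weight each sample by the inverse square of its noise rms."""
    return [1.0 / (rms * rms) for rms in rms_values]


def weighted_mean_noise(rms_values):
    """Noise of the inverse-variance weighted mean: 1 / sqrt(sum of weights)."""
    return 1.0 / math.sqrt(sum(inverse_variance_weights(rms_values)))


def unweighted_mean_noise(rms_values):
    """Noise of the plain (equal-weight) mean of independent samples."""
    return math.sqrt(sum(r * r for r in rms_values)) / len(rms_values)
```

For two samples with rms 1 and 2 (arbitrary units), the weighted mean has a noise of about 0.89 compared to about 1.12 for the equal-weight mean. The same effect explains why data containing RFI spikes or spectral lines, which inflate the measured rms, end up down-weighted.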

CHECK for: there is no obvious diagnostic plot for this step, but images that are created later should improve in their signal-to-noise. FOR SPECTRAL LINE DATA: do not run this step unless a cont.dat file is provided, as spectral lines may otherwise be weighted down (cf. the VLA Pipeline Webpage at http://go.nrao.edu/vla-pipe).
QA: N/A

Stage 17. hifv_plotsummary: VLA Plot Summary

This task produces diagnostic plots of the final, calibrated data. These include the calibrated phase for all targets as a function of time and amplitudes as a function of UVwave (UVdistance) for all science targets (incl. calibrators; Fig. 32a), the amplitudes and phases as a function of frequency for the phase calibrator(s), colored by antenna (Fig. 32b), and amplitude against frequency for all science targets, colored by spw (Fig. 32c).

The phases for all calibrators should be around zero. A large scatter like that at 5:50 UT should be inspected and could indicate that more flagging is required. Indeed, we had already identified scans in this time range that show problems in some antennas (e.g. ea01 and ea02; cf. Fig. 27a). Point sources with a flat spectral index should show up in the UVwave plots as straight, horizontal lines. This can be best seen for the phase calibrator J1041+0610 (field 2), which, however, still shows some scattered points due to the calibration uncertainties that we discussed earlier. Additional flagging should reduce the number of outliers. The flux calibrator J1331+3030 (field 12) shows spectral and spatial structure. Spectral variations increase the vertical width and spatial structure is identified by a deviation from a flat line as a function of UVwave. This is all expected for the VLA standard and the reason why we use a frequency dependent flux and spatial model for those sources in hifv_vlasetjy. Field 11 is also the flux calibrator, but for this run it was labeled with a TARGET intent, so the calibration was bootstrapped from the last gain calibrator, which was far away from the source. (At the time of the observations, TARGET was frequently used for requantizer gain calibrations as the SETUP intent was not yet available. This scan should be entirely flagged.)

The phases and amplitudes as a function of frequency for the phase calibrator show some internal structure and non-Gaussian scatter but overall look fine. The slope is due to a spectral index of the source. Again some of the scatter can be reduced with additional flagging. The zig-zag pattern of the phases is due to a small mismatch in the delay measurement timing (also known as 'delay clunking'). This is an internally generated effect. Typically the effect is averaged out over time.

The amplitudes as a function of frequency for all targets can be used to identify RFI or other problems with the data. Structure is only expected for very strong target sources.


Fig. 32a: The Stage 18 hifv_plotsummary task page; amplitudes as a function of UVwave.
Fig. 32b: The Stage 18 hifv_plotsummary task page; amplitudes and phases against frequency for phase calibrator.
Fig. 32c: The Stage 18 hifv_plotsummary task page; amplitudes against frequency for all science targets.
CHECK for: outliers, jumps, offsets, and excessive noise.
QA: N/A

Stage 18. hif_makeimlist (cals): Compile a list of cleaned images to be calculated

Finally, diagnostic images are made for each receiver frequency band by combining all covered spws for calibrators with a PHASE or BANDPASS intent (note that in our case the bandpass was derived from the FLUX calibrator as no BANDPASS intent was present). The images and basic parameters such as pixel resolution (cell size) and image sizes are listed in this step. The images are available in the directory in which the pipeline was executed (usually where the SDM is located). Images are produced for each receiver frequency band using the multi-frequency synthesis algorithm, i.e. in continuum mode corrected for spectral dependencies using the stretched uv-coverage as sampled by the observed channel frequencies.

The error message is due to the unusual intent labeling and should not occur with newer data.


Fig. 33: The Stage 19 hifv_makeimlist task page.
CHECK for: appropriate cell size for the images.
QA: N/A
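As a rough cross-check of the listed cell size, one can compare it with a common rule of thumb: the synthesized beam is approximately lambda/B_max (in radians), and it should be sampled by a few pixels. The sketch below uses 5 pixels per beam; that factor, the function names, and the baseline length in the usage note are assumptions for illustration and are not the pipeline heuristic.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def synthesized_beam_arcsec(freq_ghz, max_baseline_m):
    """Approximate beam width, lambda / B_max, converted to arcseconds."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / (freq_ghz * 1e9)
    return wavelength_m / max_baseline_m * ARCSEC_PER_RADIAN


def cell_size_arcsec(freq_ghz, max_baseline_m, pixels_per_beam=5):
    """Cell size that samples the beam with pixels_per_beam pixels."""
    return synthesized_beam_arcsec(freq_ghz, max_baseline_m) / pixels_per_beam
```

For instance, at 34.9 GHz and a hypothetical maximum baseline of 1 km, the beam estimate is slightly below 2 arcsec, so a cell size of a few tenths of an arcsecond would be appropriate. A listed cell size that is much larger than this estimate for the actual array configuration would undersample the beam.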

Stage 19. hif_makeimages (cals): Calculate clean products

The images from the previous stage are shown in this final pipeline task.

Imaging parameters are provided for each image (Fig. 34). They contain beam characteristics as well as image statistics.

The warning is only issued to inform the user that deeper cleaning down to a threshold closer to the rms would be possible, so a few sidelobes may not have been fully removed.

Fig. 34: The Stage 20 hifv_makeimages task page.

The full range of clean results can be accessed by the link under the images: "View other QA images...". For the first field, an example is given in Fig. 35. "Other QA images" will include: image, residual, and clean mask in the first row; the dirty image on the second row; and the primary beam, psf, and final model in the third row. A clean mask is shown, too, but it is currently not created for the VLA pipeline.

Fig. 35: All QA images.


CHECK for: degraded images, strong ripples, calibrators that do not resemble the point spread function (psf). Such images may indicate RFI or mis-calibrated sources. If the actual rms is far from the theoretical noise, this could indicate that deeper cleaning is required. But that may not be important for these calibrator images.
QA: a linear score between 0 and 1 is assigned for signal-to-noise ratios between 5 and 100.
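The linear mapping can be written as a one-line function. We assume here that signal-to-noise ratios below 5 or above 100 are clamped to scores of 0 and 1, respectively; the function name is hypothetical.

```python
def snr_score(snr, snr_min=5.0, snr_max=100.0):
    """Linear score: 0 at snr_min, 1 at snr_max, clamped outside that range."""
    fraction = (snr - snr_min) / (snr_max - snr_min)
    return min(1.0, max(0.0, fraction))
```

A calibrator image with a signal-to-noise ratio of 52.5, halfway between the two limits, would thus receive a score of 0.5.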

Stage 20. hif_exportdata: Create data products to be archived.

This stage is only shown for data that have been processed by the NRAO production pipeline and will not show up for manual execution of the pipeline. It provides information on the data products that are available to the archive. Weblogs and calibration files are bundled and zipped in addition to the flags. More information on the files and the restoration process is provided on the VLA pipeline homepage.

Fig. 36: Exportdata screen.


CHECK for: N/A
QA: N/A


Last checked on CASA Version 6.1.2