Pipeline: Frequent VLA problems
The VLA pipeline delivers calibrated data and some initial images of VLA observation runs. The quality of the calibration and imaging products is usually assessed through the weblog that is created in each pipeline run (see also the VLA Pipeline guide). During the observations, the VLA may have encountered technical problems that are reflected in various ways in the weblog, where graphs show the behavior of the calibration tables as a function of time, frequency, polarization, etc., and analytical numbers describe the amount of flagging, derived fluxes, image statistics, etc.
Here we would like to briefly describe common VLA observing problems, how they are identified in the pipeline calibration weblog, and how they can be addressed.
Radio Frequency Interference
By far the biggest problem is radio frequency interference (RFI). RFI is produced by internal and external sources and can be terrestrial or come from satellites that operate at, or spill into, the observed frequency. For the VLA, please find more information on the Radio Frequency Interference webpage. Although weak RFI may only slightly raise the noise of an image with little influence on the calibration tables, stronger RFI will produce artifacts that may render the data (target and calibrators) unusable if not adequately flagged. An example of strong RFI is shown below. Flagging procedures are outlined in the VLA topical CASA guide on flagging.
It is important that all data are free of RFI.
FIX: Weak, intermittent RFI will increase the noise; the affected data are down-weighted for imaging by the hifv_statwt task. Strong RFI needs to be flagged, and only clean data should be calibrated and imaged. Flagging can be manual or automatic.
At higher frequencies the VLA requires regular pointing calibrations. Each pointing run will reposition the antennas to be centered on a strong source with known position. If the pointing solution fails, the source will drift away from the center of the primary beam, where the antenna gain is highest, and its amplitude will drop. A typical graph looks like the one shown below. The pointing solution for the first half of the run failed, which results in the source drifting away from the center of the primary beam. After a pointing update in the middle of the run, the antenna is positioned properly again (the very last data points are actually a different source, hence the drop at the edge).
FIX: If the pointing is only off by a small amount, the gain calibration will take care of it. If it is off by a large amount, the data for this period and antenna need to be flagged.
Various hardware failures can cause the phase for a given antenna to be unstable in time, often with sudden, large changes in phase over time. Depending on where the problem is, this may affect just a portion of the data or up to all data on a given antenna. In the example below, only one baseband's data is affected (large changes at each data point) while the other baseband remains near zero and is not affected. This plot, from the pipeline's Final phase gain cal section found in the 'hifv_finalcals' stage, shows the final phase solutions found for each calibrator (using the long solution interval).
Typical phase variations for low frequency data are a few degrees. For high frequencies, tens of degrees can occur; the cycle time between the phase calibrator and the target then needs to be reduced to adequately track and interpolate the phase variations as a function of time. If the phases vary by more than 360 degrees between two phase calibrator scans, the data are completely decorrelated and cannot be calibrated anymore (even changes larger than 180 degrees leave the interpolation essentially undefined).
FIX: If there are phase jumps, usually the data for the affected time range needs to be flagged for the antenna(s).
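The 180 degree limit mentioned above can be illustrated with a few lines of Python. This is only a sketch of the wrapping arithmetic, not pipeline code; the function names `wrap_deg` and `apparent_steps` and the sample phases are made up for the illustration.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def apparent_steps(true_phases_deg):
    """Phase change between consecutive calibrator scans as seen after wrapping."""
    return [wrap_deg(b - a) for a, b in zip(true_phases_deg, true_phases_deg[1:])]


# Hypothetical calibrator phases: the second step is a true +200 degree change.
true_phases = [0.0, 40.0, 240.0]
steps = apparent_steps(true_phases)
print(steps)  # [40.0, -160.0]
```

The true change of +200 degrees is indistinguishable from -160 degrees, so an interpolation between the two calibrator scans would rotate the target phases in the wrong direction.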
The digital transmission system (DTS) of each VLA antenna includes a formatting stage to convert the electronic signal to optical before it is injected into the optical fiber link. On the correlator end the signal will be deformatted back to an electronic signal. Occasionally the timing on the deformatter can be misaligned, which results in very strong amplitude or phase slopes as a function of frequency. When this occurs, the data are corrupt and the entire affected baseband per polarization of an antenna needs to be flagged. Frequently the error resembles an abs(sin) or a 'bouncing' signal across a baseband for one polarization or, in other terms, a number of 'V' shapes in the data, usually in the middle of a baseband.
Sometimes, however, the pipeline erroneously detects a DTS issue, when the data were in fact only affected by RFI in a few spws. If that happens it is better to flag the data manually, which preserves the rest of the baseband.
FIX: The data for this baseband, antenna and polarization need to be flagged.
Under some circumstances, the WIDAR correlator writes exact zeros. The pipeline will usually flag them automatically. If not, they can be removed with CASA's flagdata task, using mode='clip' with clipzeros=True, or they can be flagged by hand.
The percentage is calculated based on channels. Spectral line spws are therefore more susceptible to a high reported number of zeros.
FIX: The pipeline will usually catch them. If not, use CASA's flagdata task.
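A minimal sketch of the manual zero flagging described above. The helper name `zero_flag_args` and the MS name `my_data.ms` are placeholders; the parameters `mode`, `clipzeros`, `datacolumn` and `flagbackup` are those of CASA's flagdata task. The actual task call is left as a comment so that the snippet can be read and run outside of CASA.

```python
def zero_flag_args(vis, datacolumn="data"):
    """Build keyword arguments for flagdata that flag exact zeros in `vis`."""
    return {
        "vis": vis,                # measurement set to flag (placeholder name below)
        "mode": "clip",            # clip mode, as described in the text
        "clipzeros": True,         # flag visibilities that are exactly zero
        "datacolumn": datacolumn,  # column to inspect; "data" is an assumption
        "flagbackup": True,        # save the current flags before flagging
    }


args = zero_flag_args("my_data.ms")
print(args)

# In a CASA session the flags would then be applied with:
#   from casatasks import flagdata
#   flagdata(**args)
```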
Baseband and Subband Edges
If spw roll-off frequency edges are very steep, they can degrade gain and phase solutions. Frequently this is not a big problem, but if the gain for the edge channels is close to zero, a division by the bandpass for these channels can get extremely noisy. This is particularly true for baseband edges. The edgespw, fracspw, and baseband parameters in hifv_flagdata can be adjusted to flag different percentages of the edges (see also VLA pipeline pages). The edges can also be flagged with the CASA task flagdata, or by hand.
FIX: Adjust the relevant parameters in hifv_flagdata and re-run the pipeline.
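For flagging edges by hand with flagdata, the channel ranges can be written as a CASA spw selection string. The sketch below builds such a string for a given fraction of channels at each edge of an spw. The function name `edge_channel_selection` is hypothetical, and the rounding used here is an assumption that may differ from what hifv_flagdata does internally.

```python
def edge_channel_selection(spw_id, nchan, fraction):
    """Return a CASA spw selection covering `fraction` of the channels at each edge."""
    if not 0.0 <= fraction <= 0.5:
        raise ValueError("fraction must be between 0 and 0.5")
    n_edge = int(round(nchan * fraction))  # rounding rule is an assumption
    if n_edge == 0:
        return ""                          # nothing to flag
    if 2 * n_edge >= nchan:
        return str(spw_id)                 # both edges overlap: select the whole spw
    low = f"0~{n_edge - 1}"
    high = f"{nchan - n_edge}~{nchan - 1}"
    return f"{spw_id}:{low};{high}"


selection = edge_channel_selection(spw_id=3, nchan=64, fraction=0.05)
print(selection)  # 3:0~2;61~63

# In a CASA session (MS name is a placeholder):
#   from casatasks import flagdata
#   flagdata(vis="my_data.ms", mode="manual", spw=selection)
```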
Strong RFI can bring the digital and analog receiver system into a non-linear regime (also known as compression). This is especially a problem in L and S bands. Simple RFI flagging alone will not be sufficient to remove compression. The affected antennas/spws/polarizations will likely need to be flagged.
There is a known local microwave antenna that emits in C band near 6.2 GHz. The signal is strong enough to cause compression on many antennas, especially in A and B configurations. This can be identified at several places in the weblog. Examples from a data set where ea18 was strongly affected are shown below. The spw containing 6.2 GHz has been flagged by the pipeline, while the rest of the upper baseband (C:A2C2 in this case) is clearly compressed. The lower baseband, C:A1C1, is in good condition.
FIX: For strong compression, flag the affected data. Weak compression may increase the Tsys.
Some calibrator sources are not perfect point sources. For the VLA standard flux calibrator sources, models are provided within CASA, which solves the problem for these. Resolved phase calibrators, however, will produce a more or less incorrect gain table. In some cases, CASA's gaincal can even fail completely and set all fluxes to 1 Jy. To work with resolved gain/phase calibrators, either provide a model or, at least, restrict the uv-range to the unresolved portion during the solve. Plotting the amplitude against uv distance (uvwave) should clearly show the flat part that can be used and the non-flat parts that should be omitted. Try to make sure, though, that at least some baselines are available for every antenna in the solve. Figures 6a and 6b show, respectively, the visibility data for an unresolved and a resolved calibrator.
A solution is to use the flux.csv table. It is usually generated for ALMA in a pipeline run, but can be created before a VLA run. The uv-ranges listed there will be used in the processing. The format is like:
and an example entry would be:
for a uvrange of 21000-110000 lambda.
Above, ms is the MS name, and field and spw are the IDs, not names (the IDs will only be known once the data are in MS format and listobs has been executed). I, Q, U, V are the Stokes flux densities in Jy (note that these entries will be ignored for the VLA, so a nominal I=1.0 Jy will be fine), and uvmin and uvmax give the uv range in units of lambda. Only one spw (the first) is used per field; other entries will be ignored. This means a single line entry such as the one above will apply to all spws for the given field. If uvmax is provided as 0.0 lambda, then this creates an inequality and uvmax is unbounded.
If you have multi-band data, you may have to split the data per band first, then run each band through their own pipeline to make use of flux.csv.
FIX: Restrict the uv-range for the calculations of the calibration tables. The flux.csv table can be used.
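When restricting the uv-range by hand, for example in a manual gaincal call, the uvmin/uvmax convention described above can be turned into a CASA uvrange selection string. The sketch below is only an illustration of that convention; the function names `fmt` and `uvrange_string` are hypothetical, and it is not a description of how the pipeline builds its selection internally.

```python
def fmt(value):
    """Format a number without exponent notation or trailing zeros."""
    return format(value, "f").rstrip("0").rstrip(".")


def uvrange_string(uvmin, uvmax):
    """Turn uvmin/uvmax (in lambda) into a CASA uvrange selection string.

    A uvmax of 0.0 means "no upper limit", following the convention in the text.
    """
    if uvmin < 0 or uvmax < 0:
        raise ValueError("uv limits must be non-negative")
    if uvmax == 0:
        # Unbounded upper end: an inequality, or no restriction at all
        return f">{fmt(uvmin)}lambda" if uvmin > 0 else ""
    if uvmax <= uvmin:
        raise ValueError("uvmax must be larger than uvmin")
    return f"{fmt(uvmin)}~{fmt(uvmax)}lambda"


print(uvrange_string(21000, 110000))  # 21000~110000lambda
print(uvrange_string(21000, 0.0))     # >21000lambda
```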
If the intents of the data are set incorrectly for the observations, the pipeline will use the wrong calibrators for the calibration. Usually this can be fixed by overwriting the intents. The VLA pipeline webpage provides instructions and a script to do this. For more complicated setups, like multiple calibrators or bands with separate calibrator scans, data may be split into smaller MSs that contain only the relevant calibrators for each target, or data reduction by hand may be needed.
Non-ideal reference antenna
Sometimes, if the reference antenna has some issue, like RFI or extreme flagging, it is advisable to switch to a different reference antenna. The example below shows that one spw has extreme phase jumps for all antennas when ea02 was chosen as the reference antenna (Fig. 7a). This indicates that the phase jumps are likely not present on all antennas, but that phase instabilities on ea02 itself are reflected on all other antennas. Indeed, when ea09 was chosen as the reference antenna, as shown in Fig. 7b, the instability appears only in ea02 and all other antennas are well-behaved. Delays are also a quantity that is relative to the chosen reference antenna. If all antennas show similarly high delays, then it is likely that the reference antenna has the high delay and not all other antennas. Choosing a different reference antenna would quickly reveal if this is the case.
Use the 'refantignore' keyword to disallow the use of this antenna as a reference (in the example one should ignore ea02 as a possible reference antenna). The Pipeline Page provides details on the usage of this keyword.
FIX: Use 'refantignore' to remove a problematic antenna from the list of possible reference antennas.
Extreme Solution Intervals
In the hifv_solint stage, the short solution interval is computed as the longest individual single integration (dump) of a visibility. The long solution interval is the longest scan on the gain/phase calibrator. If those values seem unreasonable, then the data should be inspected and flagged. Sometimes, the observations are set up with a long phase calibrator scan at the beginning, to allow for longer slews to the source. This can result in excessively long solution intervals, and some flagging of this scan may be advisable.
After flagging, the pipeline should then be re-run to determine new solution intervals.
FIX: Flagging bad data.
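The two definitions above can be written down in a few lines. The sketch below is not the hifv_solint implementation; the function names, the sample numbers, and the factor-of-three outlier criterion are assumptions chosen only to show how a single long calibrator scan drives the long solution interval.

```python
from statistics import median


def solution_intervals(integration_times_s, gaincal_scan_lengths_s):
    """Short solint: longest integration. Long solint: longest gain calibrator scan."""
    return max(integration_times_s), max(gaincal_scan_lengths_s)


def suspicious_scans(scan_lengths_s, factor=3.0):
    """Indices of scans longer than `factor` times the median (hypothetical criterion)."""
    typical = median(scan_lengths_s)
    return [i for i, length in enumerate(scan_lengths_s) if length > factor * typical]


# Hypothetical run: a long first calibrator scan followed by regular short ones.
integrations = [2.0, 2.0, 3.0, 2.0]
cal_scans = [300.0, 60.0, 62.0, 58.0, 61.0]

short_solint, long_solint = solution_intervals(integrations, cal_scans)
print(short_solint, long_solint)    # 3.0 300.0
print(suspicious_scans(cal_scans))  # [0]
```

Flagging part of the first scan in this example would bring the long solution interval back to about one minute once the pipeline is re-run.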
At the VLA, the weather has to meet certain conditions to run a scheduling block. The conditions vary with frequency and are more stringent for higher frequency observations (settable by the PI). It can happen, however, that the weather deteriorates after a scheduling block has started. High water vapor content and moving atmospheric cells can increase the system temperature and introduce extreme phase jumps. Wind (gusts) will also change the phase stability and cause more frequent pointing errors. Flagging times of bad weather conditions may help. The CASA task statwt will down-weight some noise variations. Also selfcal (Topical Guide: VLA Self-calibration Tutorial) will correct for phase variations. In extreme cases, however, flagging is the only method.
For some SBs the weather data are missing from the header. This is usually not a big problem. The data can be filled, however, on request. Please contact the NRAO helpdesk.
FIX: Statwt, selfcal, or flagging.
Decorrelation is an effect where the individual spatial frequencies of the visibilities are misaligned. If the misalignment is random, the data are decorrelated, i.e., the wave amplitudes are not all aligned, which leads to destructive interference and thus a reduced amplitude. This effect is best seen in the amplitude versus uv-wave plots of the plotsummary stage. The biggest source of decorrelation is the atmosphere, where a screen of atmospheric cells with different refractive indices moves across the array and causes errors in the delay and thus the phase. The effect of decorrelation increases with observing time and is stronger for longer baselines. One correction for decorrelation is to reduce the time between phase calibrator observations. After some time, however, the decorrelation becomes constant (see the Advanced Calibration presentations at the NRAO synthesis school).
The pipeline will correct for some degree of decorrelation for all calibrators. In extreme cases, however, data need to be flagged. If decorrelation is strong, it can be assumed that the target also shows significant decorrelation. Self-calibration is advised if the source flux is sufficient. A CASA guide for self-calibration is provided in the Topical Guide: VLA Self-calibration Tutorial.
FIX: Self-calibration. In extreme cases: flagging.
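To get a feeling for the size of the effect: for random, Gaussian-distributed phase errors with an rms of sigma (in radians), the standard result is that the averaged visibility amplitude is reduced by a factor exp(-sigma^2/2). The sketch below simply evaluates this expression; the function name `coherence` and the sample rms values are chosen for illustration, and real atmospheric phase noise is not necessarily Gaussian.

```python
import math


def coherence(phase_rms_deg):
    """Amplitude reduction factor exp(-sigma^2 / 2) for Gaussian phase noise."""
    sigma = math.radians(phase_rms_deg)
    return math.exp(-0.5 * sigma ** 2)


for rms in (10.0, 30.0, 60.0):
    print(f"phase rms {rms:4.0f} deg -> amplitude factor {coherence(rms):.3f}")
```

A phase rms of 30 degrees already costs about 13% of the amplitude, and the loss grows quickly for larger phase fluctuations.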
The pipeline flags shadowed antennas by default. If not all of the shadowing is captured, or if the shadowing criteria should be loosened (e.g., to allow a small amount of shadowing), then this can be controlled with the CASA task flagdata using mode='shadow'. After manually flagging the data, the 'hifv_flagdata' task call should then be modified ('shadow=False') so that no additional shadow flagging is done.
FIX: in CASA: flagdata mode='shadow'
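A minimal sketch of such a manual shadow flagging call. The helper name `shadow_flag_args`, the MS name and the tolerance value are placeholders; `mode='shadow'` and `tolerance` (the amount of shadowing allowed, in meters) are parameters of CASA's flagdata task. The task call itself is shown as a comment so that the snippet also runs outside of CASA.

```python
def shadow_flag_args(vis, tolerance_m=0.0):
    """Build flagdata keyword arguments for shadow flagging with a given tolerance."""
    if tolerance_m < 0:
        raise ValueError("tolerance must be non-negative")
    return {
        "vis": vis,                # measurement set (placeholder name below)
        "mode": "shadow",          # flag data from shadowed antennas
        "tolerance": tolerance_m,  # allowed shadowing in meters; 0.0 allows none
        "flagbackup": True,        # save the current flags before flagging
    }


# Example: tolerate up to 2 m of shadowing (value chosen only for illustration).
args = shadow_flag_args("my_data.ms", tolerance_m=2.0)
print(args)

# In a CASA session:
#   from casatasks import flagdata
#   flagdata(**args)
# and afterwards run the pipeline with shadow=False in hifv_flagdata.
```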
RFI plots look worse after Flagging
In some instances, the post-RFI flagging plots look aesthetically worse than the pre-RFI flagging plots. This is due to a poorly performing antenna (higher noise than others) that is getting heavily flagged in the RFI flagging. The post-RFI flagging plot then has less data to average together resulting in a worse looking plot. This is not a problem and the outcomes from imaging with and without the flagging of these poorly performing antennas are not scientifically different. In the case of these noisy data not getting flagged (as in the previous pipeline version), they are strongly down-weighted by statwt so they do not contribute much to the final images anyway.
FIX: Nothing to fix, but inspect closely that this is only a plotting effect.
There is a known issue in X-band where the two polarizations show an offset of 16 ns in their delays. This is not a problem as long as the offset is steady in frequency, since it will calibrate out. In general, delay differences larger than +/-10 ns should be inspected, and strong delay jumps due to RFI should be removed.
FIX: Likely flagging for strong delay differences. Sometimes it is advisable to change to a different reference antenna.
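The checks above can be sketched as a simple screening of per-antenna delays. This is an illustration only: the function names and the sample delays are made up, the 10 ns limit is taken from the text, and the "all antennas are high" test is just the rule of thumb that such a pattern points to the reference antenna rather than to every other antenna.

```python
def large_delays(delays_ns, limit_ns=10.0):
    """Return the sorted names of antennas whose |delay| exceeds `limit_ns`."""
    return sorted(ant for ant, d in delays_ns.items() if abs(d) > limit_ns)


def refant_suspected(delays_ns, limit_ns=10.0):
    """True if every listed antenna exceeds the limit (reference antenna excluded)."""
    return len(delays_ns) > 0 and len(large_delays(delays_ns, limit_ns)) == len(delays_ns)


# Hypothetical delays (ns) relative to the reference antenna, which is not listed.
one_bad = {"ea01": 0.4, "ea03": 14.2, "ea05": -0.9}
all_bad = {"ea01": 12.5, "ea03": 13.1, "ea05": 11.8}

print(large_delays(one_bad), refant_suspected(one_bad))  # ['ea03'] False
print(large_delays(all_bad), refant_suspected(all_bad))  # ['ea01', 'ea03', 'ea05'] True
```

In the first case the single antenna should be inspected; in the second case a different reference antenna is the more likely remedy.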
This appears as a sinusoidal pattern of the frequency plots of the calibrators in the 'plotsummary' stage, a residual after applying the bandpass tables. An explanation for this is moisture on the feedhorn window that allows standing waves within the feedhorn and the OMT. It is most common in late summer.
FIX: No specific fix for this effect, but check the fractional change of the sinusoidal pattern. For continuum it should mostly average out, otherwise it adds to the error budget.
Other System Issues
Sometimes, the calibration amplitude vs frequency or vs time plots show features resembling resonances, intermittent peaks or depressions, or swings (particularly in finalcals stage). A comparison with plotsummary will show if they calibrate out, or if there are residuals (non-symmetric noise in an antenna). If they are strong, maybe significantly more than 10%, antenna-specific flagging may be needed for bad basebands, spws or channels, or bad scans (potentially including the adjacent target scans). Then the calibration pipeline should be restarted.
The issue has been identified in ea17 as being a loose cable on the C band receiver frontend. As of May 16, 2022, the issue should not appear anymore for that antenna in C band, but may still occur elsewhere.
FIX: Stronger features should be flagged.
If a source is very strong, systematic errors will be amplified and well visible in images. It is then difficult to deconvolve the sources and systematic errors may dominate well over the thermal rms noise levels.
FIX: More careful calibration, additional calibration techniques such as position-dependent gain solutions, careful deconvolution, the use of widefield or aw-projection gridders, and self-calibration. Generally such sources need to be treated by hand as the pipeline functions are limited.
Reverse spw index
A known issue will sometimes cause the indexing of spws to be in reverse frequency order within one or more basebands. This occurs when there is a bad baseline board resulting in a spw being excluded. This doesn't harm the data, but it can be misleading when reviewing a weblog.
Poor spectral index fitting
A poor spectral index fitting for a phase calibrator in fluxboot stage can sometimes result in the corrected amps being discontinuous between spws, sometimes described as stairs/steps. This most often happens in multiband data sets since the data is harder to fit across all bands.
Flux calibrator models
The flux density scale calibrator 3C138 has been undergoing a flare since at least mid-December 2020. At K and Ka-bands the magnitude of the flare is of order 40-50% compared to the Perley-Butler 2017 flux scale. The effect is smaller at lower frequencies (10-20% at C and X-band), and is larger at higher frequencies (more than a factor of two at Q-band). If you care about the flux density scale of your observations at that level, monitoring datasets are publicly available in the archive under project code TCAL0009. From these observations, you may derive an updated flux density ratio to use for your observations.
When a flux cal is used only for polarization angle calibration (or any intents that do NOT include flux calibration), its CASA model is not used, so the resulting amp vs uvwave plots are inaccurate. In the fluxboot stage, sometimes gain solutions may be missing for some basebands/spws of the pol cal. If the cal is used exclusively for polarization (not flux calibration), then this has no effect on the overall calibration for Stokes I. Before performing any polarization calibration we recommend setting the flux cal model image with setjy, then running gaincal on it with calmode='p'.
A configuration structure
The current calibrator models for resolved flux density calibrators in A-configuration in CASA are created with narrow band data. At the high frequencies of a broad band observation, this can mean that the calibrator model does not represent the structure of the calibrator very well, leading to amplitude errors that reduce the dynamic range of the calibrated data. This will be remedied in a future version of CASA and the pipeline. In the meantime, if high image fidelity is required we have found that recalibration using a model for a higher frequency band scaled to the band of interest can work well.
In 2022 A config, we found the outermost antennas on the west arm were most affected.
Ka-band in A configuration
The Ka-band models used for pipeline processing and packaged with CASA do not represent their sources in A-configuration very well, causing amplitude errors. New, more accurate models will be released with CASA in the future.
In 2022 A config, we found short baselines between antennas on the E and W arms were most affected.