Pipeline: Frequent VLA problems
General Description
The VLA pipeline delivers calibrated data and some initial images of VLA observation runs. The quality of the calibration and imaging products is usually assessed through the weblog that is created in each pipeline run (see also the VLA Pipeline guide). During the observations, the VLA may have encountered technical problems that are reflected in various ways in the weblog, where graphs show the behavior of the calibration tables as a function of time, frequency, polarization, etc., and analytical numbers describe the amount of flagging, derived fluxes, image statistics, etc.
Here we would like to briefly describe common VLA observing problems, how they are identified in the pipeline calibration weblog, and how they can be addressed.
Radio Frequency Interference
By far the biggest problem is radio frequency interference (RFI). RFI is produced by internal and external sources, and it can be terrestrial or come from satellites that operate at, or spill into, the observed frequency. For the VLA, more information is available on the Radio Frequency Interference webpage. Although weak RFI may only slightly raise the noise of an image with little influence on the calibration tables, stronger RFI will produce artifacts that may render the data (target and calibrators) unusable if not adequately flagged. An example of strong RFI is shown below. Flagging procedures are outlined in the VLA topical CASA guide on flagging.
It is important that all data are free of RFI.
FIX: Weak, intermittent RFI increases the noise; the affected data are down-weighted by the hifv_statwt task before imaging. Strong RFI needs to be flagged, and only clean data should be calibrated and imaged. Flagging can be manual or automatic.
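As a rough illustration of automatic flagging by hand, the sketch below prepares two passes of CASA's flagdata task. The helper names, the MS name, and the choice of a tfcrop pass followed by an rflag pass are assumptions made for this example; they are not the settings the pipeline uses.

```python
def rfi_flag_steps(vis):
    """Return keyword sets for two automatic flagdata passes (illustrative only)."""
    common = {"vis": vis, "action": "apply", "flagbackup": True}
    return [
        # tfcrop searches for outliers in the time-frequency plane (raw data here)
        dict(common, mode="tfcrop", datacolumn="data"),
        # rflag uses sliding-window statistics (calibrated data here)
        dict(common, mode="rflag", datacolumn="corrected"),
    ]


def apply_rfi_flags(vis):
    """Run both passes; requires a CASA 6 environment with casatasks installed."""
    from casatasks import flagdata  # lazy import so the sketch loads without CASA
    for kwargs in rfi_flag_steps(vis):
        flagdata(**kwargs)
```

In a CASA session, apply_rfi_flags('MY.ms') would run both passes; the rflag pass assumes that a corrected data column already exists.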
Pointing
At higher frequencies the VLA requires regular pointing calibrations. Each pointing run will reposition the antennas to be centered on a strong source with known position. If the pointing solution fails, the source drifts away from the center of the primary beam, where the antenna gain is highest, and its amplitude drops. A typical graph looks like the one shown below. The pointing solution for the first half of the run failed, which results in the source drifting away from the center of the primary beam. After a pointing update in the middle of the run, the antenna is positioned properly again (the very last data points are actually a different source, hence the drop at the edge).
FIX: If the pointing is only off by a small amount, the gain calibration will take care of it. If it is off by a large amount, the data for this period and antenna need to be flagged.
Phase Jumps
Various hardware failures can cause the phase for a given antenna to be unstable, often with sudden, large changes over time. Depending on where the problem is, this may affect just a portion of the data or all data on a given antenna. In the example below, only one baseband's data is affected (large changes at each data point) while the other baseband remains near zero and is not affected. This plot, from the pipeline's Final phase gain cal section found in the 'hifv_finalcals' stage, shows the final phase solutions found for each calibrator (using the long solution interval).
Typical phase variations for low frequency data are a few degrees. For high frequencies, variations of tens of degrees can occur; in that case the cycle time between the phase calibrator and the target needs to be reduced to adequately track and interpolate the phase variations as a function of time. If the phases vary by more than 360 degrees between two phase calibrator scans, then the data are completely decorrelated and can no longer be calibrated (even changes larger than 180 degrees leave the interpolation ambiguous, because the direction of the phase change can no longer be determined).
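The 180-degree limit can be illustrated with a short sketch. Two phase measurements determine the phase change only modulo 360 degrees, so interpolating between calibrator scans implicitly assumes the smallest possible change. The function name is hypothetical and the numbers are illustrative.

```python
def wrapped_phase_change(start_deg, end_deg):
    """Smallest phase change (degrees, in [-180, 180)) consistent with two measurements."""
    diff = end_deg - start_deg
    return (diff + 180.0) % 360.0 - 180.0


# A true change of +40 degrees is recovered correctly.
small = wrapped_phase_change(10.0, 50.0)   # 40.0
# A true change of +200 degrees is indistinguishable from -160 degrees,
# so the interpolation would run in the wrong direction.
large = wrapped_phase_change(0.0, 200.0)   # -160.0
```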
FIX: If there are phase jumps, the data for the affected time range usually need to be flagged for the affected antenna(s).
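One possible way to apply such flags by hand is to collect flag commands as strings and pass them to flagdata in list mode. This is a minimal sketch: the helper names, the antenna, the spw range, and the time range are all hypothetical.

```python
def manual_flag_command(antenna, timerange="", spw="", reason="phase_jumps"):
    """Build one flag command string for use with flagdata(mode='list')."""
    parts = ["mode='manual'", f"antenna='{antenna}'"]
    if spw:
        parts.append(f"spw='{spw}'")
    if timerange:
        parts.append(f"timerange='{timerange}'")
    parts.append(f"reason='{reason}'")
    return " ".join(parts)


def apply_flag_commands(vis, commands):
    """Apply a list of command strings; requires casatasks (CASA 6)."""
    from casatasks import flagdata  # lazy import so the sketch loads without CASA
    flagdata(vis=vis, mode="list", inpfile=commands, action="apply", flagbackup=True)


# Hypothetical example: one antenna, one baseband (spws 0-7), half an hour of data
command = manual_flag_command(
    "ea12", timerange="2019/05/01/10:00:00~2019/05/01/10:30:00", spw="0~7"
)
```

The same helper can be reused for pointing or other antenna-based problems by changing the selection and the reason string.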
DTS/Deformatter Problems
The digital transmission system (DTS) of each VLA antenna includes a formatting stage to convert the electronic signal to an optical one before it is injected into the optical fiber link. On the correlator end the signal is deformatted back to an electronic signal. Occasionally the timing on the deformatter can be misaligned, which results in very strong amplitude or phase slopes as a function of frequency. When this occurs, the data are corrupt and the entire affected baseband per polarization of an antenna needs to be flagged. The error frequently resembles an abs(sin) pattern or a 'bouncing' signal across a baseband for one polarization or, in other terms, a varying number of 'V' shapes in the data, usually in the middle of a baseband.
Sometimes, however, the pipeline erroneously detects a DTS issue when the data were in fact only affected by RFI in a few spws. If that happens, it is better to flag the data manually, which preserves the rest of the baseband.
FIX: The data for this baseband, antenna and polarization need to be flagged.
Correlator Zeros
Under some circumstances, the WIDAR correlator writes exact zeros. The pipeline will usually flag them automatically. If not, they can be removed with CASA's flagdata task, using mode='clip' with clipzeros=True, or flagged by hand.
The percentage is calculated based on channels. Spectral line spws are therefore more susceptible to a high reported number of zeros.
FIX: The pipeline will usually catch them. If not, use CASA's flagdata task.
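The flagdata call described above could look like the following sketch. The helper names and the MS name are hypothetical; mode='clip' and clipzeros=True are the options named in the text.

```python
def clip_zeros_kwargs(vis):
    """Keyword arguments for a flagdata pass that flags exact zeros."""
    return {
        "vis": vis,
        "mode": "clip",
        "clipzeros": True,   # flag visibilities that are exactly zero
        "action": "apply",
        "flagbackup": True,  # keep a backup of the flags before changing them
    }


def flag_zeros(vis):
    """Run the clip pass; requires casatasks (CASA 6)."""
    from casatasks import flagdata  # lazy import so the sketch loads without CASA
    flagdata(**clip_zeros_kwargs(vis))
```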
Baseband and Subband Edges
If spw roll-off frequency edges are very steep, they can degrade gain and phase solutions. Frequently this is not a big problem, but if the gain for the edge channels is close to zero, a division by the bandpass for these channels can get extremely noisy. This is particularly true for baseband edges. The edgespw, fracspw, and baseband parameters in hifv_flagdata can be adjusted to flag different percentages of the edges (see also VLA pipeline pages). The edges can also be flagged with the CASA task flagdata, or by hand.
FIX: Adjust the relevant parameters in hifv_flagdata and re-run the pipeline.
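When flagging edges by hand with flagdata instead, the channel selection can be computed from the number of channels and the fraction to remove at each edge. The sketch below is illustrative: the helper name is hypothetical, and the rounding rule (truncate, but flag at least one channel per edge) is an assumption that need not match what hifv_flagdata does internally.

```python
def edge_channel_selection(spw_id, nchan, fraction):
    """Return a CASA spw selection string covering both edges of one spw."""
    n_edge = max(1, int(nchan * fraction))  # assumed rounding rule
    if 2 * n_edge >= nchan:
        raise ValueError("edge fraction would flag the entire spw")
    lower = f"0~{n_edge - 1}"
    upper = f"{nchan - n_edge}~{nchan - 1}"
    return f"{spw_id}:{lower};{upper}"


# 5% of a 64-channel spw: three channels at each edge
selection = edge_channel_selection(0, 64, 0.05)   # '0:0~2;61~63'
```

The resulting string can be passed as the spw parameter of flagdata with mode='manual'.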
Compression
Strong RFI can drive the digital and analog receiver system into a non-linear regime (also known as compression). This is especially a problem in L and S bands. Simple RFI flagging alone will not be sufficient to remove compression. The affected antennas/spw/pols will likely need to be flagged.
FIX: For strong compression, flag the affected data. Weak compression may increase the Tsys.
Resolved Calibrators
Some calibrator sources are not perfect point sources. For the VLA standard flux calibrator sources, models are provided within CASA, which solves the problem for these. Resolved phase calibrators, however, will produce a gain table that is incorrect to some degree. In some cases, CASA's gaincal can even fail completely and set all fluxes to 1 Jy. To work with resolved gain/phase calibrators, either provide a model or, at least, restrict the uv-range to the unresolved portion during the solve. Plotting the amplitude against uv distance (uvwave) should clearly show the flat part that can be used and the non-flat parts that should be omitted. Try to make sure, though, that at least some baselines remain available for every antenna in the solve. Figures 6a and 6b show, respectively, the visibility data for an unresolved and a resolved calibrator.
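For a by-hand reduction, the uv-range restriction can be passed to gaincal through its uvrange parameter. The sketch below is only an illustration: the helper names are hypothetical, and the solve settings (phase-only, per-integration) are assumptions that should be adapted to the data.

```python
def uvrange_klambda(uvmin_lambda, uvmax_lambda):
    """Build a CASA uvrange selection string in kilo-lambda."""
    if uvmax_lambda <= uvmin_lambda:
        raise ValueError("uvmax must be larger than uvmin")
    return f"{uvmin_lambda / 1000.0:g}~{uvmax_lambda / 1000.0:g}klambda"


def solve_gains_on_flat_part(vis, caltable, field, refant, uvrange):
    """Phase-only gaincal restricted in uv-range; requires casatasks (CASA 6)."""
    from casatasks import gaincal  # lazy import so the sketch loads without CASA
    gaincal(vis=vis, caltable=caltable, field=field, refant=refant,
            uvrange=uvrange, solint="int", calmode="p", gaintype="G")


# Hypothetical case: the amplitude is flat out to 50 kilo-lambda
flat_part = uvrange_klambda(0, 50000)   # '0~50klambda'
```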
A solution is to use the flux.csv table. It is usually generated for ALMA in a pipeline run, but can be created before a VLA run. The uv-ranges listed there will be used in the processing. The format is like:
ms,field,spw,I,Q,U,V,spix,uvmin,uvmax,comment
and an example entry would be:
MY.ms,0,2,1,0.0,0.0,0.0,0.0,21000.0,110000.0,"# 3C48"
for a uvrange of 21000-110000 lambda.
Above, ms is the MS name, field and spw are the IDs (not names, the ID will only be known once the data is in MS format and after executing listobs), I, Q, U, V are the Stokes flux densities in Jy (note that entries for the VLA will be ignored here, so a nominal I=1Jy will be ok), uvmin and uvmax are the uv ranges in units of lambda. Only one spw (the first) is used per field, other entries will be ignored. If uvmax is provided as 0 lambda, then this creates an inequality and uvmax is unbounded.
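A flux.csv file in this format could be written with a few lines of Python. This is a minimal sketch under the column order given above; the helper names are hypothetical, and the check on uvmin and uvmax simply encodes the rule that uvmax is either 0 (unbounded) or larger than uvmin.

```python
FLUX_CSV_HEADER = "ms,field,spw,I,Q,U,V,spix,uvmin,uvmax,comment"


def flux_csv_row(ms, field_id, spw_id, uvmin, uvmax, comment, stokes_i=1):
    """Format one flux.csv line; Q, U, V and spix are set to 0.0."""
    if uvmax != 0 and uvmax <= uvmin:
        raise ValueError("uvmax must exceed uvmin, or be 0 for no upper limit")
    values = [ms, field_id, spw_id, stokes_i, 0.0, 0.0, 0.0, 0.0,
              float(uvmin), float(uvmax)]
    return ",".join(str(v) for v in values) + f',"# {comment}"'


def write_flux_csv(path, rows):
    """Write the header line followed by the given rows."""
    with open(path, "w") as handle:
        handle.write("\n".join([FLUX_CSV_HEADER, *rows]) + "\n")


# Entry for the 21000-110000 lambda range mentioned above
row = flux_csv_row("MY.ms", 0, 2, 21000, 110000, "3C48")
```

Calling write_flux_csv('flux.csv', [row]) in the pipeline working directory would then create the file before the pipeline is started.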
If you have multi-band data, you may have to split the data per band first, then run each band through their own pipeline to make use of flux.csv.
FIX: Restrict the uv-range for the calculations of the calibration tables. The flux.csv table can be used.
Wrong Intents
If the intents of the data are set incorrectly for the observations, the pipeline will use the wrong calibrators for the calibration. Usually this can be fixed by overwriting the intents. The VLA pipeline webpage provides instructions and a script to do this. For more complicated setups, like multiple calibrators or bands with separate calibrator scans, data may be split into smaller MSs that contain only the relevant calibrators for each target, or data reduction by hand may be needed.
Non-ideal reference antenna
If the reference antenna has an issue, such as RFI or extreme flagging, it is advisable to switch to a different reference antenna. The example below shows that one spw has extreme phase jumps for all antennas when ea02 was chosen as the reference antenna (Fig. 7a). This indicates that the phase jumps are likely not present on all antennas, but that phase instabilities on ea02 itself are reflected on all other antennas. Indeed, when ea09 was chosen as the reference antenna, as shown in Fig. 7b, the instability appears only in ea02 and all other antennas are well-behaved. Delays are also a quantity that is relative to a chosen reference antenna. If all antennas show similarly high delays, then it is likely that the reference antenna has the high delay and not all other antennas. Choosing a different reference antenna would quickly reveal if this is the case.
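The reasoning can be made concrete with a small numerical sketch. Gain phases are only defined relative to the reference antenna, so re-referencing amounts to subtracting the phase of the new reference antenna. The function name and the phase values are hypothetical; only the antenna names follow the example in the text.

```python
def rereference(phases_deg, new_refant):
    """Express antenna phases relative to new_refant, wrapped to [-180, 180)."""
    ref = phases_deg[new_refant]
    return {ant: (phase - ref + 180.0) % 360.0 - 180.0
            for ant, phase in phases_deg.items()}


# Hypothetical instantaneous phase offsets: only ea02 has a problem.
true_offsets = {"ea02": 90.0, "ea05": 0.0, "ea09": 0.0}

seen_with_ea02 = rereference(true_offsets, "ea02")
# {'ea02': 0.0, 'ea05': -90.0, 'ea09': -90.0}: every other antenna looks bad
seen_with_ea09 = rereference(true_offsets, "ea09")
# {'ea02': 90.0, 'ea05': 0.0, 'ea09': 0.0}: the problem is isolated to ea02
```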
Use the 'refantignore' keyword to disallow the use of this antenna as a reference (in the example one should ignore ea02 as a possible reference antenna). The Pipeline Page provides details on the usage of this keyword.
FIX: Use 'refantignore' to remove a problematic antenna from the list of possible reference antennas.
Extreme Solution Intervals
In the hifv_solint stage, the short solution interval is set to the longest single integration (dump) of a visibility. The long solution interval is the longest scan on the gain/phase calibrator. If those values seem unreasonable, then the data should be inspected and flagged. Sometimes, the observations are set up with a long phase calibrator scan at the beginning, to allow for longer slews to the source. This can result in excessively long solints, and some flagging may be advised on this scan.
After flagging, the pipeline should then be re-run to determine new solution intervals.
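The two rules above can be written down in a few lines, which also helps to spot a single long calibrator scan that inflates the long solution interval. This is an illustrative sketch, not the pipeline's code: the function names, the scan numbers and durations, and the factor-of-three criterion are all assumptions.

```python
from statistics import median


def solution_intervals(integration_times_s, gain_scan_durations_s):
    """Return (short, long) solution intervals in seconds, following the rules above."""
    short_solint = max(integration_times_s)
    long_solint = max(gain_scan_durations_s.values())
    return short_solint, long_solint


def unusually_long_scans(gain_scan_durations_s, factor=3.0):
    """List scans much longer than the median gain calibrator scan (assumed criterion)."""
    typical = median(gain_scan_durations_s.values())
    return sorted(scan for scan, duration in gain_scan_durations_s.items()
                  if duration > factor * typical)


# Hypothetical run: scan 3 is a long initial calibrator scan that covers a slew
scans = {3: 300.0, 8: 60.0, 13: 62.0, 18: 58.0}
short_solint, long_solint = solution_intervals([2.0, 2.0, 3.0], scans)
suspects = unusually_long_scans(scans)
```

Here the long solution interval is set by scan 3 alone; flagging part of that scan would bring it back in line with the other calibrator scans.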
FIX: Flagging bad data.
Weather
At the VLA, the weather has to meet certain conditions to run a scheduling block. The conditions vary with frequency and are more stringent for higher frequency observations (settable by the PI). It can happen, however, that the weather deteriorates after a scheduling block has started. High water vapor content and moving atmospheric cells can increase the system temperature and introduce extreme phase jumps. Wind (gusts) will also change the phase stability and cause more frequent pointing errors. Flagging times of bad weather conditions may help. The CASA task statwt will down-weight some noise variations. Also selfcal (Topical Guide: VLA Self-calibration Tutorial) will correct for phase variations. In extreme cases, however, flagging is the only method.
For some SBs the weather data are missing from the header. This is usually not a big problem. The data can be filled, however, on request. Please contact the NRAO helpdesk.
FIX: Statwt, selfcal, or flagging.
Decorrelation
Decorrelation is an effect where the individual spatial frequencies of the visibilities are misaligned. If the misalignment is random, the data are decorrelated, i.e., not all wave amplitudes are aligned, leading to destructive interference and thus a reduced amplitude. This effect is best seen in the plotsummary amplitude versus uv-wave plots. The biggest source of decorrelation is the atmosphere, where a screen of atmospheric cells with different refractive indices moves across the array, which causes errors in the delay and thus the phase. The effect of decorrelation increases with observing time and is stronger for longer baselines. One way to reduce decorrelation is to shorten the time between phase calibrator observations. Beyond a certain timescale, however, the decorrelation remains constant (see the Advanced Calibration presentations at the NRAO synthesis school).
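The size of the amplitude loss can be estimated with a standard relation that is not part of the text above: for random, Gaussian-distributed phase errors with an rms of sigma radians, the expected visibility amplitude is reduced by a factor exp(-sigma^2/2). The sketch below evaluates this factor; the function name is hypothetical.

```python
import math


def coherence_factor(phase_rms_deg):
    """Expected amplitude reduction for Gaussian phase errors with the given rms."""
    sigma = math.radians(phase_rms_deg)
    return math.exp(-sigma ** 2 / 2.0)


loss_30 = coherence_factor(30.0)   # roughly 0.87, i.e. about 13% amplitude loss
loss_60 = coherence_factor(60.0)   # roughly 0.58
```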
The pipeline will correct for some degree of decorrelation for all calibrators. In extreme cases, however, data need to be flagged. If decorrelation is strong, it can be assumed that the target also shows significant decorrelation. Self-calibration is advised if the source flux is sufficient. A CASA guide for self-calibration is provided in the Topical Guide: VLA Self-calibration Tutorial.
FIX: Self-calibration. In extreme cases: flagging.
Shadowing
The pipeline flags shadowed antennas by default. If not all shadowing is captured, or if the shadowing criteria should be loosened (e.g., to allow a small amount of shadowing), this can be controlled with the CASA task flagdata using mode='shadow'. After manually flagging the data, the 'hifv_flagdata' task call should be modified ('shadow=False') so that no additional shadow flagging is done.
FIX: in CASA: flagdata mode='shadow'
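A possible form of the manual shadow flagging is sketched below. In flagdata's shadow mode, the tolerance parameter gives the amount of shadowing that is allowed, in meters. The helper names and the example value are hypothetical.

```python
def shadow_flag_kwargs(vis, tolerance_m=0.0):
    """Keyword arguments for flagdata in shadow mode with a given tolerance."""
    if tolerance_m < 0:
        raise ValueError("tolerance must not be negative")
    return {
        "vis": vis,
        "mode": "shadow",
        "tolerance": float(tolerance_m),  # allowed shadowing in meters
        "action": "apply",
        "flagbackup": True,
    }


def flag_shadowing(vis, tolerance_m=0.0):
    """Run the shadow flagging; requires casatasks (CASA 6)."""
    from casatasks import flagdata  # lazy import so the sketch loads without CASA
    flagdata(**shadow_flag_kwargs(vis, tolerance_m))


# Hypothetical example: allow up to 5 m of shadowing
kwargs = shadow_flag_kwargs("MY.ms", tolerance_m=5)
```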
RFI plots look worse after Flagging
In some instances, the post-RFI flagging plots look aesthetically worse than the pre-RFI flagging plots. This is due to a poorly performing antenna (higher noise than others) that is getting heavily flagged in the RFI flagging. The post-RFI flagging plot then has less data to average together resulting in a worse looking plot. This is not a problem and the outcomes from imaging with and without the flagging of these poorly performing antennas are not scientifically different. In the case of these noisy data not getting flagged (as in the previous pipeline version), they are strongly down-weighted by statwt so they do not contribute much to the final images anyway.
FIX: Nothing to fix, but inspect closely that this is only a plotting effect.
Delay differences
There's a known issue where in X-band the two polarizations show an offset of 16ns in their delays. This is not a problem as long as they are steady in frequency and will calibrate out. In general, delay differences larger than +/-10ns should be inspected and strong delay jumps due to RFI should be removed.
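A quick way to apply the 10 ns guideline is to list the antennas whose delay exceeds it. The sketch assumes that the delays have already been read from the calibration table into a dictionary; the function name, the antenna names, and the values are hypothetical.

```python
def antennas_with_large_delays(delays_ns, limit_ns=10.0):
    """Return the antennas whose absolute delay exceeds limit_ns."""
    return sorted(ant for ant, delay in delays_ns.items() if abs(delay) > limit_ns)


delays = {"ea01": 1.2, "ea07": -14.5, "ea12": 3.0, "ea20": 22.0}
to_inspect = antennas_with_large_delays(delays)   # ['ea07', 'ea20']
```

If nearly every antenna ends up in the list, the reference antenna itself is the more likely culprit.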
FIX: Likely flagging for strong delay differences. Sometimes it is advisable to change to a different reference antenna.
OMT Reflection
This appears as a sinusoidal pattern of the frequency plots of the calibrators in the 'plotsummary' stage, a residual after applying the bandpass tables. An explanation for this is moisture on the feedhorn window that allows standing waves within the feedhorn and the OMT. It is most common in late summer.
FIX: No specific fix for this effect, but check the fractional change of the sinusoidal pattern. For continuum it should mostly average out; otherwise it adds to the error budget.
Other System Issues
Sometimes, the calibration amplitude vs. frequency plots show features resembling resonances, intermittent peaks or depressions, or swings on the finalcals page (amplitude vs. frequency or time). A comparison with plotsummary will show if they calibrate out, or if there are residuals (non-symmetric noise in an antenna). If they are strong, maybe significantly more than 10%, the data may need to be flagged (potentially including the adjacent target scans), and the calibration pipeline restarted.
FIX: Stronger features should be flagged.
Dynamic Range
If a source is very strong, systematic errors will be amplified and clearly visible in images. It is then difficult to deconvolve the sources, and systematic errors may dominate well over the thermal rms noise level.
FIX: More careful calibration, additional calibration techniques such as position-dependent gain solutions, careful deconvolution, the use of widefield or aw-projection gridders, and self-calibration. Generally such sources need to be treated by hand, as the pipeline functions are limited.