DataWeightsAndCombination
Revision as of 19:33, 16 July 2015
This page is currently under construction.
Principles of Data Weighting
When combining data with disparate properties, it is very important that the relative weights of the visibilities be in the correct proportion to one another according to the radiometer equation. Formally, the visibility weights should be proportional to 1/σ², where σ is the rms noise of a given visibility (so σ² is its variance).
Assuming that the 7m and 12m antennas have similar aperture and quantization efficiencies (a reasonable assumption since they were designed this way), the rms noise in a single channel for a single visibility is:
[math]\displaystyle{ \sigma_{ij}=\frac{2k}{A_{eff}} \sqrt{\frac{T_{sys,i} T_{sys,j}}{\Delta\nu_{ch} t_{ij}}}, }[/math]
where k is Boltzmann's constant, A_eff is the effective antenna area (which depends on the surface errors and scales with antenna size as radius²), T_sys,i is the system temperature for antenna i, Δν_ch is the channel width, and t_ij is the integration time per visibility.
Thus, in order to combine data that have different Tsys, Δν_ch, t_ij, or antenna size, it is essential to use the correct data weights.
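Squaring and inverting the expression above makes the dependence of the weight on each quantity explicit. This is only a restatement of the radiometer equation under the same assumptions, not an additional result:
[math]\displaystyle{ w_{ij} \propto \frac{1}{\sigma_{ij}^{2}} = \frac{A_{eff}^{2}}{4k^{2}} \, \frac{\Delta\nu_{ch} \, t_{ij}}{T_{sys,i} T_{sys,j}} }[/math]
Because A_eff scales as the square of the dish diameter, the weight scales as the fourth power of the diameter, and linearly with the channel width and the integration time.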
Weights in CASA
A memo describing weights in CASA, in particular the significant changes that were made with CASA 4.2.2, can be found at http://casa.nrao.edu/Memos/CASA-data-weights.pdf
To summarize the situation specifically as applied to ALMA data reduction in the ALMA archive and delivered to PIs:
- CASA 4.2.1 and earlier: Weights were only scaled by 1/[Tsys(i) * Tsys(j)], using calwt=True at the applycal stage for the Tsys table. Assuming that (1) no antennas have significant pointing errors (which cause low gain), (2) all the antennas have similar surface errors, and (3) antennas with very low gain have already been flagged (usually good assumptions for ALMA data), data calibrated this way are (nearly) internally consistent and can produce good imaging results. However, they should not be combined with other data that have different Δν_ch, t_ij, or antenna size without further modification. An example of what you can do in this situation is given in XX.
- CASA 4.2.2 and later: Upon import, data weights are scaled by 2 Δν_ch Δt_ij, and they are also scaled by 1/[Tsys(i) * Tsys(j)] using calwt=True at the applycal stage for the Tsys table. Additionally:
- For data calibrated by the 4.2.2 CASA Pipeline, the weights are further modified by [gain(i)² * gain(j)²] when the amplitude gain table is applied using calwt=True. Since the amplitude gains are directly proportional to the individual antenna sensitivities, scaling the weights by the amplitude gains takes into account antenna size differences and also down-weights antennas with comparatively low gain. Thus, these weights are completely correct.
- For data manually calibrated in CASA 4.2.2, unfortunately calwt=False was still used to apply the antenna gain table. The weights of these data are therefore incorrect, in a relative sense, by the factor [gain(i)² * gain(j)²] when compared to other data with a different antenna size.
- CASA 4.3 and later: Data calibrated in either the pipeline or manually will have completely correct weights. An example of this situation is demonstrated in https://casaguides.nrao.edu/index.php/M100_Band3_Combine_4.3
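The three situations above can be sketched as one small function. This is only an illustration of the scalings listed on this page, not CASA code: the function name and arguments are hypothetical, and it assumes that "modified by" the gain factor means multiplication.

```python
def sketch_weight(tsys_i, tsys_j, chan_width_hz=None, t_int_s=None,
                  gain_i=None, gain_j=None):
    """Illustrative visibility weight following the scalings listed above.

    Hypothetical helper, not part of CASA.
    - Only Tsys given: the CASA 4.2.1-and-earlier situation.
    - Channel width and integration time given: adds the import scaling
      described for CASA 4.2.2 and later.
    - Gains given: adds the factor applied when the amplitude gain table
      is applied with calwt=True (assumed here to be a multiplication).
    """
    weight = 1.0  # weights start at 1 in the oldest situation
    if chan_width_hz is not None and t_int_s is not None:
        weight *= 2.0 * chan_width_hz * t_int_s
    weight /= tsys_i * tsys_j
    if gain_i is not None and gain_j is not None:
        weight *= gain_i ** 2 * gain_j ** 2
    return weight


# Same Tsys, different bookkeeping: the three numbers are not comparable,
# which is why datasets from different situations cannot be combined as-is.
old_style = sketch_weight(100.0, 100.0)
manual_422 = sketch_weight(100.0, 100.0, chan_width_hz=1.0e6, t_int_s=6.0)
pipeline_422 = sketch_weight(100.0, 100.0, chan_width_hz=1.0e6, t_int_s=6.0,
                             gain_i=0.5, gain_j=0.5)
```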
How Do I Know the Situation For My Data?
- Data taken in Cycle 0 and Cycle 1 were reduced in 4.2.1 or earlier versions and correspond to Situation 1 above. ACA 7m-array data were first offered in Cycle 1. If you want to combine 12m-array and 7m-array data from Cycle 1, it is very likely that you will need to correct the weights before imaging.
- The situation for Cycle 2 is more confused (this includes actual Cycle 2 projects and carry-over projects or parts of projects from Cycle 1).
- Key dates:
- Start Cycle 2: June 1, 2014
- CASA 4.2.2 release date: Sept. 4, 2014
- Pipeline release date: Oct. 20, 2014
What Are the Options for Adjusting the Weights for Older Reductions?
If the data weights are not correct in the data you want to combine, there are three options to correct the situation. These methods carry different levels of difficulty depending on your situation. For example, for data manually calibrated in 4.2.2, Option 1 is fairly easy, but it is harder for older data and older versions of CASA. The situation can also be extra confusing if your data fall into more than one of the categories above. For example, it is not uncommon in Cycle 2 for the 12m-array data to be pipeline calibrated while the 7m-array data were calibrated manually.
Option 1: Re-calibrate your data in CASA 4.2.2 or later
Importing your data in 4.2.2 (or later) will automatically scale the weights by 2 Δν_ch Δt_ij. The Tsys application should already be correct in your scripts. Be sure, however, to set calwt=True for the amplitude table applycal.
- Caveat 1: You must have the raw ALMA ASDM to correctly start over, unless you are in the CASA 4.2.2 manual calibration case. In that case you can simply set calwt=True for the amplitude table applycal and re-run.
- Caveat 2: Most (all except early Cycle 0) ALMA manual calibration scripts have within them the CASA version used to create the script. For example:
if re.search('^4.3.1', casadef.casa_version) == None: sys.exit('ERROR: PLEASE USE THE SAME VERSION OF CASA THAT YOU USED FOR GENERATING THE SCRIPT: 4.3.1')
- You must change the version number to match the version you want to use or the script will not run.
- Caveat 3: Scripts from versions earlier than 4.2.1 are likely to contain commands that cannot be run directly in later versions of CASA. It may be difficult for a non-expert to update the script to current syntax.
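As an aside to Caveat 2, the dots in the version pattern of the check shown there are unescaped, so they match any character. The sketch below shows a stricter comparison in plain Python. The helper name is hypothetical and the version strings are passed in directly rather than read from CASA, so this is an illustration and not a drop-in replacement for the line in the ALMA scripts.

```python
import re


def version_matches(expected, actual):
    """Return True if `actual` starts with exactly the `expected` version.

    Hypothetical helper: re.escape makes the dots literal, and the trailing
    group stops '4.3.1' from matching '4.3.10'.
    """
    pattern = re.escape(expected) + r'(\D|$)'
    return re.match(pattern, actual) is not None


# The unescaped pattern accepts strings that are not version 4.3.1 at all.
loose_accepts_garbage = re.search('^4.3.1', '4x3y1') is not None
strict_accepts_garbage = version_matches('4.3.1', '4x3y1')
```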
Option 2: Run the task statwt on your calibrated science target data
This task attempts to assess the sensitivity per visibility and adjust the weights accordingly. It is very commonly used for JVLA data (including their pipeline).
- Caveat 1: One must limit the calculation to line-free channels. For complex line projects this can be painful, but the line-free channels are typically already known from the continuum subtraction and can be reused here. Note, however, that it is best to run statwt before continuum subtraction.
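The idea behind estimating a weight from the data, and the reason line-free channels matter, can be sketched in a few lines of plain Python. This is not the algorithm statwt actually implements; the function name and the toy numbers are hypothetical, and it simply applies weight = 1/variance to the channels declared line-free.

```python
from statistics import variance


def weight_from_line_free(channel_values, line_free_channels):
    """Estimate a weight as 1/variance over the line-free channels only.

    Hypothetical helper for illustration. `channel_values` holds one real
    number per channel (for example the real part of one visibility
    spectrum) and `line_free_channels` lists the channel indices to use.
    """
    samples = [channel_values[c] for c in line_free_channels]
    if len(samples) < 2:
        raise ValueError("need at least two line-free channels")
    var = variance(samples)  # sample variance of the noise-only channels
    if var == 0.0:
        raise ValueError("zero scatter; cannot derive a weight")
    return 1.0 / var


# Channel 2 contains a bright line; including it would inflate the scatter
# and wrongly down-weight the data.
spectrum = [1.0, -1.0, 50.0, 1.0, -1.0]
good_weight = weight_from_line_free(spectrum, [0, 1, 3, 4])
bad_weight = weight_from_line_free(spectrum, [0, 1, 2, 3, 4])
```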
Option 3: Make an approximate overall correction:
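Using the task listobs, it is easy to check whether the datasets you wish to combine have different channel widths or visibility integration times (Δν_ch or t_ij). If they do, or if the antenna sizes differ, the weights of one dataset can be scaled by a single overall factor when the data are concatenated, as in the example in the next section.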
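The overall factor follows from the radiometer equation: the weight is proportional to A_eff² Δν_ch t_ij, and A_eff scales as the dish diameter squared. A minimal sketch, assuming equal Tsys and equal efficiencies for the two arrays (the function name and arguments are hypothetical):

```python
def relative_weight_scale(dish_m, ref_dish_m, t_int_s, ref_t_int_s,
                          chan_width=1.0, ref_chan_width=1.0):
    """Approximate factor by which to scale one dataset's weights
    relative to a reference dataset.

    Hypothetical helper. Assumes weight ~ A_eff^2 * channel width *
    integration time, with A_eff ~ diameter^2, and that Tsys and the
    aperture efficiencies are the same for both datasets.
    """
    area_ratio = (dish_m / ref_dish_m) ** 2
    return (area_ratio ** 2
            * (t_int_s / ref_t_int_s)
            * (chan_width / ref_chan_width))


# Numbers used in the M100 example on this page: 7m dishes with 10.1 s
# integrations versus 12m dishes with 6.05 s integrations, and equal
# channel widths.
scale_7m = relative_weight_scale(7.0, 12.0, 10.1, 6.05)
```

Rounded to three decimals, scale_7m is the 0.193 used in the example that follows.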
Example of Correcting Weights from CASA 4.2.1 Data
<figure id="12m_WT.png">
</figure>
<figure id="7m_WT_old.png">
</figure>
Below we show what could have been done to correct the M100 SV data if it was reduced in CASA 4.2.1 (rather than 4.3 as demonstrated in https://casaguides.nrao.edu/index.php?title=M100_Band3_Combine_4.3).
In CASA 4.2.1 and earlier, the data weights are 1 upon import. Later in the standard calibration procedure, applycal scales the weights by 1/[Tsys(i) * Tsys(j)] if calwt=True is set for the Tsys table applycal. As an example, we plot the weights of 7m and 12m data imported in CASA 4.2.1. Averaging must be turned off when plotting the weights.
# In CASA
# Remove any old plots, then plot weight vs. uv-distance for one channel per spw
os.system('rm -rf 7m_WT.png 12m_WT.png')
plotms(vis='m100_12m_CO.ms', yaxis='wt', xaxis='uvdist', spw='0~2:200',
       coloraxis='spw', plotfile='12m_WT.png')
# Same plot for the 7m-array data
plotms(vis='m100_7m_CO.ms', yaxis='wt', xaxis='uvdist', spw='0~2:200',
       coloraxis='spw', plotfile='7m_WT.png')
As you can see from these plots, the weights are quite similar at this stage because the data were taken under similar weather conditions and hence have similar Tsys. Additionally, for these data calwt=False was used to apply the antenna-based amplitude gains (see next section).
Recall that the rms noise in a single channel for a single visibility is:
[math]\displaystyle{ \sigma_{ij}=\frac{2k}{A_{eff}} \sqrt{\frac{T_{sys,i} T_{sys,j}}{\Delta\nu_{ch} t_{ij}}} }[/math]
<figure id="Intcombo_0.193_WT.png">
</figure>
where k is Boltzmann's constant, A_eff is the effective antenna area, T_sys,i is the system temperature for antenna i, Δν_ch is the channel width, and t_ij is the integration time per visibility.
The two key differences between the 7m and 12m-array data are that the effective dish areas differ by a factor of (7/12)² and the integration times per visibility differ by a factor of 10.1/6.05. Since σ is inversely proportional to A_eff and to the square root of t_ij, and assuming the weight is proportional to 1/σ², the weight scales as A_eff² t_ij. The 7m weights should therefore be scaled by (7/12)⁴ x (10.1/6.05) = 0.193 to account for the difference in telescope size and integration time per visibility.
# In CASA
# Concat and scale weights (12m weights unchanged, 7m weights scaled by 0.193)
os.system('rm -rf M100_Intcombo_0.193.ms')
concat(vis=['m100_12m_CO.ms', 'm100_7m_CO.ms'],
       concatvis='M100_Intcombo_0.193.ms',
       visweightscale=[1, 0.193])
Now plot the concatenated weights to verify they are as expected.
# In CASA
os.system('rm -rf Intcombo_0.193_WT.png')
plotms(vis='M100_Intcombo_0.193.ms', yaxis='wt', xaxis='uvdist', spw='0~2:200',
       coloraxis='spw', plotfile='Intcombo_0.193_WT.png')