Difference between revisions of "User talk:Preshanthj"

From CASA Guides
Jump to navigationJump to search
Line 42: Line 42:
importevla(asdm= 'TSUB0001.sb28588128.eb28590840.56647.13284232639',  vis = 'TSUB0001.ms',savecmds=T, outfile='importflags.txt')
importevla(asdm= 'TSUB0001.sb28588128.eb28590840.56647.13284232639',  vis = 'TSUB0001.ms',savecmds=T, outfile='importflags.txt')
If some of these already exist, you can use the following commands to remove them:
<source lang='python'>
if os.path.isdir( ms_path ):
  rmtables( ms_path )
if os.path.isfile( online_flags_file ):
  os.remove( online_flags_file )
if os.path.isdir( ms_path + '.flagversions' ):
  shutil.rmtree( ms_path + '.flagversions', ignore_errors = True )
Now we use the <i>importevla()</i> task to convert the SDM-BDF data to a CASA measurement set (ms). This may take a few minutes:
<source lang='python'>
importevla( asdm = sdm_path, vis = ms_path, savecmds = True, outfile = online_flags_file )

Revision as of 20:03, 16 June 2015

Back to CASA guides

By Huib Intema & Preshanth Jagannathan

Last checked on CASA Version 4.3.1.


This webpage provides a basic description on how to use CASA to reduce data from the upper part of the new VLA low-band system, also known as P-band, covering roughly 220-480 MHz. The goal is to make a wide-field continuum stokes I image of a typical blank field using the full effective bandwidth.

Obtaining the raw data

For this guide we'll use some test data that was taken in B-configuration. To get a copy of the raw data, go to the NRAO archive query page, and enter the following search parameters (leave the rest on default):

Telescopes: Jansky VLA (tickbox)
Project Code: TSUB0001
Telescope Config: B (tickbox)
Observing Bands: P (tickbox)

The search should return (at least) 19 rows of results. Enter your valid e-mail address, set the download format to SDM-BDF, check the box on the 17th row (data from Dec. 21, 2013, or "13-Dec-21 03:11:18") at the bottom of the page, and click Get My Data. On the next page, click Retrieve over internet. The next page should report that the data staging is in progress. Wait until you receive an e-mail reporting that your archive data is copied, which should take a few minutes. Download the data onto your computer running CASA.

Starting CASA

Start CASA by typing


on the (linux) command line. This should start a CASA interactive python (iPython) session, and open a separate log window. The CASA version is reported at startup, both in the python session and the log window:

CASA Version 4.3.1-REL (r32491)

Note that this guide has been written for CASA release 4.3.1, which is the current stable release at NRAO. Due to ongoing development of CASA, this tutorial may or may not work with other versions. You may want to confirm your version before proceeding.

Importing the raw data into CASA

We will begin by importing our data into the measurement sent set format (CASA standard) from the binary format (SDM-BDF) as downloaded from the archive. We do this by means of the importevla task.

importevla(asdm= 'TSUB0001.sb28588128.eb28590840.56647.13284232639',  vis = 'TSUB0001.ms',savecmds=T, outfile='importflags.txt')

This task calls the external tool asdm2MS to perform the conversion. Together with the visibility data, also the flag commands from the VLA online system are imported. These flags are not directly applied, but stored in the TSUB0001_20131221.flags file and applied later.

Preliminary data inspection

listobs_file = project_name + '.listobs'
if os.path.isfile( listobs_file ):
  os.remove( listobs_file )

listobs( vis = ms_path, verbose = True, listfile = listobs_file )

You can inspect the output text file TSUB0001_20131221.listobs in your favorite text editor (e.g., gedit). When taking some time to familiarize yourself with the output format, you will see that:

  • The observation consists of 4 scans, namely [1] 3C48D (hardware setup), [2] 3C48 (primary calibrator), [3] 0313-192 (target), and [4] again 3C48 (primary calibrator).
  • The frequency coverage is 224 - 480 MHz, divided into 16 x 16 MHz spectral windows, each having 128 x 0.125 MHz channels
  • Visibilities are recorded every 2 seconds in full polarization (4 polarization products, (wrongly!) labelled RR,RL,LR,LL).
  • Of the 28 VLA antennas labelled ea01 - ea028, antennas ea06 and ea17 are not participating in this observation.
  • Source, antenna and spectral window IDs start at zero, but scan IDs start at one.

Alternatively you can also output to the logger by running the command without the listfile argument

listobs( vis =  )
           MeasurementSet Name:  /lustre/pjaganna/evla/P_Band/casa_guide/pband.ms      MS Version 2
   Observer: Frazer Owen     Project: uid://evla/pdb/1695465  
Observation: EVLA(26 antennas)
Data records: 5382000       Total elapsed time = 2076 seconds
   Observed from   21-Dec-2013/03:11:20.0   to   21-Dec-2013/03:45:56.0 (UTC)

Fields: 3
  ID   Code Name                RA               Decl           Epoch   SrcId      nRows
  0    NONE 3C48D               01:37:41.299431 + J2000   0         884000
  1    NONE 3C48                01:37:41.299431 + J2000   1        2168400
  2    NONE 0313-192            03:15:52.039999 - J2000   2        2329600
Spectral Windows:  (16 unique spectral windows and 1 unique polarization setups)
  SpwID  Name           #Chans   Frame   Ch0(MHz)  ChanWid(kHz)  TotBW(kHz) CtrFreq(MHz) BBC Num  Corrs          
  0      EVLA_P#A0C0#0     128   TOPO     222.000       125.000     16000.0    229.9375       12  RR  RL  LR  LL
  1      EVLA_P#A0C0#1     128   TOPO     238.000       125.000     16000.0    245.9375       12  RR  RL  LR  LL
  2      EVLA_P#A0C0#2     128   TOPO     254.000       125.000     16000.0    261.9375       12  RR  RL  LR  LL
  3      EVLA_P#A0C0#3     128   TOPO     270.000       125.000     16000.0    277.9375       12  RR  RL  LR  LL
  4      EVLA_P#A0C0#4     128   TOPO     286.000       125.000     16000.0    293.9375       12  RR  RL  LR  LL
  5      EVLA_P#A0C0#5     128   TOPO     302.000       125.000     16000.0    309.9375       12  RR  RL  LR  LL
  6      EVLA_P#A0C0#6     128   TOPO     318.000       125.000     16000.0    325.9375       12  RR  RL  LR  LL
  7      EVLA_P#A0C0#7     128   TOPO     334.000       125.000     16000.0    341.9375       12  RR  RL  LR  LL
  8      EVLA_P#A0C0#8     128   TOPO     350.000       125.000     16000.0    357.9375       12  RR  RL  LR  LL
  9      EVLA_P#A0C0#9     128   TOPO     366.000       125.000     16000.0    373.9375       12  RR  RL  LR  LL
  10     EVLA_P#A0C0#10    128   TOPO     382.000       125.000     16000.0    389.9375       12  RR  RL  LR  LL
  11     EVLA_P#A0C0#11    128   TOPO     398.000       125.000     16000.0    405.9375       12  RR  RL  LR  LL
  12     EVLA_P#A0C0#12    128   TOPO     414.000       125.000     16000.0    421.9375       12  RR  RL  LR  LL
  13     EVLA_P#A0C0#13    128   TOPO     430.000       125.000     16000.0    437.9375       12  RR  RL  LR  LL
  14     EVLA_P#A0C0#14    128   TOPO     446.000       125.000     16000.0    453.9375       12  RR  RL  LR  LL
  15     EVLA_P#A0C0#15    128   TOPO     462.000       125.000     16000.0    469.9375       12  RR  RL  LR  LL
Antennas: 26 'name'='station' 
   ID=   0-3: 'ea01'='N32', 'ea02'='N28', 'ea03'='E28', 'ea04'='E24', 
   ID=   4-7: 'ea05'='W08', 'ea07'='N12', 'ea08'='N16', 'ea09'='W28', 
   ID=  8-11: 'ea10'='E04', 'ea11'='W24', 'ea12'='E08', 'ea13'='W20', 
   ID= 12-15: 'ea14'='N24', 'ea15'='E20', 'ea16'='N08', 'ea18'='E36', 
   ID= 16-19: 'ea19'='W16', 'ea20'='N04', 'ea21'='E12', 'ea22'='N20', 
   ID= 20-23: 'ea23'='W36', 'ea24'='W04', 'ea25'='W32', 'ea26'='W12', 
   ID= 24-25: 'ea27'='N36', 'ea28'='E16'

The array configuration can be inspected using:

plotants_file = project_name + '.plotants.pdf'
if os.path.isfile( plotants_file ):
  os.remove( plotants_file )

plotants( vis = ms_path, figfile = plotants_file )

The figure in the plot window is also available as a PDF file named TSUB0001_20131221.plotants.pdf

As noted above, the polarizations of P-band in this data set are wrongly labelled as circular rather than linear. This may seem mostly harmless, but does make a difference for polarization calibration (which we will do later). The following (non-standard) task will check and fix this (and another related problem; see appendix on P-band data issues):

execfile( '/lustre/hintema/cspam/mytasks.py' )

fixlowband( vis = ms_path )

If desired, you can re-run the listobs() task to check if the fix was correctly applied.

Addendum: If you're not working on the NRAO DSOC network, you won't have access to the lustre file system. In that case, download this tarball in your CASA working directory, open a shell there, and type the following commands:

tar xzvf casa_vla_lowband.tar.gz


Now go back to your CASA session and type:

execfile( 'mytasks.py' )

fixlowband( vis = ms_path )

Initial processing steps

When importing the data, we saved the flags from the online system to an ASCII text file. This gives us the opportunity to review the flag commands before applying them. For our data set, the flag file is called TSUB0001_20131221.flags. Please load this file into your favorite text editor. The bulk of the flag commands refer to times when the VLA is slewing (ANTENNA_NOT_ON_SOURCE) or when the movable secondary reflector of the VLA's Cassegrain system is not in place (SUBREFLECTOR_ERROR). The latter can cause antenna gain variations (amplitude and phase), so it is safest to apply all the flags. The final two rows in the flag file will remove visibilities that are pure zero, and flag antennas that are partly blocked by other antennas (shadowing; this occurs mostly in compact configurations when observing along a VLA arm).

flagdata( vis = ms_path, mode = 'list', inpfile = online_flags_file, action = 'apply', reason = 'any', flagbackup = False )

The next step will correct the visibility amplitudes for the signal leveling that occurs at the inputs of the WIDAR correlator, the so-called requantizer gains. These levels (per antenna, per polarization, per spectral window) are stored with the measurement set (in the SYSPOWER sub-table). Currently, this step is not essential, since the levels get set only once at the start of an observation, and bandpass calibration will correct for this. But it will make your bandpass plots look better if you have multiple spectral windows. And, more importantly, in the future it may be that the levels will become time-variable during an observation, so we'd better be prepared.

The correction is done in two steps. First, the information in the SYSPOWER sub-table gets translated into a gain table. Second, the visibility data in the DATA column of the measurement set gets copied into a new CORRECTED_DATA column, and is corrected with the gain table. Note that this operation roughly doubles the size of the measurement set with respect to the original. This may take a while.

rq_gain_table = ms_path + '.Grq'

if os.path.isdir( rq_gain_table ):
  rmtables( rq_gain_table )

gencal( vis = ms_path, caltable = rq_gain_table, caltype = 'rq' )

applycal( vis = ms_path, field = '*', gaintable = [ rq_gain_table ], interp = [ 'nearest,nearestflag' ], applymode = 'calflagstrict', flagbackup = False )

If your observation is part of a multi-band observation, this is the time to isolate the P-band part. Although these data contain only P-band, this step is still useful because it will make the requantizer gain corrections permanent, and drop time ranges that are completely flagged (like the slew periods). The following call to split() will create a new measurement set with just P-band data. P-band data is selected by its spectral window range, which can be retrieved from the listobs() output:

pband_spws = '0~15'
pband_ms_path = project_name + '_PBAND.ms'

if os.path.isdir( pband_ms_path ):
  rmtables( pband_ms_path )

if os.path.isdir( pband_ms_path + '.flagversions' ):
  shutil.rmtree( pband_ms_path + '.flagversions', ignore_errors = True )

split( vis = ms_path, outputvis = pband_ms_path, field = '*', spw = pband_spws, datacolumn = 'corrected', keepflags = False )

Next, we will apply Hanning smoothing to suppress Gibbs ringing around bright RFI lines. Visibilities are overwritten; no new measurement set is created.

hanningsmooth( vis = pband_ms_path, datacolumn = 'data' )

NOTE: The experimental task mstransform() combines split() and hanningsmooth() and several other data manipulation operations, but its output is not yet to be trusted for all operations.

Dead and swapped antennas

Now it is time to have a first visual look at the uncalibrated visibility data. It is important to identify dead antennas / polarizations early on, so we can exclude them from further data processing (which is the most efficient data 'reduction' :-) ). A convenient way of doing this is through the flagdata() task on bright calibrator(s) (again, see the listobs() output), in our case 3C48.

bcal_field = '3C48'
flagdata( vis = pband_ms_path, field = bcal_field, mode = 'tfcrop', datacolumn = 'data', timecutoff = 4., freqcutoff = 4., maxnpieces = 5,
  action = 'calculate', display = 'both', flagbackup = False )

This will bring forward a data inspection window, with (per scan, per baseline, per spectral window) the top row a set of intensity plots of the visibility amplitudes in all polarization products (XX, YY, XY, and YX) before flagging, and the bottom row the same data after applying some outlier flagging (marked in blue). The intensity plots do scale automatically to min-max, so any bright RFI pixel will skew the intensity range (hence the difference between the top and bottom row).

In the case of 16 x 16 MHz spectral windows, we want to initially inspect a more central spectral window (spw 5 through 7) than the default one at the low edge of the band (spw = 0). Notice when clicking Next SPW a few times that several spectral windows are heavily affected by RFI, for instance spw 2 and 3, and 8 and 9. Also notice that not all RFI is captured; this is mainly because within a spectral window there is too little contrast between healthy and affected data. We will resolve that later.

A healthy spectral window will have little flagged data, and show a clear difference between the parallel-hand polarizations XX,YY (bottom row left two panels) and the cross-hand polarizations XY,YX (bottom row right two panels), where the former shows a time-constant, smooth spectral behaviour, while the latter shows a noise-like behaviour. Cycling through the baselines (Previous Baseline / Next Baseline buttons) please note the following:

  • Baselines containing antenna ea07 have a ripple in XX.
  • Baselines containing antennas ea10 or ea13 contain only noise in all polarization products.
  • Baselines containing antennas ea25 contain noise in XX and YY, signal in XY, and noise in YX
  • Baselines containing antenna ea28 have a ripple in YY

What this means is that antennas ea10 and ea13 are dead and need to be flagged. We will also take this opportunity to flag the dummy scan 1 on '3C48D'. Antenna ea25 has swapped polarizations and a dead Y-polarization; for demonstration purposes we will unswap this antenna, but it becomes useless later on since CASA cannot form proper stokes I visibilities (which is different behaviour than in some other reduction software, e.g,. AIPS). The ripples in antennas ea07 (X-polarization) and ea28 (Y-polarization) will show up, and calibrate out, in the bandpass calibration performed below.

If we're done inspecting the data in flagdata(), press the Quit button (and not Stop Display). The next commands will perform the necessary flag operations, and unswap antenna ea25.

dead_antennas = 'ea10,ea13'
flagdata( vis = pband_ms_path, mode = 'manual', antenna = dead_antennas, flagbackup = False )

dummy_scan = '1'
flagdata( vis = pband_ms_path, mode = 'manual', scan = dummy_scan, flagbackup = False )

swapped_antennas = 'ea25'
swappol( vis = pband_ms_path, antenna = swapped_antennas )

NOTE: swappol() is not part of the default CASA release, but was imported together with fixlowband(). It is a pure python implementation operating on visibilities, therefore it may take a while to finish. The way in which we run swappol() we overwrite the previous visibilities, so no new measurement set is created.

You can run flagdata() again in calculate mode to inspect the results.

Automatic flagging

Before throwing away bad spectral windows by hand, we'll let the automated flaggers in CASA have a go at it. The main task for this is flagdata(). While working on your data, flagdata() produces an abundance of output, not all of which I find easy to understand. To help with this, we can get the flag status of our data before and after auto-flagging by running flagdata() in the summary mode. This first call provides the flag statistics of the flagging up to now, before any auto-flagging:

summary_1 = flagdata( vis = pband_ms_path, mode = 'summary' )

This returns a python dictionary with flagged versus total visibilities along various axes (antenna, scan, spw, field, correlation, etc.). For example, if we want to know the percentage flagged per scan, run the following (note that the scans may not appear in sorted order):

axis = 'scan'
for id, stats in summary_1[ axis ].iteritems():
  print '%s %s: %5.1f percent flagged' % ( axis, id, 100. * stats[ 'flagged' ] / stats[ 'total' ] )

For this example, you will notice that scan 1 (the dummy scan on 3C48D) is 100 percent flagged, which is what we did during the initial flagging.

We will run flagdata() in the tfcrop mode, which will (per scan, per baseline, per spectral window, per polarization) look for visibility amplitude outliers. It uses a 5-piece polynomial in an attempt to remove any intrinsic bandpass (amplitude) structure (we did not calibrate for bandpass yet). We run the task twice to allow for slightly deeper flagging. For the first run, we tell CASA to make a backup of our visibility flag status up to here, giving us an option to restore them if we choose over-aggresive flagging parameters and consequently over-flag our data. The flag backup file name is reported in the log window, and can be found in the TSUB0001_20131221_PBAND.ms.flagversions directory.

flagdata( vis = pband_ms_path, field = '*', mode = 'tfcrop', datacolumn = 'data', timecutoff = 5., freqcutoff = 5., maxnpieces = 5,
  action = 'apply', display = 'report', flagbackup = True )
flagdata( vis = pband_ms_path, field = '*', mode = 'tfcrop', datacolumn = 'data', timecutoff = 5., freqcutoff = 5., maxnpieces = 5,
  action = 'apply', display = 'report', flagbackup = False )

Let's get another flag summary to see how much extra data got flagged:

summary_2 = flagdata( vis = pband_ms_path, mode = 'summary' )

axis = 'scan'
for value, stats in summary_2[ axis ].iteritems():
  old_stats = summary_1[ axis ][ value ]
  print '%s %s: %5.1f percent flagged additionally' % ( axis, value, 100. * ( stats[ 'flagged' ] - old_stats[ 'flagged' ] ) / stats[ 'total' ] )

There is quite a bit of RFI that gets missed in RFI-rich spectral windows. The solution to this is to provide the flagging routines with more contrast between healthy and affected data. For this, we will combine the 16 spectral windows into one, and perform a preliminary bandpass calibration to take out the bandpass edge effects between spectral windows, and other intrinsic bandpass structure. Combining spectral windows can only be done with the experimental task mstransform(), which seems to work ok for this type of operation. A new measurement set will be created.

onespw_ms_path = project_name + '_PBAND_1SPW.ms'
if os.path.isdir( pband_ms_path ):
  rmtables( onespw_ms_path )

mstransform( vis = pband_ms_path, outputvis = onespw_ms_path, datacolumn = 'data', combinespws = True )

Before we do more flagging, we will approximately remove the frequency structure across the observing band that is induced by our instrument (the VLA). This bandpass calibration, one per antenna and polarization, is derived using the primary calibrator 3C48. In the measurement set, all (uncalibrated) visibility data is located in a column named DATA. To perform any kind of calibration, we need to create and fill a MODEL_DATA column that matches the dimensions of the DATA column. Logically, the MODEL_DATA column contains the visibilities corresponding to whatever radio source model we choose to calibrate against. For our bandpass calibration against 3C48, we assume an unpolarized point source model with a frequency-dependent flux density as described by the Scaife & Heald (2012) low-frequency flux scale. Note that this operation roughly doubles the size of the measurement set.

fluxscale = setjy( vis = onespw_ms_path, field = bcal_field, scalebychan = True, standard = 'Scaife-Heald 2012', usescratch = True )

There will probably be some warnings about a time-dependent feed table in this and subsequent CASA task output logs. This is caused by a problem in the FEED sub-table created by mstransform(), but can be ignored. The logger reports the stokes I flux of 3C48, which should be in the range of 50-60 Jy. For P-band you really need calibrators of this magnitude. Any calibrator will be competing with typically 40-50 Jy of total apparent flux of other sources within the same wide field-of-view. Experience shows that observing a 20-30 Jy source like 3C286 for a few minutes is insufficient to derive good-quality bandpass calibrations. Better options are 3C48 and 3C147.

We set the minimum signal-to-noise (SNR) requirement to 0.1 to allow for as many solutions as possible, even for channels that have much RFI. Our aim here is not to create a clean bandpass calibration, but to have a first-order correction to perform more flagging.

bcal_field = '3C48'
bcal_scan = '4'
reference_antenna = 'ea01'
bandpass_table_v0 = onespw_ms_path + '.B0'

if os.path.isdir( bandpass_table_v0 ):
  rmtables( bandpass_table_v0 )

bandpass( vis = onespw_ms_path, caltable = bandpass_table_v0, field = bcal_field, selectdata = True, refant = reference_antenna,
  solint = 'inf', combine = 'scan,field', minsnr = 0.1, solnorm = False, parang = True )

The task bandpass() will report failed solutions for several channels, which is due to our previous flagging and because of heavily RFI-affected channels. The resulting bandpass calibration table can be inspected using plotcal(). It may be necessary to adjust the plotrange parameters ( min_freq_MHz, max_freq_MHz, min_amp, max_amp ) to get an optimal view for all antennas.

plotcal( caltable = bandpass_table_v0, xaxis = 'freq', yaxis = 'amp', iteration = 'antenna', plotrange = [ 200.,500.,0.,0.1 ], markersize = 2.0 )

Each plot shows the bandpass amplitudes for both polarizations X and Y. The Mark Region and Locate buttons can be used to identify the polarization, pol = 0 being X and pol = 1 being Y. You will notice that each antenna and polarization has a different overall gain level. Spectral windows are recognizable by the depressions in the bandpass plots, where spw = 2,9,13 are particularly affected by RFI. Several antennas / polarizations show significant ripples across frequency, which are mostly caused by cable connector problems (e.g., see antenna ea07). More information is provided in the Appendix below. As expected, antenna ea25 has one polarization that has a very low-amplitude bandpass plot.

summary_3 = flagdata( vis = onespw_ms_path, mode = 'summary' )

axis = 'scan'
for id, stats in summary_3[ axis ].iteritems():
  print '%s %s: %5.1f percent flagged' % ( axis, id, 100. * stats[ 'flagged' ] / stats[ 'total' ] )

Although we could, we are not going to flag any bad bandpass calibration solutions in =plotcal()= (we'll do that later). The bandpass table, with all its obvious problems, is temporarily applied to the data to facilitate more flagging. This means that the visibility data (of all sources) in the DATA column gets corrected with the bandpass calibration table and copied into the CORRECTED_DATA column. Channels for which no solution was found will be flagged. Note that this operation, together with the MODEL_DATA column created by setjy(), roughly triples the size of the measurement set with respect to the original. This may take a while.

applycal( vis = onespw_ms_path, field = '*', gaintable = [ bandpass_table_v0 ], interp = [ 'nearest,nearestflag' ],
  applymode = 'calflagstrict', flagbackup = True )

For convenience, we have made a backup of the flagged data status. Now that we have flattened out most of the bandpass structure, we run flagdata() in the rflag mode on the corrected data, which performs a more rigorous outlier detection. First we check the results before applying them. Remember that the data now only has one spectral window.

flagdata( vis = onespw_ms_path, field = '*', mode = 'rflag', datacolumn = 'corrected', winsize = 5,
  action = 'calculate', display = 'both', flagbackup = False )

You can see that most of antenna ea25 gets flagged. This is likely because much of the dead polarization data gets flagged, and these flags get extended to the other polarization products. The task =flagdata()= in =rflag= mode does a good job in catching most of the obvious RFI without over-flagging, so we'll now run it twice to apply the flags.

flagdata( vis = onespw_ms_path, field = '*', mode = 'rflag', datacolumn = 'corrected', winsize = 5,
  action = 'apply', display = 'report', flagbackup = False )
flagdata( vis = onespw_ms_path, field = '*', mode = 'rflag', datacolumn = 'corrected', winsize = 5,
  action = 'apply', display = 'report', flagbackup = False )

By getting the new flagging statistics, we can indeed confirm that ea25 is now also completely flagged.

summary_4 = flagdata( vis = onespw_ms_path, mode = 'summary' )

axis = 'antenna'
for value, stats in summary_4[ axis ].iteritems():
  old_stats = summary_3[ axis ][ value ]
  print '%s %s: %5.1f percent flagged' % ( axis, value, 100. * ( stats[ 'flagged' ] ) / stats[ 'total' ] )

axis = 'scan'
for value, stats in summary_4[ axis ].iteritems():
  old_stats = summary_3[ axis ][ value ]
  print '%s %s: %5.1f percent flagged additionally' % ( axis, value, 100. * ( stats[ 'flagged' ] - old_stats[ 'flagged' ] ) / stats[ 'total' ] )

Now we have removed most of the RFI, we will start the actual calibration of the data. We go back to our original data in the DATA column by removing the the CORRECTED_DATA column. For this, we use the table toolkit.

tb.open( onespw_ms_path, nomodify = False )
tb.removecols( 'CORRECTED_DATA' )


Delay and bandpass calibration

The initial flagging has cleaned the data of most of the RFI. This allows us to proceed with deriving the necessary calibrations. First we use the full bandwidth on a single scan on the primary calibrator to determine a single delay per antenna per polarization. This will determine a single (approximate) phase slope across frequency, mainly caused by propagation effects in the (time-variable) ionosphere, cable length differences and electronics in the signal paths from antenna feeds to correlator. Note that for a short (5-10 minutes), single scan on the calibrator we can get away with solving for a time-invariant delay per antenna per polarization. We will again use scan 4, which is the longest scan on 3C48.

bcal_field = '3C48'
bcal_scan = '4'
reference_antenna = 'ea01'
delay_table_v1 = onespw_ms_path + '.K1'

if os.path.isdir( delay_table_v1 ):
  rmtables( delay_table_v1 )

gaincal( vis = onespw_ms_path, caltable = delay_table_v1, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', refant = reference_antenna, minsnr = 3.0, gaintype = 'K', parang = True )

plotcal( caltable = delay_table_v1, xaxis = 'antenna', yaxis = 'delay', markersize = 2.0 )

Since we picked ea01 as our reference antenna, the delays for this antenna ID (=0) are arbitrarily set to zero. All other delays should be within 100 nanosec or so (which they should be for this data set). Larger values should be treated with suspicion, probably indicating a problem with the antenna / polarization. We will use the bandpass calibration (next) to verify this.

Next is to determine the bandpass calibration. We use the same source and scan, and apply the delay calibration before solving for the bandpass. Note that we request a minimum SNR of 3, which will make the solve fail for the worst channels, but not all.

bandpass_table_v1 = onespw_ms_path + '.B1'

if os.path.isdir( bandpass_table_v1 ):
  rmtables( bandpass_table_v1 )

bandpass( vis = onespw_ms_path, caltable = bandpass_table_v1, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', refant = reference_antenna, minsnr = 3.0, solnorm = False, parang = True,
  gaintable = [ delay_table_v1 ], interp = [ 'nearest,nearestflag' ] )

After solving, we inspect the bandpass calibration amplitudes and phases, and flag any obvious residual outliers. Outliers may be found above or below the average bandpass curves, and tend to arise in the same channel (frequency) ranges for all antennas. Don't spend more than 1 minute per antenna. And don't worry about missing some bad points; these bad channels will likely get flagged in a later stage anyway. Note that the bandpass amplitudes and phases may differ between polarizations of the same antenna. Also note that the bandpass phases across frequency should not have an overall gradient (this was removed by the delay calibration). Also also note that bandpass phases may wrap around from +/-180 to -/+180 degrees. If that is problematic, re-run the phase plotcal() with plotrange = [ 200.,500.,0.,360. ]. Also note that there are no bandpass solutions for antennas ea10, ea13 and ea25.

plotcal( caltable = bandpass_table_v1, xaxis = 'freq', yaxis = 'amp', iteration = 'antenna', markersize = 2.0 )

plotcal( caltable = bandpass_table_v1, xaxis = 'freq', yaxis = 'phase', iteration = 'antenna', plotrange = [ 200.,500.,-180.,180. ], markersize = 2.0 )

Flagging is made easier by maximizing the plotcal() window. We don't set the scale for plotting the bandpass amplitudes, which is convenient when flagging outliers as it will auto-rescale. While plotting the bandpass phases we do fix the scale, as it provides a better feeling for the relevant magnitude of phase outliers. Flagging of outliers is done using the Mark Region and Flag buttons. We are using this mechanism to efficiently flag data manually, since it is done per antenna rather than per baseline. These 'flags' will not be permanent until we call applycal() later on, but are effective while applying the calibration on the fly, as we will do in subsequent calibration steps.

Gain calibration

With the delay and bandpass calibration in place, we will now look more closely at the time-variable behaviour of the VLA. We use the same scan on our primary calibrator 3C48 over the full effective bandwidth to determine gain calibrations: one complex value per antenna per polarization per integration time. The delay and bandpass tables are applied on the fly.

gain_table_v1 = onespw_ms_path + '.G1'

if os.path.isdir( gain_table_v1 ):
  rmtables( gain_table_v1 )

gaincal( vis = onespw_ms_path, caltable = gain_table_v1, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'int', refant = reference_antenna, minsnr = 3.0, gaintype = 'G', calmode = 'ap',
  gaintable = [ delay_table_v1, bandpass_table_v1 ], interp = [ 'nearest,nearestflag', 'nearest,nearestflag' ], parang = True )

Similar to the bandpass flagging, we will flag outlier gain solutions in both amplitude and phase. The regular dips in the gain amplitudes every 4 minutes is caused by a very strong RFI signal from the local VLA site radio; this issue is being addressed. For the amplitudes, the final scatter around an average close to one should be a few percent. Note that antenna ea12 has quite a large bump in the gain amplitudes on one polarization. For this and other cases, remember that the gain solutions for both polarizations can be flagged together, as we will loose the equivalent data of the surviving polarization anyway.

plotcal( caltable = gain_table_v1, xaxis = 'time', yaxis = 'amp', iteration = 'antenna', markersize = 2.0 )

plotcal( caltable = gain_table_v1, xaxis = 'time', yaxis = 'phase', iteration = 'antenna', plotrange = [ None,None,-180.,180. ], markersize = 2.0 )

Note that when plotting gains against time, it is convenient to use auto-scaling of the time axis by putting None in the first two fields of plotrange.

Next, we will prepare a smoothed and interpolated version of the gain calibration table, which will be applied later on to the target field data. This prevents flagging of target field data when one of the edges of a calibrator scan is flagged for one or more antennas. The result can be inspected with plotcal.

smooth_gain_table_v1 = onespw_ms_path + '.Gs1'

if os.path.isdir( smooth_gain_table_v1 ):
  rmtables( smooth_gain_table_v1 )

smoothcal( vis = onespw_ms_path, tablein = gain_table_v1, caltable = smooth_gain_table_v1, smoothtype = 'median', smoothtime = 60. * 60. )

plotcal( caltable = smooth_gain_table_v1, xaxis = 'time', yaxis = 'amp', iteration = 'antenna', plotrange = [ None,None,0.8,1.2 ], markersize = 2.0 )

plotcal( caltable = smooth_gain_table_v1, xaxis = 'time', yaxis = 'phase', iteration = 'antenna', plotrange = [ None,None,-10.,10. ], markersize = 2.0 )

Instrumental polarization calibration

The VLA at P-band measures the incoming radio waves using dual-dipole feeds mounted near the primary focus of the dishes. While much effort is put into mechanically making these dual-dipoles orthogonal, and to strongly suppress any cross-talk between the electronic signal paths of both dipoles, there is some (frequency-dependent) polarization leakage between them. Furthermore, the P-band feeds are mounted and aligned on each antenna by hand, allowing for 5-10 degrees difference in the orientation of the dipoles between antennas. For both intensity and polarization imaging, it is important to calibrate the polarization products in our data to optimize the (i) orthogonality between X and Y for each antenna and (ii) the alignment between Xs and Ys of different antennas. Note that this does not (yet) include calibration of the absolute polarization angle, for which we need a linearly polarized source with known polarization angle.

First we calibrate the cross delay, which is the difference in delay between the two polarizations of the reference antenna, in our case ea01. Remember that the regular delay calibration arbitrarily set the delays for both polarizations of the reference antenna to zero. In reality, there will be a small but noticeable physical path length difference between these two polarizations, for which we try to correct. This calibration uses the same scan on 3C48, assuming that 3C48 is unpolarized at P-band (which is a good approximation).

cross_delay_table_v1 = onespw_ms_path + '.Kc1'

if os.path.isdir( cross_delay_table_v1 ):
  rmtables( cross_delay_table_v1 )

gaincal( vis = onespw_ms_path, caltable = cross_delay_table_v1, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', refant = reference_antenna, minsnr = 3.0, gaintype = 'KCROSS', parang = True,
  gaintable = [ delay_table_v1, bandpass_table_v1, gain_table_v1 ], interp = [ 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag' ] )

plotcal( caltable = cross_delay_table_v1, xaxis = 'antenna', yaxis = 'delay', markersize = 2.0 )

The plot shows a small ~2 nanosec delay between X and Y of the reference antenna. Note that the table is generated in such a way that this correction gets applied to all antennas.

Next, we determine the leakage of the X signal into Y, and Y into X. For a well-behaved instrument, the magnitude of the leakage (the so-called D-terms) should be much less than one. Since the leakage is frequency dependent, we solve for leakage for each frequency channel, somewhat similar to bandpass calibration. The way to do this in CASA is through the polcal() task with the option poltype = 'Df'. Again, we can inspect (and edit) the leakage calibration table with plotcal()

leakage_table_v1 = onespw_ms_path + '.Df1'

if os.path.isdir( leakage_table_v1 ):
  rmtables( leakage_table_v1 )

polcal( vis = onespw_ms_path, caltable = leakage_table_v1, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', preavg = 1., refant = reference_antenna, minsnr = 3.0, poltype = 'Df',
  gaintable = [ delay_table_v1, bandpass_table_v1, gain_table_v1, cross_delay_table_v1 ],
  interp = [ 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag' ] )

plotcal( caltable = leakage_table_v1, xaxis = 'freq', yaxis = 'amp', iteration = 'antenna', plotrange = [ 200.,500.,0.,1. ], markersize = 2.0 )

Before starting any flagging, it is best to first cycle past all antennas to get a feel for the magnitude of the leakage terms. For most antennas, both leakage terms are roughly equal in amplitude across the band, with values between 0 and 0.15. Few have higher leakages (up to 0.3). The most likely explanation is a larger misalignment of the P-band feed of these antennas relative to the others. Examples are ea05, ea11, and ea24. These antennas do not need to be flagged, but are merely expected to benefit most from this polarization calibration. Some examples of suspicious data that is probably best to flag: antenna ea03 has a peculiar bump in one of the leakage terms, ea12 has an upturn at the low-frequency end of the band, ea14 has a peak near the low-frequency end of the band.

It is better not to flag based on phases, because the closer the amplitude of the leakage terms get to zero, the more erratic the phase behaviour will be. An alternative is to flag based on the real and imaginary parts, which you may or may not want to do.

plotcal( caltable = leakage_table_v1, xaxis = 'freq', yaxis = 'real', iteration = 'antenna', plotrange = [ 200.,500.,-0.5,0.5 ], markersize = 2.0 )

plotcal( caltable = leakage_table_v1, xaxis = 'freq', yaxis = 'imag', iteration = 'antenna', plotrange = [ 200.,500.,-0.5,0.5 ], markersize = 2.0 )

Re-iterate bandpass and leakage calibration

The largest instrumental effects on our data are now captured in several relevant calibration tables. But since we have used uncalibrated / partially calibrated data to determine these calibration tables, there are improvements to be obtained when re-determining some calibration tables. In this guide we re-calibrate the bandpass and polarization leakage since these are expected to have changed most. Our re-calibrations will be relative to the existing bandpass and leakage calibrations, so that we can more easily see deviations from the ideal case.

First we update the bandpass calibration by determining a new, relative bandpass table while applying all previous calibrations. In the ideal case that our data and previous calibrations were perfect, the new bandpass table would contain gain amplitudes and phases of one and zero, respectively, for all (unflagged) antennas, polarizations and channels.

bandpass_table_v1a = onespw_ms_path + '.B1a'

if os.path.isdir( bandpass_table_v1a ):
  rmtables( bandpass_table_v1a )

bandpass( vis = onespw_ms_path, caltable = bandpass_table_v1a, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', refant = reference_antenna, minsnr = 3.0, solnorm = False, parang = True,
  gaintable = [ delay_table_v1, bandpass_table_v1, gain_table_v1, cross_delay_table_v1, leakage_table_v1 ],
  interp = [ 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag' ] )

plotcal( caltable = bandpass_table_v1a, xaxis = 'freq', yaxis = 'amp', iteration = 'antenna', plotrange = [ 200.,500.,0.,2. ], markersize = 2.0 )

plotcal( caltable = bandpass_table_v1a, xaxis = 'freq', yaxis = 'phase', iteration = 'antenna', plotrange = [ 200.,500.,-180.,180. ], markersize = 2.0 )

In the plotcal() calls above, the vertical axis scales have been chosen to indicate the relevance of the ideal case of unity amplitudes and zero phases. For most antennas, the bandpass solutions are close to ideal. Some antennas show some noticeable structure across frequency. A general guideline for flagging (or not) is that we expect the relative bandpass to be close to one in amplitude and vary smoothly in frequency.

leakage_table_v1a = onespw_ms_path + '.Df1a'

if os.path.isdir( leakage_table_v1a ):
  rmtables( leakage_table_v1a )

polcal( vis = onespw_ms_path, caltable = leakage_table_v1a, field = bcal_field, selectdata = True, scan = bcal_scan,
  solint = 'inf', preavg = 1., refant = reference_antenna, minsnr = 3.0, poltype = 'Df',
  gaintable = [ delay_table_v1, bandpass_table_v1, bandpass_table_v1a, gain_table_v1, cross_delay_table_v1, leakage_table_v1 ],
  interp = [ 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag' ] )

plotcal( caltable = leakage_table_v1a, xaxis = 'freq', yaxis = 'amp', iteration = 'antenna', plotrange = [ 200.,500.,None,None ], markersize = 2.0 )

Transfer of calibrations to the target field

So far, we have used one scan on the primary calibrator 3C48 to derive various calibration tables. These calibrations will now be applied to all scans of all sources in our measurement set, which includes 3C48 itself and our target field 0313-192. As described above, the task applycal() creates a CORRECTED_DATA column in which the calibrated visibilities get stored. This may take a while.

applycal( vis = onespw_ms_path, field = '*', selectdata = True, parang = True, applymode = 'calflagstrict', flagbackup = True,
    gaintable = [ delay_table_v1, bandpass_table_v1, bandpass_table_v1a, smoothed_gain_table_v1, cross_delay_table_v1, leak_table_v1, leak_table_v1a ],
    interp = [ 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag', 'nearest,nearestflag',
    'nearest,nearestflag', 'nearest,nearestflag' ] )

Now we split off the calibrated target field data, meaning that the visibilities of source 0313-192 get copied from the CORRECTED_DATA column to the DATA column of a new measurement set. This is convenient for further processing.

target_name = '0313-192'
target_ms_path = target_name + '_PBAND_1SPW.ms'

if os.path.isdir( target_ms_path ):
  rmtables( target_ms_path )

split( vis = onespw_ms_path, outputvis = target_ms_path, datacolumn = 'corrected', field = target_name, keepflags = False )

Cross-polarization flagging

All flagging up to this point was done on the calibrator 3C48, and some of these flags got propagated to the target field data. we will now extend the flagging of the target field data by flagging based on the cross-polarization visibilities XY and YX. Most RFI is strongly polarized, and will stand out clearly against the typically very weakly polarized target field.

flagdata( vis = pband_ms_path_1spw, mode = 'rflag', field = '*', correlation = 'XY,YX', datacolumn = 'corrected', winsize = 5,
  action = 'apply', display = 'report', flagbackup = False )

At this point, we're ready to make a first image of our target field, after which we will self-calibrate and re-image. [...]

Appendix: Some P-band data issues you may want to know about

Unfortunately, there are a few known issues with JVLA P-band observing. Most problems were discovered during the P-band commissioning period and have been fixed for newer data sets, but the archive keeps part of this history alive. Here we will go over some of those issues, and (if possible) provide ways of fixing them.

Polarization labeling

For a long period (until recent), P-band feeds have been labelled as being circular (R and L), while they are linear (X and Y). A contributed task =fixlowband()= is available to recognize and fix this problem.

Swapped polarizations

Throughout the whole history of the new VLA low-band system, even up to this day, there have been mistakes in the cabling of the full signal chain. This results in that some antennas have the X-polarization signal come in as Y and vice versa. The way to notice this in visibility data is that for baselines with one swapped antenna most power will be in the cross-hand correlations (XY and YX) rather than the parallel-hand correlations (XX and YY). Once antennas with this feature are identified (manually), a contributed task swappol() is available to fix this problem.

Double data descriptor entries

The data description table is part of the measurement set, and provides a link between the recorded visibilities, the spectral window information and the polarization information. For some data sets, the data description table contains two entries for each P-band spectral window, one pointing towards a circular (RL) polarization definition, and one pointing towards a linear (XY) polarization definition. A contributed task fixlowband() is available to recognize and fix this problem.

Continuous Radio Frequency Interference (RFI)

The wide bandwidth of the low-band receiver is (unfortunately) guaranteed to contain significant amounts of RFI. There are a few RFI sources that are active all the time, and are visible in all array configurations. The default P-band observing setup of 16 x 16 MHz tries to capture as much of the continuous RFI in as few spectral windows as possible, allowing for a simple RFI mitigation strategy in which these spectral windows (spws 1,2 and 9,10, but possibly more) can be immediately flagged.

Two spectral window setup

In the early commissioning period, the default setup for P-band observing was to use 2 spectral windows of 1024 channels each to cover 256 MHz of bandwidth from 230 - 486 MHz. However, it was noticed that strong, narrow-band RFI events were causing data to be lost for the whole spectral window in which they occured. To make the system more robust against data loss from these events, the frequency range was divided into 16 x 16 MHz, and shifted downwards by 6 MHz to capture ever-present RFI into as few spectral windows as possible. If your data has only two P-band spectral windows, please be aware that a higher data loss due to RFI is possible. There is no way to repair this.

Bandpass ripples

Due to signal reflections in cables and within the VLA dish, sinusoidal amplitude (and phase) modulations are always present in P-band data. This is most easily seen in bandpass calibration plots of amplitude versus frequency. In some cases, mostly due to cable & connector problems, these modulations can be very strong (~up to 50 percent of the average amplitude level). These modulations tend not vary over the duration of an observation and can therefore be removed through bandpass (and polarization) calibration. If they are found to be variable (e.g., by inspecting bandpass solutions for separate calibrator scans), the offending antenna / polarization should be flagged.

High cross-polarization

On each antenna, the P-band feed (dipole) is visually aligned with respect to the primary focus support legs. This is normally done within 5-10 degrees accuracy. On rare occasions the dipole on one antenna has accidentally rotated to much larger angles because the locking bolt on the back of the feed was not completely tightened. The result is a high cross-polarization in the baseline visibilities that include this antenna. If this is the case, it is safest to flag this data, since there is a possibility that the dipole has rotated during the observations.

Useful links