EVLA Wide-Band Wide-Field Imaging: G55.7 3.4-CASA4.4

From CASA Guides
Revision as of 12:30, 3 November 2015 by Jott (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search
  • This CASA Guide is designed for CASA 4.4.


This CASA Guide describes the imaging of the supernova remnant G55.7+3.4.. The data were taken on August 23, 2010, in the first D-configuration for which the new wide-band capabilities of the WIDAR correlator were available. The 8-hour-long observation includes all available 1 GHz of bandwidth in L-band, from 1-2 GHz in frequency.

Obtaining the data

A copy of the data can be downloaded here: http://casa.nrao.edu/Data/EVLA/G55/G55.7+3.4_10s.ms.tar.gz

Note that this dataset is rather large: ~15GB

As a start, unzip and untar the data:

tar -xzvf G55.7+3.4_10s.ms.tar.gz

This will take a minute, but once it's complete, you will have a directory called G55.7+3.4_10s.ms which is the data. Online flags have been applied (which delete known bad data), some uninteresting scans removed, and the data time-averaged to 10 seconds. (The data were taken in D-configuration, where maximum baselines are 1 km, so one can safely average to 3s or even 10s to reduce data set size.) This is equivalent to what you would download from the archive if you requested time-averaging, scans 16~313, and application of the online flags.

You can also find the dataset in the NRAO archive. Note that it is 170 GB in raw form.

Averaging to 10 seconds and the removal of some scans which are not used in this tutorial reduces the size of the data set to around 15 GB; the addition of columns for model and corrected data (known as "scratch columns") during calibration will ultimately inflate the MS by a factor of a few in size (to around 45 GB).

Start and confirm your version of CASA

Start CASA by typing casapy on the command line. If you have not used CASA before, some helpful tips are available on the Getting Started in CASA page.

This guide has been written for CASA release 4.4.0. Please confirm your version before proceeding by checking the message in the command line interface window or the CASA logger after startup.

Preliminary data evaluation

As a first step, use listobs to have a look at the MS:


Note that throughout this tutorial, we will run tasks using the task(parameter=value) syntax. When called in this manner, all parameters not explicitly set will use their default values.

The logger output will look like this:

##### Begin Task: listobs            #####
           MeasurementSet Name:  /lustre/lsjouwer/casaguides/guide_g55.7+3.4/G55.7+3.4_10s.ms      MS Version 2
   Observer: Dr. Sanjay Sanjay Bhatnagar     Project: T.B.D.  
Observation: EVLA
Data records: 7343848       Total elapsed time = 26697 seconds
   Observed from   23-Aug-2010/01:00:24.0   to   23-Aug-2010/08:25:21.0 (UTC)
   ObservationID = 0         ArrayID = 0
  Date        Timerange (UTC)          Scan  FldId FieldName             nRows     SpwIds   Average Interval(s)    ScanIntent
  23-Aug-2010/01:00:24.0 - 01:01:05.0    16      1 J1925+2106                8008  [0,1,2,3,4,5,6,7]  [9.64, 9.64, 9.64, 9.64, 9.64, 9.64, 9.64, 9.64] [CALIBRATE_PHASE.UNSPECIFIED]
              01:01:05.0 - 01:02:35.0    17      1 J1925+2106               25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [CALIBRATE_PHASE.UNSPECIFIED]
              01:02:35.0 - 01:04:05.0    18      1 J1925+2106               25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [CALIBRATE_PHASE.UNSPECIFIED]
              01:04:05.0 - 01:05:34.0    19      1 J1925+2106               25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [CALIBRATE_PHASE.UNSPECIFIED]
              01:05:34.0 - 01:07:04.0    20      1 J1925+2106               25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [CALIBRATE_PHASE.UNSPECIFIED]
              01:07:10.0 - 01:08:34.0    21      2 G55.7+3.4                25064  [0,1,2,3,4,5,6,7]  [9.38, 9.38, 9.38, 9.38, 9.38, 9.38, 9.38, 9.38] [OBSERVE_TARGET.UNSPECIFIED]
              01:08:34.0 - 01:10:04.0    22      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:10:04.0 - 01:11:34.0    23      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:11:34.0 - 01:13:03.0    24      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [OBSERVE_TARGET.UNSPECIFIED]
              01:13:03.0 - 01:14:33.0    25      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:14:33.0 - 01:16:03.0    26      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:16:03.0 - 01:17:33.0    27      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:17:33.0 - 01:19:02.0    28      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [OBSERVE_TARGET.UNSPECIFIED]
              01:19:02.0 - 01:20:32.0    29      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              01:20:32.0 - 01:22:02.0    30      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              08:04:26.0 - 08:05:55.0   300      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [OBSERVE_TARGET.UNSPECIFIED]
              08:05:55.0 - 08:07:25.0   301      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              08:07:25.0 - 08:08:55.0   302      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              08:08:55.0 - 08:10:25.0   303      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              08:10:25.0 - 08:11:54.0   304      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [OBSERVE_TARGET.UNSPECIFIED]
              08:11:54.0 - 08:13:24.0   305      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [OBSERVE_TARGET.UNSPECIFIED]
              08:13:24.0 - 08:14:53.0   306      2 G55.7+3.4                25272  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [OBSERVE_TARGET.UNSPECIFIED]
              08:17:47.0 - 08:17:54.0   308      3 0542+498=3C147              80  [0,1,2,3,4,5,6,7]  [7, 7, 7, 7, 7, 7, 7, 7] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
              08:17:54.0 - 08:19:23.0   309      3 0542+498=3C147           17152  [0,1,2,3,4,5,6,7]  [9.88, 9.88, 9.88, 9.88, 9.88, 9.88, 9.88, 9.88] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
              08:19:23.0 - 08:20:53.0   310      3 0542+498=3C147           18216  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
              08:20:53.0 - 08:22:23.0   311      3 0542+498=3C147           18216  [0,1,2,3,4,5,6,7]  [10, 10, 10, 10, 10, 10, 10, 10] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
              08:22:23.0 - 08:23:52.0   312      3 0542+498=3C147           18216  [0,1,2,3,4,5,6,7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
              08:23:52.0 - 08:25:21.0   313      3 0542+498=3C147           18216  [0, 1, 2, 3, 4, 5, 6, 7]  [9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89, 9.89] [CALIBRATE_AMPLI.UNSPECIFIED,CALIBRATE_BANDPASS.UNSPECIFIED,UNSPECIFIED.UNSPECIFIED]
           (nRows = Total number of rows per scan) 
Fields: 3
  ID   Code Name                RA               Decl           Epoch   SrcId      nRows
  1    D    J1925+2106          19:25:59.605371 + J2000   1        1004816
  2    NONE G55.7+3.4           19:21:40.000000 + J2000   2        6248936
  3    N    0542+498=3C147      05:42:36.137916 + J2000   3          90096
Spectral Windows:  (8 unique spectral windows and 1 unique polarization setups)
  SpwID  Name      #Chans   Frame   Ch0(MHz)  ChanWid(kHz)  TotBW(kHz) CtrFreq(MHz)  Corrs          
  0      Subband:3     64   TOPO    1000.000      2000.000    128000.0   1063.0000   RR  RL  LR  LL
  1      Subband:1     64   TOPO    1128.000      2000.000    128000.0   1191.0000   RR  RL  LR  LL
  2      Subband:0     64   TOPO    1256.000      2000.000    128000.0   1319.0000   RR  RL  LR  LL
  3      Subband:2     64   TOPO    1384.000      2000.000    128000.0   1447.0000   RR  RL  LR  LL
  4      Subband:3     64   TOPO    1520.000      2000.000    128000.0   1583.0000   RR  RL  LR  LL
  5      Subband:1     64   TOPO    1648.000      2000.000    128000.0   1711.0000   RR  RL  LR  LL
  6      Subband:0     64   TOPO    1776.000      2000.000    128000.0   1839.0000   RR  RL  LR  LL
  7      Subband:2     64   TOPO    1904.000      2000.000    128000.0   1967.0000   RR  RL  LR  LL
Sources: 24
  ID   Name                SpwId RestFreq(MHz)  SysVel(km/s) 
  1    J1925+2106          0     -              -            
  1    J1925+2106          1     -              -            
  1    J1925+2106          2     -              -            
  1    J1925+2106          3     -              -            
  1    J1925+2106          4     -              -            
  1    J1925+2106          5     -              -            
  1    J1925+2106          6     -              -            
  1    J1925+2106          7     -              -            
  2    G55.7+3.4           0     -              -            
  2    G55.7+3.4           1     -              -            
  2    G55.7+3.4           2     -              -            
  2    G55.7+3.4           3     -              -            
  2    G55.7+3.4           4     -              -            
  2    G55.7+3.4           5     -              -            
  2    G55.7+3.4           6     -              -            
  2    G55.7+3.4           7     -              -            
  3    0542+498=3C147      0     -              -            
  3    0542+498=3C147      1     -              -            
  3    0542+498=3C147      2     -              -            
  3    0542+498=3C147      3     -              -            
  3    0542+498=3C147      4     -              -            
  3    0542+498=3C147      5     -              -            
  3    0542+498=3C147      6     -              -            
  3    0542+498=3C147      7     -              -            
Antennas: 27:
  ID   Name  Station   Diam.    Long.         Lat.                Offset from array center (m)                ITRF Geocentric coordinates (m)        
                                                                     East         North     Elevation               x               y               z
  0    ea01  W09       25.0 m   -  +       -521.9416     -332.7766       -1.2001 -1601710.017000 -5042006.925200  3554602.355600
  1    ea02  E02       25.0 m   -  +          9.8240      -20.4293       -2.7806 -1601150.060300 -5042000.619800  3554860.729400
  2    ea03  E09       25.0 m   -  +        506.0564     -251.8670       -3.5825 -1600715.950800 -5042273.187000  3554668.184500
  3    ea04  W01       25.0 m   -  +        -27.3562      -41.3030       -2.7418 -1601189.030140 -5042000.493300  3554843.425700
  4    ea05  W08       25.0 m   -  +       -432.1167     -272.1478       -1.5054 -1601614.091000 -5042001.652900  3554652.509300
  5    ea06  N06       25.0 m   -  +        -54.0649      263.8778       -4.2273 -1601162.591000 -5041828.999000  3555095.896400
  6    ea07  E05       25.0 m   -  +        164.9788      -92.8032       -2.5268 -1601014.462000 -5042086.252000  3554800.799800
  7    ea08  N01       25.0 m   -  +        -30.8810       -1.4664       -2.8597 -1601185.634945 -5041978.156586  3554876.424700
  8    ea09  E06       25.0 m   -  +        236.9058     -126.3369       -2.4443 -1600951.588000 -5042125.911000  3554773.012300
  9    ea10  N03       25.0 m   -  +        -39.0773       93.0192       -3.3330 -1601177.376760 -5041925.073200  3554954.584100
  10   ea11  E04       25.0 m   -  +        102.8054      -63.7682       -2.6414 -1601068.790300 -5042051.910200  3554824.835300
  11   ea12  E08       25.0 m   -  +        407.8285     -206.0065       -3.2272 -1600801.926000 -5042219.366500  3554706.448200
  12   ea13  N07       25.0 m   -  +        -61.1037      344.2331       -4.6138 -1601155.635800 -5041783.843800  3555162.374100
  13   ea15  W06       25.0 m   -  +       -275.8288     -166.7451       -2.0590 -1601447.198000 -5041992.502500  3554739.687600
  14   ea16  W02       25.0 m   -  +        -67.9687      -26.5614       -2.7175 -1601225.255200 -5041980.383590  3554855.675000
  15   ea17  W07       25.0 m   -  +       -349.9877     -216.7509       -1.7975 -1601526.387300 -5041996.840100  3554698.327400
  16   ea18  N09       25.0 m   -  +        -77.4346      530.6273       -5.5859 -1601139.485100 -5041679.036800  3555316.533200
  17   ea19  W04       25.0 m   -  +       -152.8599      -83.8054       -2.4614 -1601315.893000 -5041985.320170  3554808.304600
  18   ea20  N05       25.0 m   -  +        -47.8454      192.6015       -3.8723 -1601168.786100 -5041869.054000  3555036.936000
  19   ea21  E01       25.0 m   -  +        -23.8638      -81.1510       -2.5851 -1601192.467800 -5042022.856800  3554810.438800
  20   ea22  N04       25.0 m   -  +        -42.6239      132.8436       -3.5494 -1601173.979400 -5041902.657700  3554987.517500
  21   ea23  E07       25.0 m   -  +        318.0509     -164.1850       -2.6957 -1600880.571400 -5042170.388000  3554741.457400
  22   ea24  W05       25.0 m   -  +       -210.0959     -122.3887       -2.2577 -1601377.009500 -5041988.665500  3554776.393400
  23   ea25  N02       25.0 m   -  +        -35.6245       53.1806       -3.1345 -1601180.861480 -5041947.453400  3554921.628700
  24   ea26  W03       25.0 m   -  +       -105.3447      -51.7177       -2.6037 -1601265.153600 -5041982.533050  3554834.858400
  25   ea27  E03       25.0 m   -  +         50.6641      -39.4835       -2.7273 -1601114.365500 -5042023.151800  3554844.944000
  26   ea28  N08       25.0 m   -  +        -68.9057      433.1889       -5.0602 -1601147.940400 -5041733.837000  3555235.956000
##### End Task: listobs              #####

We can see that there are three sources in this observation:

  • J1925+2106, field ID 1: the phase calibrator;
  • G55.7+3.4, field ID 2: the supernova remnant;
  • 0542+498=3C147, field ID 3: the flux and bandpass calibrator.

We can also see that these sources have associated "scan intents", which indicate their function in the observation. Note that you can select sources based on their intents in certain CASA tasks. The various scan intents in this dataset are:

  • CALIBRATE_PHASE indicates that this is a scan to be used for gain calibration;
  • OBSERVE_TARGET indicates that this is the science target;
  • CALIBRATE_AMPLI indicates that this is to be used for flux calibration; and
  • CALIBRATE_BANDPASS indicates that these scans are to be used for bandpass calibration.

Note that 3C147 is to be used for both flux and bandpass calibration.

We can see the antenna configuration for this observation using plotants:

plotants image

This shows that antennas ea01, ea18, and ea03 were on the extreme ends of the west, north, and east arms, respectively. The antenna position diagram is particularly useful as a guide to help determine which antenna to use as the reference antenna later during calibration.

We may also inspect the raw data using plotms. To start with, let's look at a subset of scans on the supernova remnant:

plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       antenna='ea24', xaxis='freq', yaxis='amp', coloraxis='spw',
       iteraxis='scan', correlation='RR,LL', symbolshape='circle')

The coloraxis parameter indicates that a different color will be assigned to each spectral window, and the iteraxis parameter tells plotms to display a new plot for each scan. We have chosen only one antenna (ea24) and just the right and left circular polarizations (without the cross-hand terms) to reduce the amount of data in the selection. One can flip through these plots using the green arrows located at the bottom of the plotting GUI: the double-left arrow will display the very first plot in the set, the single left arrow will go back one plot, and the right arrows have similar behavior for moving forward in the set. As of CASA 4.4 some of the tabs on the left have changed compared to earlier versions. "Iter" has been changed to "Page" as more options became available. And we introduced "Calibration" to upload a cal-library file.

plotms image

Flipping through the scans, it's clear that there is significant time- and frequency-variable RFI present in this observation. Since this is L-band data taken in the most compact EVLA configuration ("D"), this comes as no surprise. However, it also poses one of the greatest challenges for obtaining a good image.

In particular, we can see that two spectral windows (SPWs) are quite badly affected. To determine which these are, click in the "Mark Regions" tool at the bottom of the plotms GUI (the open box with a green "plus" sign), and use the mouse to select a few of the highest-amplitude points in each of these SPWs. Click on the "Locate" button (magnifying glass sign), and information associated with the selected points will be displayed in the logger window:

Frequency in [1.22177 1.27139] or [1.5762 1.65063], Amp in [23.1713 24.3056] or [59.6296 63.6806]:
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:20:57.0 BL=ea12@E08 & ea24@W05[11&22] Spw=1 Chan=59 Freq=1.246 Corr=RR X=1.246 Y=23.5243  (38134/11/1526)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:07.0 BL=ea03@E09 & ea24@W05[2&22] Spw=1 Chan=59 Freq=1.246 Corr=RR X=1.246 Y=23.6116  (40310/12/374)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:07.0 BL=ea12@E08 & ea24@W05[11&22] Spw=1 Chan=59 Freq=1.246 Corr=RR X=1.246 Y=23.4432  (41462/12/1526)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:57.0 BL=ea03@E09 & ea24@W05[2&22] Spw=1 Chan=59 Freq=1.246 Corr=RR X=1.246 Y=23.7536  (56950/17/374)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:07.0 BL=ea12@E08 & ea24@W05[11&22] Spw=4 Chan=41 Freq=1.602 Corr=RR X=1.602 Y=61.9097  (131282/39/1490)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:17.0 BL=ea12@E08 & ea24@W05[11&22] Spw=4 Chan=41 Freq=1.602 Corr=RR X=1.602 Y=61.1769  (134610/40/1490)
Scan=30 Field=G55.7+3.4[2] Time=2010/08/23/01:21:27.0 BL=ea12@E08 & ea24@W05[11&22] Spw=4 Chan=41 Freq=1.602 Corr=RR X=1.602 Y=60.1834  (137938/41/1490)
Found 7 points (7 unflagged) among 239616 in 0.02s.

We can see that SPWs 1 and 4 are among the worst affected by RFI. (As an aside, note that the syntax for reporting a selected point's baseline is {antenna 1 name}@{pad 1 name} &{antenna 2 name}@{pad 2 name}[{antenna 1 index}&{antenna 2 index}].) At this point, feel free to play around a bit more with plotms; you might try experimenting with different axes for iteration (under the "Iter" left-hand tab), different data selection parameters (under "Data"), different axes ("Axes"), different averaging techniques (under "Data"), or different selections for the coloraxis (the "colorize" option under "Display").

A priori calibration and flagging

Before we proceed with further processing, we should check the observation log to see if there were any issues noted during the run that need to be addressed. The observing log file is linked to the archive web page for this observation (at far right; under "logs etc."). Looking at the log, we can see that antenna ea07 may need a position correction, and antennas ea06, ea17, ea20, and ea26 did not have L-band receivers installed at the time and should be flagged.

Antenna position correction

Correcting a known position error for an antenna is done with the task gencal. This is important, because the observed visibilities are a function of [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math]. If an antenna's position is incorrect, then [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] will be calculated incorrectly, and there will be errors in any image derived from the data. Of course, the a priori position corrections may not completely account for all errors.

The gencal task will query the VLA Baseline Corrections database to determine what baseline corrections to apply to the dataset. If you wish to double-check this by hand, refer to the EVLA/VLA Baseline Corrections page.

gencal(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.pos',

As reported by the CASA logger, gencal found a position correction for antenna ea07 of (x, y, z) = (0.0087, 0.0137, 0.000) and recorded this in our specified calibration table.

Gain curve and opacity correction

A decision has been made here to ignore both the corrections for atmospheric opacity and for the elevation-dependent telescope gain throughout this tutorial. These effects are very small (less than or about 1%) across the frequency range of this observation (1-2 GHz). At higher frequencies, these corrections may be important and the appropriate calibration tables can be computed using the task gencal. For an example of a tutorial which corrects for atmospheric opacity and the elevation-dependent gain see this [this tutorial.]

Flagging non-operational antennas

In addition to updating the position for antenna ea07, we have to flag antennas ea06, ea17, ea20, and ea26, since these did not have working L-band receivers at the time of observation. We do this with the task flagdata:

flagdata(vis='G55.7+3.4_10s.ms', mode='manual',

Note that the first thing flagdata does is create a backup flag file, in this case named "flagdata_1". This flag file contains a copy of the flags present in the MS prior to the requested flagging operation, and can be found inside the <MS_name>.flagversions directory, along with any other backed up flag files. Since these flag files take up a fair amount of space (in this particular case, 230 MB), we won't me making them every time we run flagdata -- the automatic flag backup can be turned off by setting flagbackup=False. However, it's good to keep a record of the names of the backup files and the associated processing step, in case you wish to restore a previous version of the flags using the flagmanager task.

Flagging shadowed antennas and zero-amplitude data

Since this is the most compact EVLA configuration, there may be instances where one antenna blocks, or "shadows" another. Therefore, we will run flagdata to remove these data:

flagdata(vis='G55.7+3.4_10s.ms', mode='shadow',

In this particular observation, there does not appear to be any data affected by shadowing, as can be seen in the logger report.

In addition, there may be times during which the correlator writes out pure zero-valued data. In order to remove this bad data, we run flagdata to remove any pure zeroes:

flagdata(vis='G55.7+3.4_10s.ms', mode='clip',
         clipzeros=True, flagbackup=False)

Inspecting the logger output which is generated by flagdata shows that there is a very small quantity of zero-valued data (0.02%) present in this MS.

Note that the archive will automatically flag shadowed antennas as well as zero-valued data, if you request that online flags are applied.

Automatic RFI excision

Now, we move on to one of the most difficult parts of L-band, D-configuration data processing: excising the RFI. For the original reduction of this MS, flagging was done by hand and took several weeks. The resulting data are offered as an option for the imaging stage of this tutorial (because careful by-hand flagging does yield a better image); however, it's not always practical to undertake this endeavor, and often the "automatic" flagging provides a reasonable (and much less time-consuming) solution. Therefore, we will demonstrate the use of the automatic RFI excision tools currently available in CASA.

Hanning-smoothing data

Prior to flagging any data which is affect by strong RFI, one should Hanning-smooth the data to remove Gibbs ringing. This is done with the task hanningsmooth, which can either write a new, Hanning-smoothed MS or directly operate on the requested column of the input MS. To conserve space, we will request the latter. Note that if you wish to make your own "before" and "after" plots, you should make the first prior to running hanningsmooth, since you will be overwriting the non-Hanning-smoothed data in the process -- and this is not reversible.

plotms(vis='G55.7+3.4_10s.ms', scan='30', antenna='ea24', spw='0~2',
    xaxis='freq', yaxis='amp', coloraxis='spw', symbolshape = 'circle', 
    correlation='RR,LL', plotrange=[1.0,1.27,-0.3,2.5], 

hanningsmooth(vis='G55.7+3.4_10s.ms', datacolumn='data')

plotms(vis='G55.7+3.4_10s.ms', scan='30', antenna='ea24', spw='0~3', 
    xaxis='freq', yaxis='amp', coloraxis='spw', symbolshape = 'circle',
    correlation='RR,LL', plotrange=[1.0,1.27,-0.3,2.5], 
before Hanning smoothing
after Hanning smoothing

Task hanningsmooth will take a few minutes to run. Note that the 2nd plotms command above contains a trivial change in the spw selection (trivial because the 4th spw is outside of the specified plotrange). This forces plotms to reload the plot since by default, plotms will not redraw a plot if the input parameters are unchanged. In this case, since the data column was changed between calls to plotms, a redraw is necessary. When using the GUI, you can simply check "force reload" in the bottom left corner of the side bar before clicking "Plot."

We can compare the Hanning-smoothed data with the raw data by plotting a subset of data to show the result of Hanning-smoothing (see plots to the left and right). As you can see, the smoothing has spread the single-channel RFI into three channels, but has also removed the effects of some of the worst RFI from a number of channels. Overall, this will improve our ability to flag RFI from the data and retain as much good data as possible.

Using the phase calibration source for preliminary bandpass calibration

In order to get the best possible result from the automatic RFI excision, we will first apply bandpass calibration to the MS. Since the RFI is time-variable, using the phase calibration source to make an average bandpass over the entire observation will mitigate the amount of RFI present in the calculated bandpass. (For the final calibration, we will use the designated bandpass source 3C147; however, since this object was only observed in the last set of scans, it doesn't sample the time variability and would not provide a good average bandpass.)

Since there are likely to be gain variations over the course of the observation, we will run gaincal to solve for an initial set of antenna-based phases over a narrow range of channels. These will be used to create the bandpass solutions. While amplitude variations will have little effect on the bandpass solutions, it is important to solve for these phase variations with sufficient time resolution to prevent decorrelation when vector averaging the data in computing the bandpass solutions.

In order to choose a narrow range of channels for each spectral window which are relatively RFI-free over the course of the observation, we can look at the data with plotms. Note that it's important to only solve for phase using a narrow channel range, since an antenna-specific delay will cause the phase to vary with respect to frequency over the spectral window, perhaps by a substantial amount.

plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       antenna='ea24', xaxis='channel', yaxis='amp', iteraxis='spw',
       yselfscale=True, correlation='RR,LL', symbolshape='circle')
  • yselfscale=True: sets the y-scaling to be for the currently displayed spectral window, since some spectral windows have much worse RFI and will skew the scale for others.

Looking at these plots, we can choose appropriate channel ranges for each SPW:

SPW 0: 10-13
SPW 1: 30-33
SPW 2: 32-35
SPW 3: 30-33
SPW 4: 35-38
SPW 5: 30-33
SPW 6: 30-33
SPW 7: 46-49

Using these channel ranges, we run gaincal to calculate phase-only solutions that will be used as input during our initial bandpass calibration. Remember - the calibration tables we are creating now are so that we can use automatic RFI flagging algorithms. Our final calibration tables will be generated later, after automated flagging. Here are the inputs for our initial pre-bandpass phase calibration:

gaincal(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.initPh',
        intent='CALIBRATE_PHASE*', solint='int',
        refant='ea24', minblperant=3,
        minsnr=3.0, calmode='p', gaintable='G55.7+3.4_10s.pos')
  • caltable='G55.7+3.4_10s.initPh': this is the output calibration table that will be written.
  • intent='CALIBRATE_PHASE*': this is the way we have chosen to select data. Alternatively, we could have used "field='J1925+2106'", since this is the only source with the CALIBRATE_PHASE* scan intent. Note the use of the wildcard character "*" at the end of the string; this accounts for the fact that all the intents end with ".UNSPECIFIED". We could just as well have used "*PHASE*".
  • solint='int': we request a solution for each 10-second integration.
  • spw='0:10~13,1;3;5~6:30~33,2:32~35,4:35~38,7:46~49': note the syntax of this selection: a ":" is used to separate the SPW from channel selection, ";" is used to separate within this selection, and "~" is used to indicate an inclusive range.
  • refant='ea24': we have chosen ea24 as the reference antenna after inspecting the antenna position diagram (see above). It is relatively close to, but not directly in, the center of the array, which could be important in D-configuration, since you don't want the reference antenna to have a high probability of being shadowed by nearby antennas.
  • minblperant=3: the minimum number of baselines which must be present to attempt a phase solution.
  • minsnr=3.0: the minimum signal-to-noise a solution must have to be considered acceptable. Note that solutions which fail this test will cause these data to be flagged downstream of this calibration step.
  • calmode='p': perform phase-only solutions.
  • gaintable='G55.7+3.4_10s.pos': use the antenna position correction for ea07 that we created earlier.

Note that a number of solutions do not pass the requirements of the minimum 3 baselines (generating the terminal message "Insufficient unflagged antennas to proceed with this solve.") or minimum signal-to-noise ratio (outputting "n of x solutions rejected due to SNR < 3 ..."). A particularly large number of solutions are rejected in SPW 4, where the RFI is most severe.

Phases for antenna ea09
Phases in SPW 4

We can inspect the resulting calibration table with plotcal:

plotcal(caltable='G55.7+3.4_10s.initPh', xaxis='time', yaxis='phase',
        iteration='antenna', spw='0', plotrange=[-1,-1,-180,180])

This iterates over antenna for a single spectral window; we can see that the phase does not change much over the course of the observation for SPW 0. We may also iterate over spectral window for a subset of antennas:

plotcal(caltable='G55.7+3.4_10s.initPh', xaxis='time', yaxis='phase',
        iteration='spw', antenna='ea01,ea05,ea24', plotrange=[-1,-1,-180,180])

Clearly, the phases are affected by RFI in some places, especially in SPW 4.

Using this phase information, we create time-averaged bandpass solutions for the phase calibration source:

bandpass(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.initBP',
         intent='CALIBRATE_PHASE*', solint='inf',
         combine='scan', refant='ea24', minblperant=3, minsnr=10.0,
         gaintable=['G55.7+3.4_10s.pos', 'G55.7+3.4_10s.initPh'],
         interp=['', 'nearest'], solnorm=False)
  • solint='inf', combine='scan': the solution interval of 'inf' will automatically break by scans; this requests that the solution intervals be combined over scans, so that we will get one solution per antenna.
  • gaintable=['G55.7+3.4_10s.pos', 'G55.7+3.4_10s.initPh']: we will pre-apply both the antenna position corrections as well as the initial phase solutions.
  • interp=['', 'nearest']: by default, gaincal will use linear interpolation for pre-applied calibration. However, we want the nearest phase solution to be used for a given time.

Again, we can see that a number of solutions have been rejected by our choices of minblperant and minsnr.

Bandpasses for antennas ea10 - ea19

We may plot the bandpasses with plotcal; first looking at the amplitudes:

plotcal(caltable='G55.7+3.4_10s.initBP', xaxis='freq', yaxis='amp',
        iteration='antenna', subplot=331)
  • subplot=331: displays 3x3 plots per screen

Also, we can look at the phase solutions:

plotcal(caltable='G55.7+3.4_10s.initBP', xaxis='freq', yaxis='phase',
        iteration='antenna', subplot=331)

We can see that SPW 4 is virtually wiped-out by RFI; furthermore, there are channels in SPW 1 that are consistently badly affected. Prior to running any automatic flagging, we will flag these manually. In addition, we will flag the first 9 channels of SPW 0, since this is affected by an issue which causes the noise to be substantially higher:

flagdata(vis='G55.7+3.4_10s.ms', spw='0:0~8,1:41~63,4')

Note that this has created a backup flag file called "flagdata_2". Now we apply the antenna position corrections and the bandpass calibration table to the data:

         gaintable=['G55.7+3.4_10s.pos', 'G55.7+3.4_10s.initBP'],

This operation will flag data that correspond to flagged solutions, so applycal makes a backup version of the flags prior to operating on the data -- in this case, it's called "before_applycal_1". Note that running applycal might take a little while, likely around 10 minutes.

To see the corrected data, we can plot the data as we did before, choosing ydatacolumn='corrected' this time:

plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       antenna='ea24', xaxis='freq', yaxis='amp', coloraxis='spw',
       iteraxis='scan', ydatacolumn='corrected')

Note that some of the worst RFI is no longer there; also note that the amplitude scale has changed, since the bandpass solutions include amplitude scaling.

Automatic flagging

Now that we have bandpass-corrected data with some of the worst RFI flagged out, we will run flagdata in mode='rflag'. Note that there are many parameters which may be modified:

default flagdata
#  flagdata :: All-purpose flagging task based on data-selections and flagging modes/algorithms
vis                 =         ''        #  Name of MS file to flag
mode                =    'rflag'        #  Flagging mode
                                        #   (list/manual/clip/shadow/quack/elevation/tfcrop/rflag/extend/unflag/summary)
     field          =         ''        #  Field names or field index numbers: '' ==> all, field='0~2,3C286'
     spw            =         ''        #  Spectral-window/frequency/channel: '' ==> all, spw='0:17~19'
     antenna        =         ''        #  Antenna/baselines: '' ==> all, antenna ='3,VA04'
     timerange      =         ''        #  Time range: '' ==> all,timerange='09:14:0~09:54:0'
     correlation    =         ''        #  Correlation: '' ==> all, correlation='XX,YY'
     scan           =         ''        #  Scan numbers: '' ==> all
     intent         =         ''        #  Observation intent: '' ==> all, intent='CAL*POINT*'
     array          =         ''        #  (Sub)array numbers: '' ==> all
     uvrange        =         ''        #  UV range: '' ==> all; uvrange ='0~100klambda', default units=meters
     observation    =         ''        #  Observation ID: '' ==> all
     feed           =         ''        #  Multi-feed numbers: Not yet implemented
     ntime          =     'scan'        #  Time-range to use for each chunk (in seconds or minutes)
     combinescans   =      False        #  Accumulate data across scans.
     datacolumn     =     'DATA'        #  Data column on which to operate (data,corrected,model,residual)
     winsize        =          3        #  Number of timesteps in the sliding time window [aips:fparm(1)]
     timedev        =         ''        #  Time-series noise estimate [aips:noise]
     freqdev        =         ''        #  Spectral noise estimate [aips:scutoff]
     timedevscale   =        5.0        #  Threshold scaling for timedev [aips:fparm(9)]
     freqdevscale   =        5.0        #  Threshold scaling for freqdev [aips:fparm(10)]
     spectralmax    =  1000000.0        #  Flag whole spectrum if freqdev is greater than spectralmax [aips:fparm(6)]
     spectralmin    =        0.0        #  Flag whole spectrum if freqdev is less than spectralmin [aips:fparm(5)]

action              =    'apply'        #  Action to perform in MS and/or in inpfile (none/apply/calculate)
     display        =         ''        #  Display data and/or end-of-MS reports at runtime (data/report/both).
     flagbackup     =       True        #  Back up the state of flags before the run

savepars            =      False        #  Save the current parameters to the FLAG_CMD table or to a file

Additional information on the algorithm used in RFlag, as well as the other available automatic flagging algorithm in flagdata ("TFCrop"), can be found on this webpage (sections 2.1.6 and 2.1.7).

Following are a set of flagdata commands which have been found to work reasonably well with these data. Please take some time to play with the parameters and the plotting capabilities. Since these runs set display='both' and action='calculate', the flags are displayed but not actually written to the MS. This allows one to try different sets of parameters before actually applying the flags to the data.

Some representative plots are also displayed. Each column displays an individual polarization product; since we're using all four, from left to right are RR, RL, LR, and LL. The first row shows the data with current flags applied, and the second includes the flags generated by flagdata. The x-axis is channel number (the spectral window ID is displayed in the top title) and the y-axis of the first two rows is all integrations included in a time "chunk", set by the ntime parameter. These are the data considered by the RFlag algorithm during its flagging process, and changes in ntime will have some (relatively small) affect on what data are flagged.

Each plot page displays data for a single baseline and time chunk. The buttons at the bottom allow one to step through baseline (backward as well as forward), SPW, scan, and field; "Stop Display" will continue the flagging operation without the GUI, and "Quit" aborts the run.

First, we will run flagdata with mode='rflag', using the default parameter values, for just one source (the supernova remnant) and spectral window (0):

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', field='2', 
         spw='0', datacolumn='corrected', 
         action='calculate', display='both',

While this is clearly picking up some RFI, much is being left untouched (see figure to left, below). After stepping through a few baselines and scans, hit "Quit" to stop the flagger.

Let's try making it more sensitive to deviations from the calculated RMS in frequency, setting both timedevscale and freqdevscale=1.5 (the default is 5.0):

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', field='2', 
         spw='0', datacolumn='corrected', 
         freqdevscale=1.5, timedevscale=1.5,
         action='calculate', display='both',
flagdata/rflag, default parameters
flagdata/rflag, cutoff of 1.5 sigma

Using a cutoff value of 1.5 sigma may seem a bit extreme, but as you can see from the figure on the right, it does a substantially better job of getting rid of the RFI in the badly affected SPW 0.

We now run flagdata to calculate and apply these flags for all data in SPW 0. Note that this will take a little while, and flags around 20% of the data:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='0', datacolumn='corrected', 
         freqdevscale=1.5, timedevscale=1.5,
         action='apply', display='')

Although RFlag has done a pretty good job of finding the bad data, some still remains. One way to delete it is to use the mode='extend' feature in flagdata, which can extend flags along a chosen axis. First, we will extend the flags across polarization, so if any one polarization is flagged, all data for that time / channel will be flagged:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='0', extendpols=True,
         action='apply', display='')

Now, we will extend the flags in time and frequency, using the "growtime" and "growfreq" parameters. For the data here, the RFlag algorithm seems most likely to miss RFI which should be flagged along more of the time axis, so we will try with growtime=50.0, which will flag all data for a given channel if more than 50% of that channel's time is already flagged, and growfreq=90.0, which will flag the entire spectrum for an integration if more than 90% of the channels in that integration are already flagged.

Again, first just have a look at the flags that will be generated before applying them:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='0', growtime=50.0, growfreq=90.0,
         action='calculate', display='data',

It still appears to be missing some RFI, but this is also a very badly-affected SPW, so leave it as is for now and run to apply the flags:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='0', growtime=50.0, growfreq=90.0,
         action='apply', display='')

Now, let's work on SPW 1, flipping through time, baseline, and fields to get a sense of how the flagging will go with these parameters:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='1', datacolumn='corrected', 
         freqdevscale=2.0, timedevscale=2.0,
         action='calculate', display='both',

Unfortunately, this SPW is very badly affected by RFI, and it does not seem possible to flag adequately with the automated task (and probably not by hand, either). In this case, we choose to manually flag the entire SPW:

flagdata(vis='G55.7+3.4_10s.ms', spw='1')

Moving on to SPW 2:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='2', datacolumn='corrected', 
         freqdevscale=5.0, timedevscale=4.0,
         action='calculate', display='both',

Since the RFI is narrower and more pronounced in this frequency range, we have increased the RMS cutoff for both the time and frequency calculations to avoid over-flagging and deleting good data.

After checking the data and changing the parameters until you're happy, apply these flags:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='2', datacolumn='corrected', 
         freqdevscale=5.0, timedevscale=4.0,

Again, extend the flags along polarization:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='2', extendpols=True,
         action='apply', display='')

Try extending in frequency and time, as before:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='2', growtime=50.0, growfreq=90.0,
         action='calculate', display='data',

This looks pretty good, so let's apply it and have a look in plotms:

flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='2', growtime=50.0, growfreq=90.0,
plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       xaxis='baseline', yaxis='amp', spw='2', symbolshape = 'circle',
       iteraxis='scan', correlation='RR,LL', coloraxis = 'baseline')

Although we're trying to avoid doing this too much, it appears that there is one baseline which is consistently higher-amplitude than the others, indicating that it's probably contaminated by RFI. Use the plotms tools to identify this baseline, which turns out to be ea04 and ea16, and flag it:

flagdata(vis='G55.7+3.4_10s.ms', antenna='ea04&ea16',

We could have narrowed this further by channel and perhaps time, but remember: this tutorial is about the quick-and-dirty way of flagging data! With this in mind, we move on to SPW 3. Note that this time, we only look at data from the supernova remnant (target) field.

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='3', datacolumn='corrected', field='2',
         freqdevscale=5.0, timedevscale=4.0,
         action='calculate', display='data',

The parameters we used for SPW 2 seem to work well for SPW 3 also. Go ahead and flag, then extend as before:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='3', datacolumn='corrected',
         freqdevscale=5.0, timedevscale=4.0,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='3', extendpols=True,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='3', growtime=50.0, growfreq=90.0,

Recall that we already deleted SPW 4 due to bad RFI, so we only have 5-7 remaining. SPWs 5 and 6 have similar RFI properties to 2 and 3, so let's use the same RFlag parameters for these (feel free to play with this a bit yourself, if you like, to try to optimize):

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='5~6', datacolumn='corrected',
         freqdevscale=5.0, timedevscale=4.0,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='5~6', extendpols=True,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='5~6', growtime=50.0, growfreq=90.0,

However, SPW 7 is a bit more affected, and we may wish to use a somewhat lower threshold to catch all the RFI. First, try with the same parameters:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='7', datacolumn='corrected', field='2',
         freqdevscale=5.0, timedevscale=4.0,
         action='calculate', display='data',

Indeed, this seems to be missing a lot of the RFI. Try less stringent limits:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='7', datacolumn='corrected', field='2',
         freqdevscale=1.0, timedevscale=1.0,
         action='calculate', display='data',

This looks pretty good. Check the calibrator sources to be sure it works for them as well:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='7', datacolumn='corrected', field='1',
         freqdevscale=1.0, timedevscale=1.0,
         action='calculate', display='data',
flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='7', datacolumn='corrected', field='3',
         freqdevscale=1.0, timedevscale=1.0,
         action='calculate', display='data',

These seem reasonable as well, though it's apparent that 3C147 was very affected, possibly because of its low elevation at the time of the observation. Apply and extend the flags:

flagdata(vis='G55.7+3.4_10s.ms', mode='rflag', 
         spw='7', datacolumn='corrected',
         freqdevscale=1.0, timedevscale=1.0,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='7', extendpols=True,
flagdata(vis='G55.7+3.4_10s.ms', mode='extend', 
         spw='7', growtime=50.0, growfreq=90.0,

Evaluating results & further manual flagging

Now, we will use flagdata to see a summary of how much data we have flagged:

flagInfo = flagdata(vis='G55.7+3.4_10s.ms', mode='summary')

Using the information stored in the flagInfo Python dictionary, we can calculate and print out some interesting statistics:

print("\n %2.1f%% of G55.7+3.4, %2.1f%% of 3C147, and %2.1f%% of J1925+2106 are flagged. \n" % (100.0 * flagInfo['field']['G55.7+3.4']['flagged'] / flagInfo['field']['G55.7+3.4']['total'], 100.0 * flagInfo['field']['0542+498=3C147']['flagged'] / flagInfo['field']['0542+498=3C147']['total'], 100.0 * flagInfo['field']['J1925+2106']['flagged'] / flagInfo['field']['J1925+2106']['total']))
print("Spectral windows are flagged as follows:")
for spw in range(0,8):
     print("SPW %s: %2.1f%%" % (spw, 100.0 * flagInfo['spw'][str(spw)]['flagged'] / flagInfo['spw'][str(spw)]['total']))

So, as a result of the flagging thus far, we have sacrificed a bit over half of all the data. Let's see how well it has been cleaned up, using plotms:

plotms(vis='G55.7+3.4_10s.ms', scan='165', spw='0,2~3,5~7',
       antenna='ea24', xaxis='freq', yaxis='amp',
       correlation='RR,LL', coloraxis='spw', symbolshape = 'circle')

Unfortunately, despite our best autoflagging efforts, SPW 0 still looks pretty bad. (Take heart -- even the by-hand flagging did not work out well for this one.) So, we will flag the rest of SPW 0:

flagdata(vis='G55.7+3.4_10s.ms', spw='0')
after flagging screenshot
before flagging screenshot

After this is complete, refresh the plotms window using Shift + Plot. This generates the plot to the left. Just to compare with the unflagged data, we will restore the original flags, and have a look at the same slice. Be sure to save the current flags first!

flagmanager(vis='G55.7+3.4_10s.ms', mode='save',
flagmanager(vis='G55.7+3.4_10s.ms', mode='restore', 
plotms(vis='G55.7+3.4_10s.ms', scan='165', spw='2~3,5~7',
       antenna='ea24', xaxis='freq', yaxis='amp',
       correlation='RR,LL', coloraxis='spw', symbolshape = 'circle')

The pre-flagging plot is shown on the right. Clearly, a lot of the RFI has been excised. Restore the flags:

flagmanager(vis='G55.7+3.4_10s.ms', mode='restore', 

Other instructive ways to view the data are by baseline and uv-distance. Note that we're plotting all baselines in these plots, rather than just baselines to ea24 as before.

plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       xaxis='baseline', yaxis='amp', spw='2~3,5~7', iteraxis='spw',
       correlation='RR,LL', coloraxis='antenna1', symbolshape = 'circle')

No particular baselines look bad enough to flag outright, so we will leave this as is. Feel free to do some more flagging if you like. Now, let's plot as a function of uv-distance:

plotms(vis='G55.7+3.4_10s.ms', scan='30,75,120,165,190,235,303',
       xaxis='uvdist', yaxis='amp', spw='2~3,5~7', iteraxis='spw',
       correlation='RR,LL', coloraxis='antenna1', symbolshape = 'circle')

Again, nothing really sticks out as obviously in need of flagging. Clearly, there is still some residual RFI left here and there -- however, for the purposes of this tutorial, we will accept the current state of the flags and move on. Feel free to hunt and excise some more, if desired.

Calibrating data

Now that we are satisfied with the RFI excision, we will move on to the calibration stage.

Setting the flux density scale

Since we will be using 3C147 as the source of the absolute flux scale for this observation, we must first run setjy to set the appropriate model amplitudes for this source.

If the flux calibrator is spatially resolved, it is necessary to include a model image as well. Although 3C147 is not resolved at L-band in D configuration, we include the model image here for completeness.

First, we use the listmodimages parameter to find the model image path:

setjy(vis='G55.7+3.4_10s.ms', listmodels=True)

This lists any images in the current working directory as well as images in CASA's repository. In this second list, we see that there is "3C147_L.im", which is appropriate for our flux calibrator and band, and that it is in the directory "/home/casa/packages/RHEL6/release/casa-release-4.4.0/data/nrao/VLA/CalModels ". We can optionally give the full path of the model image, but setjy should now be able to locate it by name alone:

setjy(vis='G55.7+3.4_10s.ms', field='0542*', scalebychan=True,
      spw='2~3,5~7', model='3C147_L.im')
  • scalebychan=True: scales the model flux density value for each channel.

Note: The task setjy uses the Perley-Butler 2010 standard by default. Periodically, the flux density scale at the VLA is revised, updated, or expanded. The most recent standard is Perley-Butler 2013, and can be used by explicitly setting standard='Perley-Butler 2013' in the task. See help setjy for more details.

Delay and Bandpass calibration

We will follow a similar procedure as the one outlined above for calculating the antenna bandpasses, except that this time, we will use the actual designated bandpass calibration source, 3C147. Although the phase calibration source has the advantage of having been observed throughout the run, it has an unknown spectrum which could introduce amplitude slopes to each spectral window. We will also calibrate the residual antenna-based delays (for further information on this topic, please see this tutorial: [[1]].)

As before, we first generate a phase-only gain calibration table that will be used to help smooth-out the phases before running bandpass itself:

gaincal(vis='G55.7+3.4_10s.ms', intent='*BANDPASS*', 
        solint='int', refant='ea24',
        minblperant=3, minsnr=3.0, calmode='p',

Unfortunately, you will notice a lot of messages that read "Insufficient unflagged antennas to proceed with this solve" for SPW 7. This indicates that too much data have been flagged to perform the gaincal operation. This also suggests that the spectral window is too badly affected by RFI to be useful for imaging -- so, we will flag the rest of the SPW before continuing with further analysis:

flagdata(vis='G55.7+3.4_10s.ms', spw='7')

We can now solve for the residual antenna-based delays that can be seen in plots of the phase vs. frequency for the calibrator sources in plotms. This uses the gaintype='K' option in gaincal. Note that this currently does not do a "global fringe-fitting" solution for delays, but instead does a baseline-based delay solution to all baselines to the refant, treating these as antenna-based delays. In most cases with high-enough S/N to get baseline-based delay solutions this will suffice. We use our bright bandpass calibrator, 3C147, to calibrate the delays:

gaincal(vis='G55.7+3.4_10s.ms', intent='*BANDPASS*', 
        spw='2~3,5~6:4~59', solint='inf', refant='ea24', 
        gaintype='K', combine='scan', minsnr=3, 

We pre-apply our initial phase table, and produce a new K-type caltable for input to bandpass calibration. We can plot the delays, in nanoseconds, as a function of antenna index (you will get one for each sub-band and polarization):

plotcal(caltable='G55.7+3.4_10s.K0', xaxis='antenna', yaxis='delay')

The delays range from around -3 to 5 nanoseconds.

Now let's solve for the bandpass using the previous tables:

bandpass(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.bPass',
         intent='*BANDPASS*', solint='inf', spw='2~3,5~6',
         combine='scan', refant='ea24', minblperant=3, minsnr=10.0,
         interp=['', 'nearest', 'nearest'], solnorm=False)
  • solint='inf', combine='scan': again, the solution interval of 'inf' will automatically break-up the data by scans; this requests that the solution intervals be combined over scans, so that we will get one solution per antenna. Note that you must set solnorm=False here or later on you will find some offsets

between spws due to the way in which amplitude scaling adjusts weights internally during solving.

bandpass amplitudes
bandpass phases

Note that since we have flagged-out the vast majority of the RFI-affected data, there are many fewer failed solutions. Again, we can plot the calculated bandpasses to check that they look reasonable:

plotcal(caltable='G55.7+3.4_10s.bPass', xaxis='freq', yaxis='amp',
        iteration='antenna', subplot=331)
plotcal(caltable='G55.7+3.4_10s.bPass', xaxis='freq', yaxis='phase',
        iteration='antenna', subplot=331)

Don't let the apparently odd-looking phases for ea24 fool you -- check the scale! Remember, this is our reference antenna.

Gain calibration

Next, we will calculate the per-antenna gain solutions. Since this is low-frequency data, we do not expect substantial variations over short timescales, so we calculate one solution per scan (using "solint='inf'"):

gaincal(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.phaseAmp',
        intent='*PHASE*,*AMPLI*', spw='2~3,5~6', solint='inf', refant='ea24', minblperant=3,
        minsnr=10.0, gaintable=['G55.7+3.4_10s.pos','G55.7+3.4_10s.K0','G55.7+3.4_10s.bPass'])
  • solint='inf': we request one solution per scan.

Plot these solutions as a function of amplitude and phase versus time for the phase calibrator (field 1), iterating over each antenna:

plotcal(caltable='G55.7+3.4_10s.phaseAmp', xaxis='time', yaxis='amp',
        field = '1', iteration='antenna')

plotcal(caltable='G55.7+3.4_10s.phaseAmp', xaxis='time', yaxis='phase',
        field = '1', iteration='antenna')

Flux scaling the gain solutions

Now that we have a complete set of gain solutions, we must scale the phase calibrator's absolute flux correctly, using 3C147 as our reference source. To do this, we run fluxscale on the gain table we just created, which will write a new, flux-corrected gain table:

myFlux = fluxscale(vis='G55.7+3.4_10s.ms', caltable='G55.7+3.4_10s.phaseAmp',
                   fluxtable='G55.7+3.4_10s.phaseAmp.fScale', reference='3', incremental=False)

Note that the myFlux Python dictionary will contain information about the scaled fluxes and fitted spectrum. The logger will display information about the flux density it has deduced for J1925+2106:

Found reference field(s): 0542+498=3C147
Found transfer field(s):  J1925+2106
Flux density for J1925+2106 in SpW=0 is:  INSUFFICIENT DATA 
Flux density for J1925+2106 in SpW=1 is:  INSUFFICIENT DATA 
Flux density for J1925+2106 in SpW=2 (freq=1.319e+09 Hz) is: 1.46218 +/- 0.0285158 (SNR = 51.2761, N = 40)
Flux density for J1925+2106 in SpW=3 (freq=1.447e+09 Hz) is: 1.53273 +/- 0.0250239 (SNR = 61.2506, N = 40)
Flux density for J1925+2106 in SpW=4 is:  INSUFFICIENT DATA 
Flux density for J1925+2106 in SpW=5 (freq=1.711e+09 Hz) is: 1.69881 +/- 0.0249164 (SNR = 68.1802, N = 40)
Flux density for J1925+2106 in SpW=6 (freq=1.839e+09 Hz) is: 1.75271 +/- 0.0265527 (SNR = 66.0087, N = 40)
Flux density for J1925+2106 in SpW=7 is:  INSUFFICIENT DATA 
Fitted spectrum for J1925+2106 with fitorder=1: Flux density = 1.60735 +/- 0.00466044 (freq=1.56544 GHz) spidx=0.560869 +/- 0.0227406

The flux density listed in the VLA Calibrator Manual for this source is around the same magnitude:

1925+211   J2000  A 19h25m59.605370s  21d06'26.162180"  Aug01         
1923+210   B1950  A 19h23m49.792400s  21d00'23.305000"
BAND        A B C D    FLUX(Jy)    UVMIN(kL)  UVMAX(kL)
 20cm    L  P S S S       1.30                       visplot

So we should be satisfied that our calibration up to this point is reasonable.

Applying calibration

Finally, we must apply the calibration to our data. To do this, we run applycal in two stages: the first is to self-calibrate our calibration sources; the second, to apply calibration to the supernova remnant. These must be done separately, since we want to use "nearest" interpolation for the self-calibration and "linear" for the application to the science target:

applycal(vis='G55.7+3.4_10s.ms', spw='2~3,5~6', intent='*TARGET*',
         gaintable=['G55.7+3.4_10s.pos','G55.7+3.4_10s.K0','G55.7+3.4_10s.bPass', \
                    'G55.7+3.4_10s.phaseAmp.fScale'], calwt=False)
applycal(vis='G55.7+3.4_10s.ms', spw='2~3,5~6', intent='*PHASE*,*AMPLI*',
         gaintable=['G55.7+3.4_10s.pos','G55.7+3.4_10s.K0','G55.7+3.4_10s.bPass', \
                    'G55.7+3.4_10s.phaseAmp.fScale'], \
         calwt=False, interp=['','nearest','nearest','nearest'])

Plotting calibrated data

J1925+2106 corrected real vs. imag
J1925+2106 corrected amplitude vs. baseline
3C147 corrected real vs. imag
3C147 corrected amplitude vs. baseline

To check that everything has truly proceeded as well as we would like, this is a good time to look at the calibrated data in plotms. A very useful way to check the goodness of calibration is to plot the corrected real vs. imaginary portions of the visibilities (which, for a point source at the phase center, should look like scatter around zero imaginary (zero phase) and the real flux density (amplitude) of the source), and corrected amplitude vs. baseline, which should be a flat line of points for a point source, and will reveal any lingering antenna-based problems. For a resolved source, it may be more instructive to plot corrected amplitude vs. uv-distance.

plotms(vis='G55.7+3.4_10s.ms', field='1', xaxis='imag', yaxis='real',
       xdatacolumn='corrected', ydatacolumn='corrected', coloraxis='antenna1',
       avgchannel='10', avgtime='20', correlation='RR,LL', iteraxis='spw',
plotms(vis='G55.7+3.4_10s.ms', field='1', xaxis='baseline', yaxis='amp',
       xdatacolumn='corrected', ydatacolumn='corrected', coloraxis='antenna1',
       avgchannel='10', avgtime='20', correlation='RR,LL', iteraxis='spw',
plotms(vis='G55.7+3.4_10s.ms', field='3', xaxis='imag', yaxis='real',
       xdatacolumn='corrected', ydatacolumn='corrected', coloraxis='antenna1',
       avgchannel='10', avgtime='20', correlation='RR,LL', iteraxis='spw',
plotms(vis='G55.7+3.4_10s.ms', field='3', xaxis='baseline', yaxis='amp',
       xdatacolumn='corrected', ydatacolumn='corrected', coloraxis='antenna1',
       avgchannel='10', avgtime='20', correlation='RR,LL', iteraxis='spw',

Splitting out data for G55.7+3.4

Now that we are satisfied with the calibration, we will create a new MS which contains only the corrected data for G55.7+3.4 using the task split. This will substantially reduce the size of the MS, and will speed up the imaging process. We can also drop the polarization products since they have not been calibrated and will not be used for imaging.

split(vis='G55.7+3.4_10s.ms', field='2', keepflags=False,
      outputvis='G55.7+3.4.calib.ms', datacolumn='corrected',
      spw='2~3,5~6', correlation = 'RR,LL')


The size of the primary beam is 45 divided by the observed frequency in GHz, or around 30 arcmin. Since the observation was taken in D-configuration, we can check the Observational Status Summary's section on VLA resolution to find that the synthesized beam will be around 44 arcsec. We want to oversample the synthesized beam by a factor of around five, so we will use a cell size of 8 arcsec.

Since this field contains bright point sources significantly outside the primary beam, we will create images that are 170 arcminutes on a side, or almost 6 x the size of the primary beam. This is ideal for showcasing both the problems inherent in such wide-band, wide-field imaging, as well as some of the solutions currently available in CASA to deal with these issues.

First, it's worth considering why we are even interested in sources which are far outside the primary beam. This is mainly due to the fact that the EVLA, with its wide bandwidth capabilities, is quite sensitive even far from phase center -- for example, at our observing frequencies in L-band, the primary beam gain is as much as 10% around 1 degree away. That means that any imaging errors for these far-away sources will have a significant impact on the image rms at phase center. The error due to a source at distance R can be parametrized as:

[math]\displaystyle{ \Delta(S) = S(R) \times PB(R) \times PSF(R) }[/math]

So, for R = 1 degree, source flux S(R) = 1 Jy, [math]\displaystyle{ \Delta(S) }[/math] = 1 mJy − 100 [math]\displaystyle{ {\mu} }[/math]Jy. Clearly, this will be a source of significant error.

Multi-scale clean

G55.7+3.4 multiscale clean

Since G55.7+3.4 is an extended source with many spatial scales, the most basic (yet still reasonable) imaging procedure is to use clean with multiple scales. As is suggested, we will use a set of scales (which are expressed in units of the requested pixel, or cell, size) which are representative of the scales that are present in the data, including a zero-scale for point sources.

Note that interrupting clean by Ctrl+C may corrupt your visibilities -- you may be better off choosing to let clean finish. We are currently implementing a command that will nicely exit to prevent this from happening, but for the moment try to avoid Ctrl+C.

clean(vis='G55.7+3.4.calib.ms', imagename='G55.7+3.4.multiScale',
      imsize=1280, cell='8arcsec', multiscale=[0,6,10,30,60],
      interactive=False, niter=1000,  weighting='briggs',
      stokes='I', threshold='0.1mJy', usescratch=F, imagermode='csclean')
  • imagename='G55.7+3.4.multiScale': the root filename used for the various clean outputs. These include the final image (<imagename>.image), the relative sky sensitivity over the field (<imagename>.flux), the point-spread function (also known as the dirty beam; <imagename>.psf), the clean components (<imagename>.model), and the residual image (<imagename>.residual).
  • imsize=1280: the image size in number of pixels. Note that entering a single value results in a square image with sides of this value.
  • cell='8arcsec': the size of one pixel; again, entering a single value will result in a square pixel size.
  • multiscale=[0,6,10,30,60]: a set of scales on which to clean. Since these are in units of the pixel size, we are requesting scales of 0 (a point source), 48, 80, 240, and 480 arcseconds. Note that 16 arcminutes (960 arcseconds) roughly corresponds to the size of G55.7+3.4.
  • interactive=False: we will let clean use the entire field for placing model components. Alternatively, you could try using interactive=True, and create regions to constrain where components will be placed. However, this is a very complex field, and creating a region for every bit of diffuse emission as well as each point source can quickly become tedious.
  • niter=1000: this controls the number of iterations clean will do in the minor cycle.
  • weighting='briggs': use Briggs weighting with a robustness parameter of 0 (halfway between uniform and natural weighting).
  • usescratch=F: do not write the model visibilities to the model data column (only needed for self-calibration)
  • imagermode='csclean': use the Cotton-Schwab clean algorithm
artifacts around point source
  • stokes='I': since we have not done any polarization calibration, we only create a total-intensity image.
  • threshold='0.1mJy': threshold at which the cleaning process will halt; i.e. no clean components with a flux less than this value will be created. This is meant to avoid cleaning what is actually noise (and creating an image with an artificially low rms). It is advisable to set this equal to the expected rms, which can be estimated using the EVLA exposure calculator. However, in our case, this is a bit difficult to do, since we have lost a hard-to-estimate amount of bandwidth due to flagging, and there is also some residual RFI present. Therefore, we choose 0.1 mJy as a relatively conservative limit.

This is the fastest of the imaging techniques described here (it will probably take less than ten minutes to complete), but it's easy to see that there are artifacts in the resulting image. For example, use the viewer to explore the point sources near the edge of the field. Some have prominent arcs, as well as spots in a six-pointed pattern surrounding them. Next we will explore some more advanced imaging techniques to mitigate these artifacts.

Multi-scale, wide-field clean

The next clean algorithm we will employ is W-projection, which is a wide-field imaging technique that takes into account the non-coplanarity of the baselines as a function of distance from the phase center. For more details on the motivation for this correction, as well as the algorithm itself, see "The Noncoplanar Baselines Effect in Radio Interferometry: The W-Projection Algorithm".

clean(vis='G55.7+3.4.calib.ms', imagename='G55.7+3.4.MS.wProj',
      gridmode='widefield', imsize=1280, cell='8arcsec',
      wprojplanes=128, multiscale=[0,6,10,30,60], 
      interactive=False, niter=1000,  weighting='briggs',
      stokes='I', threshold='0.1mJy', usescratch=F, imagermode='csclean')
w-projection improvements
  • gridmode='widefield': use the W-projection algorithm.
  • wprojplanes=128: the number of W-projection planes to use for deconvolution; 128 is the minimum recommended number.

This will take longer than the previous imaging round (likely around 15 minutes); however, the resulting image has noticeably fewer artifacts. In particular, compare the same outlier source in the W-projected image with the multi-scale-only image: note that the swept-back arcs have disappeared. There are still some obvious imaging artifacts remaining, though.

Multi-scale, multi-frequency synthesis

Another consequence of simultaneously imaging the wide fractional bandwidths available with the EVLA is that the primary beam has substantial frequency-dependent variation over the observing band. If this is not accounted for, it will lead to imaging artifacts and compromise the achievable image rms.

If sources which are being imaged have intrinsically flat spectra, this will not be a problem. However, most astronomical objects are not flat-spectrum sources, and without any estimation of the intrinsic spectral properties, the fact that the primary beam is twice as large at 2 than at 1 GHz will have substantial consequences.

The Multi-Scale Multi-Frequency-Synthesis (MS-MFS) algorithm provides the ability to simultaneously image and fit for the intrinsic source spectrum. The spectrum is approximated using a polynomial in frequency, with the degree of the polynomial as a user-controlled parameter.

clean(vis='G55.7+3.4.calib.ms', imagename='G55.7+3.4.MS.MFS',
      imsize=1280, cell='8arcsec', mode='mfs', nterms=2,
      interactive=False, niter=1000,  weighting='briggs',
      stokes='I', threshold='0.1mJy', usescratch=F, imagermode='csclean')
artifacts with nterms=2
  • nterms=2:the number of Taylor terms to be used to model the frequency dependence of the sky emission. Note that the speed of the algorithm will depend on the value used here (more terms will be slower); of course, the image fidelity will improve with a larger number of terms (assuming the sources are sufficiently bright to be modeled more completely).

This will take a little while (likely around 30 minutes), so it would probably be a good time to have coffee or chat about EVLA data reduction with your neighbor at this point.

When clean is done <imagename>.image.tt0 will contain a total intensity image; <imagename>.image.alpha will contain an image of the spectral index in regions where there is sufficient signal-to-noise. For more information on the multi-frequency synthesis mode and its outputs, see the CASA cookbook. Inspect the brighter point sources in the field. You will notice that some of the artifacts which had been symmetric around the sources themselves are now gone; however, since we did not use W-projection this time, there are still strong features related to the non-coplanar baseline effects still apparent.

Multi-scale, multi-frequency, wide-field clean

Finally, we will combine the W-projection and MS-MFS algorithms to simultaneously account for both of the effects. Be forewarned -- these imaging runs will take a while, and it's best to start them running and then move on to other things. In testing, both of these runs (on the auto- and by-hand-flagged data) took around an hour.

First, we will image the autoflagged data. Using the same parameters for the individual-algorithm images above, but combined into a single clean run, we have:

clean(vis='G55.7+3.4.calib.ms', imagename='G55.7+3.4.MS.MFS.wProj',
      gridmode='widefield', imsize=1280, cell='8arcsec', mode='mfs',
      nterms=2, wprojplanes=128, multiscale=[0,6,10,30,60],  
      interactive=False, niter=1000,  weighting='briggs',
      stokes='I', threshold='0.1mJy', usescratch=F, imagermode='csclean')
artifacts with nterms=2, wide-field

Again, looking at the same outlier source, we can see that the major sources of error have been removed, although there are still some residual artifacts. One possible source of error is the time-dependent variation of the primary beam; another is the fact that we have only used nterms=2, which may not be sufficient to model the spectra of some of the point sources.

nterms=2, wide-field, auto-flagging
nterms=2, wide-field, by-hand flagging

We can compare the resulting image with one that was created from an MS that was flagged by hand, rather than with the automatic flagging routines. While it's clear that this is a superior image, the one that we have created with autoflagging is impressive, considering that the by-hand flagging took a number of weeks, and the autoflagging can be done in a matter of days (or hours, if one knows exactly what parameters to use).

Ultimately, it isn't too surprising that there was still some RFI present in our auto-flagged data, since we were able to see this with plotms. It's also possible that the auto-flagging overflagged some portions of the data, also leading to a reduction in the achievable image rms.


-- original: ?? --modifications: Lorant Sjouwerman (4.4.0, 2015/07/07)

Last checked on CASA Version 4.4.0.