Data flagging with plotms

Revision as of 21:54, 1 December 2009

Casaplotms is (currently) a standalone tool to inspect and edit measurement sets. This tutorial demonstrates how to use casaplotms to edit a multisource continuum data set: VLA program AU079, which consists of L-band (20 cm) continuum observations of galaxies and calibrator sources. It is the same data set used in the Imaging Flanking Fields tutorial, as well as the Data flagging with viewer tutorial.

Loading the Measurement Set into Casaplotms

As described in the Imaging Flanking Fields tutorial, the data may be loaded into CASA using the importvla command. The following commands import the data into the measurement set au079.ms and exit CASA to the command line.

# import the glob command for filename searching with wildcards
from glob import glob

# Define the list of files for reading. Use glob to perform wildcard matching with VLA archive filenames.
fileList = glob('AU079_*.xp?')

importvla(archivefiles=fileList,vis='au079.ms')
exit()
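
As a rough illustration of what the wildcard pattern above selects, the sketch below applies the same pattern with Python's fnmatch module to a list of filenames. The filenames are invented for this example and are not the actual archive files of AU079; only the pattern itself comes from the tutorial.

```python
from fnmatch import fnmatch

# The wildcard pattern used with glob above:
# '*' matches any run of characters, '?' matches exactly one character.
PATTERN = "AU079_*.xp?"

# Hypothetical filenames, invented for illustration only.
candidates = [
    "AU079_1.xp1",
    "AU079_2.xp2",
    "AU079_1.xp10",     # two characters after 'xp', so '?' cannot match
    "AU080_1.xp1",      # different program code
    "AU079_notes.txt",  # wrong extension
]

# Keep only the names the pattern accepts, in sorted order.
matched = sorted(name for name in candidates if fnmatch(name, PATTERN))
print(matched)
```

Only the first two names pass, which is the behavior to expect from glob when it builds fileList.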

Now start up casaplotms from the command line.

# in bash
casaplotms

This command brings up the PlotMS window, shown with annotations at right. The window comprises three panels: the control panel (outlined in blue), the graphics panel (green), and the tools panel (red). The control panel controls the selection of data for display and the graphing parameters (axis selection, axis limits, and so on). The graphics panel is the display panel for two-dimensional (x, y) projections of the data. The tools panel provides commands to interact with the graphics panel.

The control panel further breaks down into a series of tabs, annotated as Top Tabs and Side Tabs, which contain related plotting and editing control parameters. This tutorial employs only the Plots tab among the Top Tabs and the following Side Tabs.

  • MS, which controls the selection of the measurement set proper and the selection of data within the measurement set.
  • Axes, which controls the selection of data and plotting parameters for the (x, y) graph.
  • Plot, which affects the style of plotting symbols, whether or not flagged data points are shown, and axis labels.
  • Flagging, which controls how flagging commands are extended (as of 1 Dec 2009, these flagging extensions are very limited but will likely improve as casaplotms continues development).

In this tutorial, interactive commands in the PlotMS window will be summarized as (Tab)Command, where (Tab) represents the Side Tab where the command is found, and Command is the appropriate GUI interaction (button press, text field, checkbox, etc.).

Use the (MS)Browse button, or enter the full pathname, to navigate to and select the measurement set (here, au079.ms).



Identifying Bad Data by Discrepant Amplitudes

Have a first look at the data by hitting the (MS)Plot button. By default, the axes will be visibility amplitude vs. time. The y-axis amplitudes aren't yet calibrated, but for the sake of argument we'll refer to them as flux densities in Jy.

The x-axis labeling is a little garbled in this development version of the software, but straightaway there appear some wildly discrepant data. For a typical decimeter-wave continuum data set, sources and calibrators are expected to show visibility amplitudes of a few Jy or less; visibilities with amplitudes in the 100s of Jy range are likely bogus. Here's how to flag them.


There's a simple pattern to flagging in casaplotms.

  • Highlight the data to be flagged using the Mark Regions tool.
  • Flag the data in the highlighted region using the Flag tool.

The figure at right shows a highlighted region selected using the Mark Regions tool. After flagging, those data will be removed from the display unless (Plot)Flagged Points Symbol is set.


The figure at right shows a close-up of the data that remain. The y-axis scale was reduced to the range (0, 100 Jy) by using the (Axes)Range controls.

Notice that you can set more than one region with the Mark Regions tool before flagging.
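
The mark-then-flag pattern can be pictured with a small conceptual sketch. It is not how casaplotms is implemented internally; the Region type, the flag_marked function, and the data points are all hypothetical and exist only to show several marked rectangles being flagged in one pass.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A rectangular marked region in plot coordinates (hypothetical type)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def contains(self, x: float, y: float) -> bool:
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def flag_marked(points: List[Tuple[float, float]],
                regions: List[Region],
                flags: Optional[List[bool]] = None) -> List[bool]:
    """Return flags that are True for points inside any marked region.

    Points that were already flagged stay flagged.
    """
    if flags is None:
        flags = [False] * len(points)
    return [
        already or any(r.contains(x, y) for r in regions)
        for (x, y), already in zip(points, flags)
    ]


# Synthetic (time, amplitude) points, invented purely for illustration.
points = [(0.0, 0.4), (1.0, 250.0), (2.0, 0.7), (3.0, 900.0), (4.0, 1.2)]

# Two regions are marked first, then a single flag operation is applied.
regions = [Region(0.5, 1.5, 100.0, 1000.0), Region(2.5, 3.5, 100.0, 1000.0)]
flags = flag_marked(points, regions)

# What would remain on the display after flagging.
unflagged = [p for p, f in zip(points, flags) if not f]
print(unflagged)
```

Both high-amplitude points are flagged together, and only the three points near 1 Jy remain.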


Tip: The automatic scaling of the data axes is cached and so is unaffected by flagging. To rescale (semi-)automatically, change the (Axes)X Axis to some other arbitrary projection (say, Scan), click (Axes)Plot, and then reset (Axes)X Axis to its original state (say, Time).

After zapping those obviously high visibilities, things become a little more challenging. The figure at right shows a close-up of the remaining visibilities between 0 and 1 Jy flux density.

There probably remain bad data there, but it's hard to tell on the crowded plot. At this point it's better to examine individual sources within this multisource measurement set.


Examining Individual Sources within a Measurement Set

Use the following settings to look specifically at the first source of the measurement set.

  • (MS)Field = 0
  • (Axes)Y Axis Range = Automatic
  • (Plot)Unflagged Points Symbol = Custom, 4 px (make the data points larger for clarity)
  • (Axes)Plot

Notice that the Plot button is available from more than one tab.

The figure at right shows the result. The bad data have been highlighted using the Mark Regions tool.



The figure at right shows the remaining data for the first source (field = 0) after flagging the obviously discrepant points. Things look deceptively OK, but in fact there remain bad data from one antenna. That antenna contributed poor data for the entire observation of this source, and because the problem is not isolated in time, it is difficult to spot in a plot against time.



Here are the same data reprojected onto baseline separations, (Axes)X Axis = UVDist_L (projected baseline separations in units of the observing wavelength). The misbehaving antenna shows up as spikes in these snapshot observations, because each antenna pair with that antenna corresponds to only a narrow range of baseline separations. (A longer observation would produce broader spikes, because the projected baseline separations would vary with the rotation of the earth under the source.)
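
To make the UVDist_L quantity concrete, here is a minimal sketch of the conversion from a projected baseline in meters to units of the observing wavelength. The function name and the (u, v) values are assumptions made for this example; only the observing frequency of 1.4649 GHz is taken from this data set.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s


def uvdist_lambda(u_m: float, v_m: float, freq_hz: float) -> float:
    """Projected baseline length in units of the observing wavelength.

    u_m and v_m are the projected baseline components in meters.
    """
    wavelength_m = C_M_PER_S / freq_hz
    return math.hypot(u_m, v_m) / wavelength_m


# Hypothetical projected baseline of 5 km (u = 3 km, v = 4 km) at 1.4649 GHz.
d = uvdist_lambda(3000.0, 4000.0, 1.4649e9)
print(f"{d:.0f} wavelengths")  # about 2.44e4
```

In a snapshot, u and v barely change for a given antenna pair, so each baseline occupies a narrow range of this quantity, and that is why a bad antenna appears as a set of spikes.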

The idea now would be to highlight a subset of the discrepant data, as shown in the figure, and extend the flags to the antenna common to these baselines. This option is not available in the current development build of casaplotms, but keep an eye on (Flagging)Extend flags = Antenna.



Antenna-Based Flagging

Clearly we have a bad antenna on this field, and the question remains how to deal with it within casaplotms. One option is to plot baselines to one antenna at a time, using (MS)antenna, but this approach would get tedious quickly. Instead, use the Locate tool to list the properties of the data within the highlighted region. The listing is printed to the terminal running casaplotms; here is a subset of it.

PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:25:55.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=6 X=131224 Y=0.0534296 (27877/21/577)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:05.0 BL=10-19 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=90195.5 Y=0.05591 (28759/22/159)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:05.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=131224 Y=0.0503287 (29179/22/579)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:15.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=6 X=131223 Y=0.0568316 (30477/23/577)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:15.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=131223 Y=0.0611954 (30479/23/579)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:35.0 BL=10-19 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=90205.1 Y=0.052479 (32659/25/159)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:35.0 BL=10-11 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=147929 Y=0.0619749 (33151/25/651)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:35.0 BL=8-10 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=115989 Y=0.0547681 (33219/25/719)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:45.0 BL=10-19 Spw=0 Chan=0 Freq=1.4649 Corr=7 X=90208.3 Y=0.0577008 (33958/26/158)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:45.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=6 X=131221 Y=0.0504603 (34377/26/577)
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:55.0 BL=0-10 Spw=0 Chan=0 Freq=1.4649 Corr=8 X=131220 Y=0.0518869 (35679/27/579)

The antenna pairs are listed as BL= (baseline = ). The common, or culprit, antenna for the highlighted data is #10. Plot just the data to that antenna by selecting (MS)antenna = 10.
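
For a long listing, the common antenna can be picked out with a few lines of Python instead of by eye. The sketch below assumes that each row carries a field of the form BL=a-b, as in the listing above, and the function names are made up for this example. The sample rows are trimmed copies of four rows from that listing.

```python
import re
from collections import Counter

# Matches the baseline field of a locate row, e.g. "BL=0-10".
BL_PATTERN = re.compile(r"\bBL=(\d+)-(\d+)")


def count_antennas(locate_text):
    """Count how many located rows each antenna appears in."""
    counts = Counter()
    n_rows = 0
    for line in locate_text.splitlines():
        match = BL_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not locate rows
        n_rows += 1
        # A set, so an antenna is counted once per row.
        counts.update({int(match.group(1)), int(match.group(2))})
    return counts, n_rows


def common_antennas(locate_text):
    """Return the antennas that appear in every located row."""
    counts, n_rows = count_antennas(locate_text)
    return sorted(ant for ant, n in counts.items() if n_rows and n == n_rows)


# Trimmed rows copied from the locate listing above.
sample = """\
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:25:55.0 BL=0-10 Spw=0
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:05.0 BL=10-19 Spw=0
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:35.0 BL=10-11 Spw=0
PlotMS::locate+ Scan=1 Field=0 Time=1999/08/30/05:26:35.0 BL=8-10 Spw=0
"""

print(common_antennas(sample))
```

If the highlighted region also catches a few points from healthy baselines, no antenna will appear in every row; in that case the per-antenna counts returned by count_antennas still show which antenna dominates.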

Considering that the healthier data fall below 0.2 Jy, and all of the antenna #10 baselines commonly produce amplitudes in excess of 0.2 Jy, it's safe to conclude that antenna #10 is useless for this source. Here's how to zap those data.

  • As described above, use the (MS) tab to select the field and antenna (here, field = 0 and antenna = 10).
  • (MS)Plot
  • Use the Mark Regions tool to highlight all of the data on the screen.
  • Apply the Flag tool.

All of the data in the graphics panel should have disappeared!


Clear the (MS)antenna entry to plot data from all baselines again. The result is shown at right. The data look clean! Or, for the less sanguine, there remain no obviously bad data.

Time to proceed to the next field, (MS)field = 1.


Cross-Pol Data of Bright Sources

(MS)field = 1 is shown at right. This is a bright calibrator source. The data values near 0 Jy are cross-pol data rather than discrepant data; these cross-pol data won't be as obvious in the plots for fainter sources.

We see some of the same spiking that we saw in field = 0. Use the Locate tool again on the highlighted selection. Here's a trimmed snippet of the output.

PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:35.0 BL=5-10 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:35.0 BL=10-13 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:35.0 BL=2-10 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:45.0 BL=10-18 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:45.0 BL=6-10 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:45.0 BL=3-10 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:45.0 BL=10-16 
PlotMS::locate+ Scan=143 Field=1 Time=1999/08/30/05:16:45.0 BL=10-24 

Notice that antenna #10 reappears as a culprit.


Manual Flagging

On the strength of inspection of two sources, antenna #10 appears to be producing consistently wonky data. At this point, it's worth getting out of casaplotms and returning to CASA to perform manual editing using flagdata. We'll throw caution to the wind and assume that antenna #10 was acting up during the entire observation.

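# in casapy
default("flagdata")   # reset all flagdata parameters to their defaults
vis = "au079.ms"      # measurement set to edit
antenna = "10"        # select every baseline that includes antenna 10
flagdata()            # flag the selected data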

Now, load the data back into casaplotms, and set things up to plot (MS)field = 1 again; see figure at right. The data look much cleaner, with a few wonky points that can be easily cleaned up with interactive flagging.





--Jack Gallimore 14:38, 1 December 2009 (UTC)