VLA CASA Pipeline-CASA4.5.3

From CASA Guides


• With the start of Jansky VLA Full Operations (January 2013), we started a new operational model:
    – Deliver flagged and calibrated visibility data
    – You will self-calibrate and image visibility data to meet science goals, using resources at your home institution or NRAO computing resources
• Automated pipeline should run correctly on all “standard” Stokes I science SBs; “standard” means:
    – 128 MHz spws, but may work on other set-ups as well
        • Some constraints on strength of calibrators needed
    – Contains correctly labeled and complete scan intents
• Current versions available:
    – “Scripted” pipeline is a collection of Python scripts that use CASA tasks wherever possible, but also use toolkit calls; readable and easy to modify
    – CASA integrated pipeline is compatible with ALMA pipeline infrastructure, has improved diagnostics in the weblog, and has been used as the real-time pipeline since Sep 2015


• Real-time pipeline:
    – Minimal human intervention
        • Pipeline is run automatically on every science SB as it completes (not just “continuum”)
    – Pipeline output undergoes quality assurance checks by NRAO staff upon request; reports generated are archived as pipeline products
• At your home institution:
    – Instructions for installation and operation of the VLA CASA Calibration Pipeline are available at https://science.nrao.edu/facilities/vla/data-processing/pipeline
        • Uses CASA 4.3.1, similar to current real-time pipeline
        • CASA 4.5.2 currently being validated (you are helping with this!)
        • Scripted pipelines for CASA versions through 4.5.0 also available
            – Provides more flexibility in how to use the pipeline, options suitable for spectral line datasets, mixed correlator set-ups, multi-band observations, etc.
            – Working to incorporate these into the CASA integrated pipeline

Pipeline Requirements

• “Standard” Stokes I science SB means:
    – 128 MHz spws, but may work on other set-ups as well
        • Can work for narrower BWs, depends on the strength of the calibrators
        • Heuristics currently make some assumptions about the strength of the calibrators, in particular, the delay calibrator
    – Contains correctly labeled and complete scan intents
        • And also that the observation has been set up correctly!
• Will the pipeline work for you?
    – The pipeline successfully completes on ~95% of all science SBs observed on the VLA; whether the output can be used for science depends on the science goal, and whether the observation was correctly set up
        • Pipeline includes Hanning smoothing, RFI flagging, and weight calculations that may not be appropriate for spectral line projects (but can modify scripted pipeline)
        • No polarization calibration (yet) but can use pipeline output as starting data for pol. cal.
        • Will probably work well for data taken since May 2012, may work for earlier EVLA data, likely that extra flagging may be needed in these cases


Calibrator strength:
    – Conservative limit on strength of BP and complex gain calibrators can be derived from requirement for initial gain calibration to work at high end of Q-band
    – Heuristic for delay calibration currently requires the SNR=3 limit on initial gain calibration per integration
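As a rough illustration of why calibrator strength matters more for narrow spws, the thermal noise on a single baseline can be estimated from the standard radiometer relation, sigma = SEFD / (eta * sqrt(2 * bandwidth * t_int)). The sketch below is not the pipeline's heuristic: the function names are hypothetical, the SEFD value is a placeholder rather than a measured VLA number, and the pipeline may define its SNR differently (for example, per antenna-based solution).

```python
import math


def baseline_noise_jy(sefd_jy, bandwidth_hz, t_int_s, efficiency=1.0):
    """Thermal noise (Jy) on one baseline for one polarization product.

    Uses sigma = SEFD / (eta * sqrt(2 * bandwidth * t_int)), assuming
    both antennas have the same SEFD.
    """
    return sefd_jy / (efficiency * math.sqrt(2.0 * bandwidth_hz * t_int_s))


def min_flux_for_snr(snr, sefd_jy, bandwidth_hz, t_int_s, efficiency=1.0):
    """Flux density (Jy) needed to reach `snr` on one baseline per integration."""
    return snr * baseline_noise_jy(sefd_jy, bandwidth_hz, t_int_s, efficiency)


# Placeholder SEFD of 1000 Jy (hypothetical, not a VLA specification),
# 1 s integrations, and the SNR=3 limit quoted above.
wide = min_flux_for_snr(3.0, 1000.0, 128e6, 1.0)   # 128 MHz spw
narrow = min_flux_for_snr(3.0, 1000.0, 8e6, 1.0)   # 8 MHz spw
print("128 MHz: %.4f Jy, 8 MHz: %.4f Jy" % (wide, narrow))
```

Because the noise scales as 1/sqrt(bandwidth), a spw that is 16 times narrower needs a calibrator 4 times stronger to reach the same per-integration SNR under these assumptions.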


• Correct observation set-up
    – Independent of whether you want to run the pipeline!
    – Remember: simple observing set-ups are always easier to calibrate
    – Do not skimp on calibration to spend more time on your target: you may end up not being able to calibrate the target data at all
        • Spending 3 minutes pointing could buy you more sensitivity than doubling the time on your target
• Scan intents
    – The pipeline relies entirely on correct scan intents being defined in each SB
    – In order for the pipeline to run successfully on an SB it must contain, at minimum, scans with the following intents:
        • A flux density calibrator scan that observes one of the primary calibrators (3C48, 3C138, 3C147, or 3C286); this will also be used as the delay and bandpass calibrator if no bandpass or delay calibrator is defined
        • Complex gain calibrator scans
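The minimum intent requirements above can be checked before running the pipeline. The following is a minimal sketch in plain Python, assuming the scan list has already been extracted from the SB as (field name, intent string) pairs. The function name is hypothetical, the intent keywords (CALIBRATE_FLUX, CALIBRATE_PHASE, CALIBRATE_AMPLI, CALIBRATE_BANDPASS, CALIBRATE_DELAY) are assumed to match the strings stored in your data, and field names are assumed to be normalized to the plain calibrator name.

```python
# Primary flux density calibrators named in the requirements above.
PRIMARY_CALIBRATORS = {"3C48", "3C138", "3C147", "3C286"}


def check_scan_intents(scans):
    """Check a list of (field_name, intent_string) pairs.

    Returns (problems, notes): problems would stop the pipeline,
    notes describe fallbacks to the flux density calibrator.
    """
    def fields_with(keyword):
        return {field for field, intent in scans if keyword in intent}

    problems, notes = [], []
    flux_fields = fields_with("CALIBRATE_FLUX")
    gain_fields = fields_with("CALIBRATE_PHASE") | fields_with("CALIBRATE_AMPLI")

    if not flux_fields:
        problems.append("no flux density calibrator intent")
    elif not (flux_fields & PRIMARY_CALIBRATORS):
        problems.append("flux density calibrator is not a primary calibrator")
    if not gain_fields:
        problems.append("no complex gain calibrator intent")

    # Without explicit bandpass/delay intents, the flux calibrator is reused.
    for role, keyword in (("bandpass", "CALIBRATE_BANDPASS"),
                          ("delay", "CALIBRATE_DELAY")):
        if flux_fields and not fields_with(keyword):
            notes.append("no %s intent: flux density calibrator will be used" % role)
    return problems, notes


example = [
    ("3C286", "CALIBRATE_FLUX#UNSPECIFIED"),
    ("J1407+2827", "CALIBRATE_PHASE#UNSPECIFIED"),
    ("target", "OBSERVE_TARGET#UNSPECIFIED"),
]
problems, notes = check_scan_intents(example)
print(problems, notes)
```

In real data, field names are often written differently (for example with a J2000 name and an alias), so the exact-match test against the primary calibrator list would need adapting.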

Overview of the Pipeline procedures

Assuming requirements are met, the pipeline:
    – Loads the data
    – Hanning smooths**
    – Retrieves information about the observing set-up from the data
    – Applies deterministic flags (online flags, shadowed data, end channels of subbands, etc.)
    – Identifies primary calibrators and loads models
    – Derives all prior calibrations (antenna position corrections, gain curves, atmospheric opacity, requantizer gains)
    – Iteratively determines initial delay and bandpass solutions, including running RFLAG (RFI flagging algorithm), and identifying other system (deformatter) problems
    – Derives initial gain solutions, does flux density bootstrapping and derives spectral index of all calibrators

    ** May want to modify inputs and/or omit entirely for spectral line reductions


Continuing, the pipeline:
    – Derives final delay, bandpass, and gain calibrations
    – Applies all calibrations to the MS
    – Runs RFLAG algorithm on all fields, including target**
    – Runs statwt to derive proper relative weights per antenna/spw**

    ** May want to modify inputs and/or omit entirely for spectral line reductions

• Pipeline products and output
    – Flag and calibration tables
    – Calibrated MS (available for 15 days, not archived)
    – Logs, including weblog used by quality assurance (QA) staff and QA report if requested


Running the Pipeline

Assessing the Weblog

Pipeline Outputs

• The real-time pipeline produces a calibrated and flagged MS for download (follow the directions in the email from the data analysts)
    – You may request a QA2 report from the data analysts
    – If you are happy with the pipeline calibration, then:
        • Do further flagging if necessary
        • Split out your target and image
    – If you have the SDM or uncalibrated MS and the calibration and flag tables, instructions for applying flags and calibration tables may be found at https://science.nrao.edu/facilities/vla/data-processing/pipeline
• In some cases the pipeline and/or the MS may need to be modified
    – Download the SDM from the archive plus pipeline scripts
    – Follow the directions at the above link
• In some cases the pipeline heuristics may not be appropriate for your data (e.g., some L-band set-ups do not work well with the pipeline yet)
    – Reduce data by hand

Re-running the pipeline

Applying Pipeline Results

Known Issues and Workarounds

In general the pipeline does very well, but there are possible failure modes:
    – No flux density or gain calibrator intents defined, or flux density calibrator not one for which we have models
        • work around in scripted pipeline
    – Wrong scan intents
        • work around in scripted pipeline
    – Does not always identify deformatter problems (but does NOT usually have false positives; L-band may be an exception)
        • flag remaining bad spws
    – Calibrators are too weak for given spw bandwidth
        • heuristics have been developed and are currently being implemented

Spectral Line Data

• Several steps in the real-time pipeline may not be appropriate for spectral line data:
    – Hanning smoothing (increases effective channel width)
    – Last run of RFLAG on target (may eliminate your line as interference!)
    – Statwt calculates rms based on scatter of channels per spw, per visibility; may want to run manually with channel selection turned on to eliminate use of channels containing line emission in calculating the rms
• With the above modifications, the pipeline will work with spectral line data as long as the calibrators are strong enough
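Running statwt with channel selection requires a selection string that covers only the line-free channels. The sketch below builds such a string in the CASA `spw:start~end;start~end` selection syntax. The helper name is hypothetical, and the channel ranges are made-up example values; passing the result to statwt through its `fitspw` parameter is an assumption that should be checked against the statwt documentation for your CASA version.

```python
def line_free_selection(spw, nchan, line_ranges):
    """Build a CASA-style 'spw:a~b;c~d' string that skips line channels.

    spw         -- spectral window id (int)
    nchan       -- number of channels in that spw
    line_ranges -- list of inclusive (first, last) channel ranges with line emission
    """
    excluded = set()
    for first, last in line_ranges:
        excluded.update(range(first, last + 1))

    segments = []
    start = None
    for chan in range(nchan):
        if chan not in excluded:
            if start is None:
                start = chan          # open a new line-free segment
        elif start is not None:
            segments.append((start, chan - 1))  # close the segment
            start = None
    if start is not None:
        segments.append((start, nchan - 1))
    if not segments:
        raise ValueError("no line-free channels left in spw %d" % spw)

    return "%d:%s" % (spw, ";".join("%d~%d" % seg for seg in segments))


# Example: 64-channel spw 0 with line emission in channels 20-40.
sel = line_free_selection(0, 64, [(20, 40)])
print(sel)
# Inside CASA this might then be used as, e.g.:
#   statwt(vis='target.ms', fitspw=sel)   # hypothetical file name
```

Selections for several spws can be joined with commas to form a single string.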

Polarization Calibration

Mixed Correlator Setups

• With the new WIDAR capabilities it is common to observe both wide and narrow spws to obtain both continuum and spectral line data simultaneously, or multiple receiver bands
    – A single heuristic (e.g., gain calibration solution interval) for entire dataset may not be appropriate
• Solution:
    – Run pipeline through application of deterministic flags, including Hanning smoothing if you are going to use it
    – Split the MS by spw and/or scans
    – Run pipeline on split MSs WITHOUT Hanning smoothing (you have already applied it, if you are going to use it)
    – Warning: output flagging statistics may not be correct
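The "split the MS by spw" step needs a decision about which spws belong together. One possible way to organize this, sketched below in plain Python, is to group spws by bandwidth and produce one spw selection string per group. The function name is hypothetical, the 128 MHz threshold is an assumption taken from the "standard" set-up described earlier, and the bandwidth table is a made-up example that would in practice come from the MS metadata.

```python
def group_spws_by_bandwidth(spw_bandwidths_mhz, wide_threshold_mhz=128.0):
    """Group spw ids into 'wide' and 'narrow' selection strings.

    spw_bandwidths_mhz -- dict mapping spw id (int) to total bandwidth in MHz
    Returns a dict of comma-separated spw selections; a group with no
    spws maps to an empty string.
    """
    wide = sorted(spw for spw, bw in spw_bandwidths_mhz.items()
                  if bw >= wide_threshold_mhz)
    narrow = sorted(spw for spw, bw in spw_bandwidths_mhz.items()
                    if bw < wide_threshold_mhz)
    return {
        "wide": ",".join(str(spw) for spw in wide),
        "narrow": ",".join(str(spw) for spw in narrow),
    }


# Example: two continuum spws and two narrow line spws (made-up values).
groups = group_spws_by_bandwidth({0: 128.0, 1: 128.0, 2: 8.0, 3: 8.0})
print(groups)
# Inside CASA, each non-empty selection could then be given to the split
# task, e.g. split(vis='all.ms', outputvis='wide.ms', spw=groups['wide'], ...)
# with hypothetical file names.
```

Multi-band observations could be grouped in the same way using the receiver band instead of the bandwidth as the key.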

Special Cases

Scripted Pipeline

Using the NRAO cluster batch processing