Writing a CASA Task

From CASA Guides
Mkrauss (talk | contribs)

Latest revision as of 00:53, 17 January 2012

Overview

In CASA, it is relatively simple to create your own tasks -- these can be as simple or as complex as you desire, and with the ease of scripting in Python, the possibilities are almost endless. You can also configure your CASA setup to automatically load these tasks on startup, so that they're always available to you.

Writing a task isn't all that much more work than writing a script, and it gives you the same functionality along with the standard task interface (a parameter listing via inp, inline help, and so on).

This tutorial will step you through the process of creating a task, and also point you to some examples of other tasks (some within CASA itself, and others written by non-developers). We hope you find this useful in creating your own!

The basics of writing and running a task

A task comprises an XML file, in which the parameter interface and help file are coded, and a Python file, in which the main body of code lives. They must follow a particular naming scheme, so that the program which creates all the task files for CASA to read (called buildmytasks) knows what it's looking for. So, if you want to write a task called newtask, you will need to have:

newtask.xml
task_newtask.py
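
The naming scheme is easy to check before running buildmytasks. The following is a minimal sketch in plain Python, not part of CASA; the helper names expected_task_files and missing_task_files are hypothetical and exist only for this illustration.

```python
import os


def expected_task_files(taskname):
    """Return the (XML, Python) file names expected for a task."""
    return (taskname + '.xml', 'task_' + taskname + '.py')


def missing_task_files(taskname, directory='.'):
    """List the expected files that are not present in `directory`."""
    return [name for name in expected_task_files(taskname)
            if not os.path.isfile(os.path.join(directory, name))]


print(expected_task_files('newtask'))
```

Calling missing_task_files('newtask') in the task directory returns an empty list when both files are in place.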

The XML file

Getting the XML file right can sometimes be a bit tricky, since the parser is very particular. Here is a very simple example:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" ?>

<casaxml xmlns="http://casa.nrao.edu/schema/psetTypes.html"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://casa.nrao.edu/schema/casa.xsd
file:///opt/casa/code/xmlcasa/xml/casa.xsd">

<task type="function" name="newtask" category="editing">

  <shortdescription>Does nothing in particular.</shortdescription>
      
  <description>As I said, this task doesn't really do much.</description>

  <input>

    <param type="string" name="imagein" mustexist="true">
      <description>Input image name</description>
      <value></value>
    </param>

    <constraints>
    </constraints>

  </input>

  <returns type="void"/>

  <example>

   As I said, this task doesn't do much.

   Keyword arguments:

	imagein -- name of input image file

  </example>

</task>

</casaxml>

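Note that every element must be closed, either by a matching closing tag (such as </param>) or by ending the tag itself with "/" (such as <returns type="void"/>).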

  • <shortdescription>: appears in the 'taskhelp' listing
  • <description>: I'm really not sure where this goes. If anyone can figure it out, let me know!
  • <input>: this is where the parameters which appear when you type "inp newtask" go
    • <param>: each param/description/value set defines a particular parameter.
      • In this case, we have defined that the parameter imagein needs to be of type "string" (other options include "int", for integer; "bool", for boolean True/False; "double", for double-precision floating point; or "any", for anything)
      • Setting mustexist="true" means that CASA will check that this file actually exists before proceeding (and exit with an error if it doesn't)
      • <description>: this is the information that follows the hash (#) in the parameter list
      • <value>: a default value to set for this parameter
  • <constraints>: this is where you would enter any constraints you want for the input parameters. It's not necessary to add any, but can be a nice way to control sub-parameters.
  • <example>: this is where the text for "help newtask" should be placed.
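
Since the parser is very particular, it can help to confirm that the XML is at least well formed before running buildmytasks. The sketch below uses only Python's standard xml.etree.ElementTree module; it does not validate against the CASA schema, and the function name list_params is hypothetical. The namespace string is copied from the casaxml element in the example above, and the embedded XML is a cut-down version of that example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <casaxml> element in the example above
CASA_NS = '{http://casa.nrao.edu/schema/psetTypes.html}'


def list_params(xml_text):
    """Parse task XML and return (name, type) for each <param>.

    Raises xml.etree.ElementTree.ParseError if the XML is not well formed.
    """
    root = ET.fromstring(xml_text)
    return [(p.get('name'), p.get('type'))
            for p in root.iter(CASA_NS + 'param')]


# A shortened copy of the example, for illustration only
EXAMPLE = """<casaxml xmlns="http://casa.nrao.edu/schema/psetTypes.html">
<task type="function" name="newtask" category="editing">
  <input>
    <param type="string" name="imagein" mustexist="true">
      <description>Input image name</description>
      <value></value>
    </param>
  </input>
  <returns type="void"/>
</task>
</casaxml>"""

print(list_params(EXAMPLE))
```

An unclosed tag shows up here as a ParseError with a line and column number, which is usually quicker to interpret than a failed build.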

The Python file

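This is where the "meat" of the task goes. In addition to the actual code, it requires a few standard elements: an import of taskinit and a function named after the task. Again, the simplest example (note that it takes a parameter called myinput, so the XML file for this version of the task would presumably need to declare a param named myinput rather than imagein):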

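from taskinit import *  # provides casalog and the CASA tools

def newtask(myinput = None):

    # Label subsequent log messages as coming from this task
    casalog.origin('newtask')
    # str() avoids a TypeError if myinput is left at its default of None
    casalog.post('Remember to '+str(myinput))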
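
To try out the body of a task outside CASA, one option is to substitute a stand-in for casalog. The following is only a sketch: FakeLog is a hypothetical class written for this illustration, and it mimics just the two calls used above (the real casalog accepts more arguments).

```python
class FakeLog(object):
    """Hypothetical stand-in for casalog that records messages in a list."""

    def __init__(self):
        self.task = None
        self.messages = []

    def origin(self, name):
        # Remember which task subsequent messages belong to
        self.task = name

    def post(self, message):
        self.messages.append((self.task, message))


# Inside CASA, `from taskinit import *` supplies the real casalog instead
casalog = FakeLog()


def newtask(myinput=None):
    casalog.origin('newtask')
    casalog.post('Remember to ' + str(myinput))


newtask('save your work')
print(casalog.messages)
```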

Building the task

Once you have the XML and Python files, run buildmytasks to compile them into a form that can be imported into CASA. This can be done on the command line (by simply typing "buildmytasks"), or within a CASA session:

# In CASA
os.system('buildmytasks')

After buildmytasks has completed, you will see that there are a number of new files, e.g. "newtask.py-e", "newtask_cli.pyc", etc. Also, there is now a file called "mytasks.py", which is the meta-file for all the tasks you might have in this directory. This is what you import into CASA to include these tasks:

# In CASA
execfile('/<path_to_task_directory>/mytasks.py')

Note that if you rebuild a task after having already imported it into CASA, you will need to restart CASA and reimport for the changes to be incorporated.

Running the task

Now that the task has been imported into CASA, it can be run like any other task. For example,

# In CASA
inp newtask
help newtask
myinput = '"Stay hungry. Stay foolish." - Steve Jobs'
go

That's it -- a new task!

Using the CASA toolkit

At their most basic level, CASA tasks are Python wrappers for the deeper functionality available via the CASA "toolkit". (This is not to say the wrappers are always simple -- see the section "Exploring the built-in CASA tasks" for a look at the Python code in the built-in tasks.)

Although it's possible for a user-created task to call other tasks, ideally, it calls the toolkit for functionality. This can help optimize performance, as well as take advantage of options that might not exist within the framework of currently existing tasks.

A good place to learn about the available toolkit functions is in the CASA Toolkit Reference Manual. Note that the toolkit is subdivided into different "tools", and in CASA, these are called using an abbreviation of the tool along with the name of the function. For example, to call a function from the image tool, type

# In CASA
ia.<tool_function_name>

Typing "ia." then hitting <TAB> will display all the possibilities.

The most complete help files can be found in the CASA Toolkit Reference Manual at the moment; we're working to get all this information into the inline CASA help as well. Note that the functions are not listed in alphabetical order within the tools. This means you might need to search a bit to find what you're looking for. Also, if you find documentation that's out of date or missing, please let us know via the NRAO Helpdesk so that we can fix it!

Here is a table of some common tool names, and their abbreviations within CASA:

Toolkit tool     CASA abbreviation
image            ia
calibrater       cb
componentlist    cl
coordsys         cs
flagger          fg
imagepol         po
imager           im
measures         me
ms               ms
msplot           mp
quanta           qa
regionmanager    rg
simulator        sm
table            tb
vpmanager        vp

While CASA tasks are written in Python, the tools and other lower-level functionality are written in C++. This makes it substantially more difficult to create and incorporate your own tools in CASA, and we won't go into it here.

A typical sequence of tool calls opens the desired table or MS, operates on it, then closes it again. For example, to perform flux scaling with the calibrater tool, the commands would look like this:

# In CASA
cb.open('ngc5921.ms')
cb.selectvis(field='1331*,1445*')
cb.setsolve(type='G', table='gcal', t='inf')
cb.solve()
cb.fluxscale(tablein='gcal', tableout='flxcal',
             reference='1331*', transfer='1445*')
cb.close()
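
If one of the intermediate calls raises an error, the final close is never reached and the tool is left open. A common Python idiom for the open/operate/close pattern is try/finally. The sketch below illustrates it with FakeTool, a hypothetical stand-in class so that the example runs outside CASA; with_tool is likewise a made-up helper name, not a CASA function.

```python
class FakeTool(object):
    """Hypothetical stand-in for a CASA tool such as cb or tb."""

    def __init__(self):
        self.is_open = False

    def open(self, name):
        self.is_open = True

    def close(self):
        self.is_open = False


def with_tool(tool, name, operation):
    """Open `name`, run `operation(tool)`, and always close the tool."""
    tool.open(name)
    try:
        return operation(tool)
    finally:
        # Runs whether or not operation() raised an exception
        tool.close()


def failing_step(tool):
    raise RuntimeError('solve failed')


tool = FakeTool()
try:
    with_tool(tool, 'ngc5921.ms', failing_step)
except RuntimeError:
    pass
print(tool.is_open)
```

Inside a task, the same structure would wrap the real tool calls, so that the tool is closed even when a step fails.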

Unlike calling a task within a task, which requires a special import at the beginning of the task (see the section "Calling CASA tasks"), calling a tool does not require anything beyond what is shown in the section "The Python file".

Calling CASA tasks

Although it is better to only use toolkit calls within a task, there will be times when it's much easier to make a call to another task instead. In this case, the task to be called should be imported into the task being written in the top block of code thus:

from taskinit import *
from flagdata_cli import flagdata_cli as flagdata

In this case, the flagdata task is imported; since this is a particularly complex task, it might make sense to import it rather than relying on the toolkit for its functionality.

Simple example: mkmodelimage

Here is a very simple, yet useful, task to fill the MODEL_DATA column of a desired MS with a point-source model offset from phase center. This is useful for self-calibrating on a bright off-center point source without relying on a deconvolved image (which might not, for example, represent the source as point-like, even if it is).

As an intermediate step, mkmodelimage creates a model image with the value of the requested pixel set to the desired level. Running ft to fill the MODEL_DATA column is a separate step, and may be skipped if not desired.

This task uses both toolkit calls and a call to another task to accomplish its goal.

The inputs are:

#  mkmodelimage :: Convert an existing image into a point source model image, then run ft to fill the model data column in the desired MS.
imagein             =         ''        #  Input image name
imageout            =         ''        #  Output model image name
xcoord              =          0        #  x pixel coordinate
ycoord              =          0        #  y pixel coordinate
fluxval             =        1.0        #  Point source flux (Jy)
doft                =       True        #  Run ft to set MODEL_DATA in visibility file?
     vis            =         ''        #  Name of input visibility file

async               =      False        #  If true the taskname must be started using mkmodelimage(...)
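
The core of the intermediate step, a model image in which one pixel carries the requested flux, can be sketched in plain Python. This is only an illustration of the pixel logic using nested lists; the actual task operates on an existing image (presumably through the image tool), and the function name point_source_pixels and the [x][y] indexing are assumptions made for this example. The argument names mirror the xcoord, ycoord and fluxval inputs listed above.

```python
def point_source_pixels(nx, ny, xcoord, ycoord, fluxval):
    """Return an nx-by-ny grid of zeros with one pixel set to fluxval.

    The grid is indexed as pixels[x][y]; this layout is an assumption made
    for the illustration, not a statement about CASA image axis order.
    """
    if not (0 <= xcoord < nx and 0 <= ycoord < ny):
        raise ValueError('pixel (%d, %d) lies outside the image'
                         % (xcoord, ycoord))
    # Build each column separately so the rows do not share storage
    pixels = [[0.0] * ny for _ in range(nx)]
    pixels[xcoord][ycoord] = fluxval
    return pixels


model = point_source_pixels(4, 4, xcoord=1, ycoord=2, fluxval=1.0)
print(sum(sum(column) for column in model))
```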

You can access the Python task file here: File:Task_mkmodelimage.py, and the XML file here: File:Mkmodelimage.xml. Note that to build these, you will need to rename the files with lowercase first letters (the wiki on which CASA Guides are based automatically capitalizes them).

Complex example: mkpipeline

This task is a basic pipeline for processing EVLA continuum data. The task itself comprises over 2500 lines of code, and has a number of different options:

#  mkpipeline :: Task for checking and processing EVLA continuum data.
mode                = 'initproc'        #  Processing mode (initproc, reproc)
     sdmname        =         ''        #  Name of input SDM directory for processing
     rootname       =         ''        #  Root name for output files
     band           =         ''        #  Band name (L, S, C, X, Ku, K, Ka, Q)
     dummy          =      False        #  Are the first two scans within this band dummies, to be ignored?
     checkzeros     =       True        #  Check data for presence of zeros?
     timeave        =       True        #  Average data in time?

flaglist            =         ''        #  List of flagging commands (flagcmd formatted)
doplot              =       True        #  Plot raw data and calibration products?
calimage            =       True        #  Image the calibration sources?
email               =         ''        #  Email address to send notification when pipeline has completed
webdir              =         ''        #  Web space to which output HTML pages and plots will be copied
http                =         ''        #  Root http address where plots will appear
async               =      False        #  If true the taskname must be started using mkpipeline(...)

The help file provides a summary of what it does. In this case, having a nice framework for the documentation (which is provided, since it's a task) is very useful.

Among other things, this task can provide good examples of the following:

  • Importing other tasks
  • Importing and using useful Python modules, such as numpy
  • A more complex example of the XML file
  • Extensive use of the CASA log; redirecting and labeling its output
  • Defining and calling member functions
  • Writing web pages and sending email from within a task
  • Querying CASA tables for needed information
  • Determining image parameters in an automated fashion
  • Automated plot generation

Note that this is not an official EVLA pipeline. However, it may be useful in quick processing for first results, and also as an example for developing your own pipeline processing tasks.

You can find the Python task file here: File:Task_mkpipeline.py, and the XML parameter file here: File:Mkpipeline.xml. Again, be sure to change the first letters to lowercase before running buildmytasks; they were set to uppercase by the wiki page uploader.

Exploring the built-in CASA tasks

One great way to learn the tricks (and understand CASA more completely) is to look at the tasks within CASA. Finding the directories where the XML and Python code live can be a little tricky, since it varies for different operating systems. First, find the directory where the CASA installation lives:

# In CASA
os.environ.get('CASAPATH').split()[0]

Then, in a terminal in this directory, type:

find . -name task_clean.py
find . -name clean.xml

This should point you to the directories where the relevant files live for all CASA tasks. Have fun exploring!
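
The same search can be done from Python without leaving the session. The following is a minimal sketch using the standard os.walk function; find_files is a hypothetical helper name, and the search is only attempted when the CASAPATH environment variable is set, as it is inside CASA.

```python
import os


def find_files(root, filename):
    """Return paths of all files called `filename` below `root`."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return matches


# CASAPATH is normally only set in a CASA environment; its first
# whitespace-separated field is the installation directory.
casapath = os.environ.get('CASAPATH')
if casapath:
    for path in find_files(casapath.split()[0], 'task_clean.py'):
        print(path)
```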

Automatically loading tasks on startup

So now you've written a task, and want to always have it available to you when you start CASA. There's a file that CASA will source every time on startup, which resides in your home directory in the hidden subdirectory .casa. Go to this directory, and create a file called init.py. In this file, put the line:

execfile('/<path_to_task_directory>/mytasks.py')

(In other words, the same command you used to load the task(s) into CASA before.) Note that this can sometimes cause problems, in case you update your version of CASA and it's incompatible with some tasks you have. However, it's easy enough to comment out this line before starting CASA, if you desire.
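
One way to soften such problems is to guard the load, so that a missing or broken mytasks.py produces a message instead of an error at startup. The sketch below is an assumption about how one might do this, not an official CASA recipe; load_user_tasks is a hypothetical name. It uses exec(compile(...)) rather than execfile: execfile is a Python 2 built-in, while this form behaves the same way and also runs under Python 3.

```python
import os


def load_user_tasks(path, namespace):
    """Execute the file at `path` in `namespace`; return True on success.

    A missing file, or an error raised while executing it, is reported
    and skipped rather than aborting startup.
    """
    if not os.path.isfile(path):
        print('Skipping %s: file not found' % path)
        return False
    try:
        with open(path) as handle:
            code = compile(handle.read(), path, 'exec')
        exec(code, namespace)
    except Exception as error:
        print('Skipping %s: %s' % (path, error))
        return False
    return True


# Placeholder path, as in the text above; replace it with your own.
load_user_tasks('/<path_to_task_directory>/mytasks.py', globals())
```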