PythonBasics: Difference between revisions
Revision as of 20:04, 12 October 2011
We are porting, updating, and editing this [1]
Preface
This guide is intended as a practical introduction to python for astronomers using CASA. Along with the accompanying pages, we hope that this introduction gets you comfortable enough to try taking advantage of CASA's python shell to script your data reduction or integrate your analysis, observation planning, and reduction. Despite the CASA focus, we aim to make this guide useful to any astronomer looking for a first dive into python. Given that similar material exists across the web we will not attempt a comprehensive introduction, just a tour of most of the basic python functionality needed to dive into CASA and python data analysis. If this guide whets your appetite then we recommend digging into the pages linked off PythonOverview.
Environment
We could write a lot on setting up python, a nice shell, installing key packages, etc. Because this is a CASA guide, we can short circuit some of this. CASA installs with its own nice iPython shell and a core set of third-party packages (numpy, scipy, pylab, matplotlib). We will assume that you have downloaded and installed CASA from here and return to the issue of additional third-party software down the road.
Once you have CASA installed you can start a new session from your shell prompt by just typing
casapy
Depending on your computing environment you likely also have python and perhaps iPython installed, but the versions of these (especially the pre-installed version on the Mac) can vary widely. If you are interested in a non-CASA distribution that folds in a similar (in fact more extensive) suite of packages you might look at the free academic version here (no promises). We focus our discussion on the version of python that comes with CASA (2.6) and the examples will assume that you are working inside of a casapy shell.
Pasting Code
We will fairly rapidly reach the point where we want to cut and paste somewhat complex blocks of code. Python's use of indentation to implement code structure can make for awkward interactions with the shell at times. Even when it goes well, after cutting and pasting a 'for' loop you often have to press return twice to execute it. In very complex cases, you may end up needing to paste a line at a time.
Fortunately, the iPython shell offers a very nice way around these challenges. You can type
cpaste
to initiate a 'code paste'. You will then be able to paste code directly into the shell and the code will appear exactly as you copied it. When you have pasted all of your code, go to a new line and type "--". This ends the code paste and the pasted code should then execute. Use this (really!), it will save you a large amount of pasting-frustration.
Getting Help, Exploring Objects, the Shell
CASA's shell (and iPython shells in general) has a lot of useful functionality. For example, CASA knows some basic operating system commands like "ls", "cat", etc. CASA only knows a few of these but you can issue any commands to your Unix/Linux shell directly by prefacing those commands with an exclamation point. So this works:
ls
and so does this:
!ls -lh
this does not (because CASA does not know "df"):
df -h
but this does:
!df -h
The shell has many other nice features. For example, if you type the name of a variable, it will simply print the variable value.
You can readily get help on most objects via "help thisorthat," for example
help list
Along with "help," the other major exploration capability in the shell is tab-completion. This will let you explore what variables are defined in your shell. Even better, you can take any object, append a ".", hit tab, and the shell will show you a list of associated methods.
Try this now by typing "list." and hitting tab. The shell will print all the things that a list can do. In addition to a bunch of system-defined names that begin and end with "__" (e.g., see the equality operator "__eq__") you can see that list has a set of methods "append", "count", "extend", "index", ... "sort".
If you saw that list.append exists and wanted to learn a bit about it you could type
help list.append
We're getting a bit ahead of ourselves, but try to bear these exploratory capabilities in mind as you look through the guide. They are a great way to poke around and find new functionality.
One very important thing to bear in mind as you begin more advanced programming is that the shell-specific capabilities are not available outside the shell. In practice this means that you cannot, for example, print the value of a variable by just typing its name inside a program.
Simple Variables
Let's begin by creating and manipulating some basic variables. At the python shell prompt create your first variable by typing
my_first_var = 1
That's it, you've created your first variable. Note that you didn't have to specify the data type, python works this out from context. To see the value of your variable type
print my_first_var
or (if and only if you are at the shell) you can see the value by just typing the variable name
my_first_var
(read: use 'print' when programming).
You can see the type of the variable by typing
print type(my_first_var)
Now try the same thing with some other data types.
# Make and print a string
a_string = "A String"
print a_string, type(a_string) # a string
Note that this first line does nothing. Python's comment string is "#" and it ignores everything in a line that comes after this character.
Some similar examples for floating point numbers:
type(3.14159) # a float
a_float = 1e-10
print type(a_float) # scientific notation
Notice the in-line comment and the on-the-fly creation of a value (without naming a variable) in the first line.
Finally, python has a 'boolean' type that is either 'True' or 'False' (shorthand 'T' or 'F' in CASA, though not iPython in general).
type(True) # a boolean
boo = False
print boo, type(boo)
You can chain variable creation:
x = y = 1
print x, y
x = y = v = u = t = 0
print u, t
Basic Math
We have seen how to create variables. Now we will manipulate them with basic math operations.
Begin by creating a new variable with value 1.
x=1
and print the value after we add 2.
print x+2
We can assign the output of the operation to a new variable, which now exists in addition to x
y = x + 2
print y
z = x - 3
print z
In-place operators work, e.g., "x += 3" will add three to the value of x and store the result back in x
x = 1
print x
x += 3
print x
x *= 10
print x
x /= 20
print x
Exponents are done with "**", so for example
x = 2
y = x**2
print x, y
or to take the square root
x = 4
print x**0.5
Ints and Floats
We saw that python assigns a new variable the appropriate type based on context. Overall this is a nice feature, it lets us quickly develop scripts and programs without a lot of traditional programming overhead. However, there are a few rules that we need to keep in mind.
The first point to see is that dividing two integers returns an integer that is the "floor" of the result (the value rounded down to the nearest integer, i.e., toward negative infinity, so negative results round away from zero). For example, declare a new integer x and divide it by 20:
x = 1
print x / 20
or try these
print 19 / 20
print -1 / 20
print -19 / 20
That's the division of two integers. If we divide a float and an integer, on the other hand, we get a float:
x = 1
print type(x)
print x / 1.5
print type(x/1.5)
the operation implicitly casts the result as a float.
Explicit casts are okay and a reasonable way around this. Note that int() does not floor: it truncates toward zero, so int(-0.6) is 0 whereas integer division rounds down. For example:
x = 0.6
int_version = int(x)
int_version # remember this prints in the shell
Similarly, float() casts as a float:
float_version = float(int_version)
float_version
The order does matter. Notice the difference between these two cases:
print float(19)/20
print float(19/20)
The takeaway for the average CASA user here is: be careful tossing integers around in math because things may unexpectedly round on you. When in doubt, you should probably make things floats.
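The difference between integer division and an explicit cast is easiest to see with negative numbers. The short sketch below is an illustration rather than part of the original guide. It assumes the "//" operator, which performs floor division on integers in python 2.6 and in later versions alike, so the results do not depend on which python you run.

```python
# Floor division rounds down (toward negative infinity).
floor_pos = 19 // 20      # 0
floor_neg = -19 // 20     # -1

# int() drops the fractional part (rounds toward zero).
trunc_pos = int(0.95)     # 0
trunc_neg = int(-0.95)    # 0

# Converting one operand to a float first keeps the fraction.
ratio = float(19) / 20    # 0.95

print("floor division: %d %d" % (floor_pos, floor_neg))
print("int(): %d %d" % (trunc_pos, trunc_neg))
print("float division: %.2f" % ratio)
```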
Booleans
Python includes booleans as a basic data type. These are variables that have two possible values: True or False (for CASA the shorthands "T" and "F" work as well). You can declare booleans efficiently:
b = True
print b
b = False
print b
Booleans are particularly interesting as the output of many operators. For example:
b = 1 == 1
print b
or
c = 1 != 1
print c
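Comparisons can also be combined with "and", "or", and "not". The following is a small sketch with made-up values, meant only to show how the pieces fit together.

```python
x = 5

# Each comparison yields a boolean; "and" requires both to be True.
in_range = (x > 0) and (x < 10)

# "not" flips a boolean.
out_of_range = not in_range

# "or" requires at least one side to be True.
either = (x < 0) or (x == 5)

print("in_range: %s" % in_range)
print("out_of_range: %s" % out_of_range)
print("either: %s" % either)
```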
Deleting Variables
Python variables at the shell level will hang around forever until you declare a variable with the same name or actively get rid of them (by contrast variables declared inside of a procedure will go away when the procedure ends). You can get rid of a variable (module, object, whatever) like so:
x = 1
del x
print x
and just like that x is gone. Be careful with this.
Checking Whether a Variable Exists
We just gave a practical example of using "print" to check variable existence. But this sort of catastrophic approach isn't ideal. A more robust way to check if a variable exists is:
try:
    the_answer
except NameError:
    print "Does not exist!"
The try statement accepts an "else":
the_answer = 42.
try:
    the_answer
except NameError:
    print "Does not exist!"
else:
    print "We're cool."
Now try the same thing after deleting your variable:
del the_answer
try:
    the_answer
except NameError:
    print "Does not exist!"
else:
    print "We're cool."
This kind of approach gives you the ability to write non-explosive scripts. There are other ways to approach this. If you are interested you may want to read up further on the locals and globals python dictionaries.
Data Collections: Lists
Python has several built in types of data collections. Here we'll look at "lists", which are loose collections of potentially mixed types of data. You can declare a list by placing a comma-delimited list of variables inside a set of square brackets. For example, let's create a list called "bands":
bands = ['L', 'S', 'C', 'X', 12]
One potentially surprising feature of python lists is that they freely mix and match data types. Notice that "bands" mixes strings and integers. You can print the type and number of elements of the list via
print type(bands)
print len(bands)
Simple Slicing
You "slice" a list, accessing parts of the data using square brackets:
bands = ['L', 'S', 'C', 'X', 12]
print bands[0]
print bands[4]
Notice that python is "zero-indexed" which means that for a 5-element list like "bands" the indices run from 0 (first) to 4 (last). As a result this will not work
print bands[5]
because there is no element 5.
Also notice that python allows you to use negative indices to index "backwards" from the last element of the list
print bands[-1]
print bands[-2]
You can pick out a range of indices using a ":" and the following syntax
print bands[1:3]
Notice that it prints elements 1 and 2 but not element 3 of the list. So "n:m" retrieves elements n through m-1.
Similarly "0:3" will retrieve elements 0, 1, and 2. For example
print bands[0:3]
A free ":" retrieves everything along the axis:
print bands[:]
while ":n" implicitly takes "0" as the initial index and "n:" runs n through the last index. For example:
print bands[2:]
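As a compact recap of the slicing rules, here is a small sketch. The optional third number in a slice (the step) is not covered in the text above and is included here as an extra. The variable names are just for illustration.

```python
bands = ['L', 'S', 'C', 'X', 12]

first_three = bands[:3]     # same as bands[0:3]
last_two = bands[-2:]       # negative indices work in slices too
every_other = bands[::2]    # an optional third number sets the step
copy_of_bands = bands[:]    # a new list holding the same elements

print(first_three)
print(last_two)
print(every_other)
print(copy_of_bands is bands)   # False: the slice is a separate list
```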
Assignment
These same tricks can be used for assignment. We'll recreate bands and then replace the last element:
bands = ['L', 'S', 'C', 'X', 12]
bands
bands[4] = "Ku"
print bands
or use the more complex slicing to replace the first three elements:
bands[0:3] = [1.4,3.0,6.0]
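Notice that the right side of a slice assignment needs to be a list (or another sequence), not a single value; its length does not have to match the slice, in which case the list grows or shrinks. So this will not work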
bands[0:3] = 3.0
In addition to replacing elements you can add to the list like so:
bands.append("Ka")
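To make the behavior of slice assignment concrete, here is a minimal sketch. It also shows "extend", which appears in the list of methods seen earlier via tab-completion. The extra band names are only sample values.

```python
bands = ['L', 'S', 'C', 'X', 'Ku']

# Replace three elements with two: the list shrinks by one.
bands[0:3] = [1.4, 3.0]
after_shrink = list(bands)

# Assigning a bare number to a slice raises a TypeError.
try:
    bands[0:2] = 3.0
    error_raised = False
except TypeError:
    error_raised = True

# append adds one element; extend adds each element of another list.
bands.append("Ka")
bands.extend(["Q", "W"])

print(after_shrink)
print(error_raised)
print(bands)
```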
Exploring List Functionality (Back to the Shell)
Lists have several neat abilities in addition to "append." If you don't remember them offhand you can easily access them in the interpreter. Just type in "bands." and hit <tab>. iPython will offer you the available autocompletions. Leave aside all those __ items and look at the rest. We see "append," "count," ... "sort."
help(bands.count)
so bands.count() counts the number of occurrences of a value. Try it out:
bands.count("Ka")
# play with the other functions
# ... get back the most recent addition and simultaneously remove it
# from the list
bands.pop()
# ... put it back so that the next steps can find it
bands.append("Ka")
# ... find the first "Ka"
bands.index("Ka")
# ... looking for a value that is not in the list raises a ValueError
bands.index("Q")
# ... put K before Ka
bands.insert(bands.index("Ka"),"K")
bands
# You can explore the other abilities, removal, sorting, reversing
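The remaining list methods can be tried in the same spirit. The sketch below uses a fresh list with a repeated entry (sample values only) and catches the error from a failed lookup so that it runs from start to finish.

```python
bands = ["X", "Ku", "Ka", "L", "Ka"]

n_ka = bands.count("Ka")        # how many times "Ka" appears
first_ka = bands.index("Ka")    # position of the first "Ka"

# index raises a ValueError when the value is absent.
try:
    bands.index("Q")
    has_q = True
except ValueError:
    has_q = False

bands.remove("Ka")              # removes only the first "Ka"
bands.sort()                    # sorts in place and returns None
sorted_bands = list(bands)      # keep a copy of the sorted order
bands.reverse()                 # reverses in place

print("count=%d index=%d has_q=%s" % (n_ka, first_ka, has_q))
print(sorted_bands)
print(bands)
```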
A First Look at Iteration
We'll see much more of this, but one of the most important aspects of a list is that you can easily iterate over it. This simple syntax will get each element in bands, in order, print its value, and tell you what type it is.
for element in bands:
    print element, type(element)
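If you also want the position of each element while looping, the built-in enumerate() hands back index and value together. This is a small sketch that goes slightly beyond the loop above. The list "type_names" is a made-up helper used to collect results.

```python
bands = ['L', 'S', 'C', 'X', 12]

# Collect the type name of each element while looping.
type_names = []
for element in bands:
    type_names.append(type(element).__name__)

# enumerate() yields (index, element) pairs.
for index, element in enumerate(bands):
    print("%d: %s" % (index, element))

print(type_names)
```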
Unpacking
Again we'll see more of this but lists and other collections like tuples (again seen elsewhere) have the ability to "unpack." This means that they can be compactly assigned to a series of variables like so:
bands = ["X","Y","Z"]
a, b, c = bands
print a
print b
print c
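Two related points are worth a quick sketch: unpacking makes it easy to swap variables, and the number of names on the left must equal the number of elements on the right. The flag "mismatch_ok" is just an illustrative name.

```python
bands = ["X", "Y", "Z"]
a, b, c = bands

# Swap two variables in one line by packing and unpacking.
a, c = c, a

# Too few names on the left raises a ValueError.
try:
    p, q = bands
    mismatch_ok = True
except ValueError:
    mismatch_ok = False

print("%s %s %s" % (a, b, c))
print(mismatch_ok)
```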
Data Collections: Dictionaries
Dictionaries are another basic python data collection. Dictionaries store data in pairs, with a set of unique keys each mapped to some value (the values do not have to be unique). Dictionaries are extremely useful in scripting. For example, imagine that we are reducing a complex set of observations and want to be able to look up the calibrator associated with a given source.
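A minimal sketch of such a lookup might read as follows. The source and calibrator names are hypothetical placeholders, not real targets, and the variable names are chosen only for illustration.

```python
# Hypothetical source -> calibrator mapping, for illustration only.
calibrator = {"source_a": "cal_1", "source_b": "cal_1", "source_c": "cal_2"}

# Look up the value stored under a key.
cal_for_a = calibrator["source_a"]

# Add a new pair by assigning to a new key.
calibrator["source_d"] = "cal_3"

# Test for a key with "in"; get() supplies a fallback if it is absent.
has_e = "source_e" in calibrator
fallback = calibrator.get("source_e", "unknown")

# Loop over the keys in sorted order.
for source in sorted(calibrator.keys()):
    print("%s -> %s" % (source, calibrator[source]))
```

Note that two sources share "cal_1" here: keys must be unique but values need not be.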
Control Flow
If
While
For
Break/Continue
More Complex Programs
Executing Scripts
The most basic way to execute a set of python commands (aside from just copying and pasting to the shell) is to use the execfile command. Calling execfile('myscript.py') from inside a python shell will execute 'myscript.py' one line at a time. You can use this to run a series of reduction commands or other simple scripts. In fact, calling execfile on one or more scripts will almost certainly be sufficient to script most basic CASA data reductions.
A simple example
You can combine the control flow that we learned above with execfile to refine your scripts. For example, you might have a sophisticated reduction path that requires a few user inputs, which could be collected at the top of the script as variables. The reduction might have several discrete parts, which you could turn on or off using booleans and if statements. As an example, try creating a file that holds the following:
An example with a bit of control flow.
As you edit the variables and booleans, various parts of the script will run tuned by the variables you set. This simple but powerful approach can (if you desire) form the infrastructure for a lot of your CASA reduction scripting.
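One possible layout for such a file is sketched below. All names (the data set, the reference antenna, the step flags) are hypothetical, and the print calls stand in for the actual CASA task calls, which are not shown here.

```python
# --- user inputs (hypothetical names) ---
vis_name = "my_data.ms"
ref_antenna = "ea01"

# --- switches for each part of the reduction ---
do_flagging = True
do_calibration = False
do_imaging = True

steps_run = []   # record of which parts actually ran

if do_flagging:
    # The flagging commands would go here.
    print("Flagging %s" % vis_name)
    steps_run.append("flagging")

if do_calibration:
    # The calibration commands would go here.
    print("Calibrating %s with reference antenna %s" % (vis_name, ref_antenna))
    steps_run.append("calibration")

if do_imaging:
    # The imaging commands would go here.
    print("Imaging %s" % vis_name)
    steps_run.append("imaging")

print("Steps run: %s" % ", ".join(steps_run))
```

Saved as 'myscript.py', this could be run from the casapy prompt with execfile('myscript.py'). Flipping a boolean at the top turns the corresponding part on or off.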
Functions
Python allows you to define functions either from the command line (or an execfile call) or as part of modules.
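As a first taste, here is a minimal sketch of defining and calling functions. The function names, arguments, and the sample frequency range are invented for illustration.

```python
def band_center(freq_low, freq_high):
    """Return the midpoint of a frequency range, in the input units."""
    # Dividing by 2.0 (a float) avoids integer division surprises.
    return (freq_low + freq_high) / 2.0


def describe_band(name, freq_low, freq_high, unit="GHz"):
    """Build a one-line text summary; 'unit' has a default value."""
    center = band_center(freq_low, freq_high)
    return "%s band: center %.1f %s" % (name, center, unit)


summary = describe_band("L", 1.0, 2.0)
print(summary)
```

The "def" line names the function and its arguments, the indented block is the body, and "return" hands a value back to the caller. Arguments with defaults, like "unit" here, can be omitted when calling.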