NumpyBasics: Difference between revisions
Revision as of 19:53, 1 November 2011
Back to the PythonOverview.
Introduction to NumPy
The built in python collections leave something to be desired for serious astronomical number crunching. Almost all current astronomical data processing inside python takes advantage of the numpy libraries. These are third party routines packaged with CASA that allow efficient definition and manipulation of matrices (or images or data cubes). This CASAguide will show you how to import this key module, build an array, do basic math, and cleverly access the contents of the array.
An overriding concern to keep in mind as you explore numpy is that numpy is fast and python is slow. What we mean is that the same operation using one of numpy's built in (C) functions will run orders of magnitude faster than using python to carry out the same operation via looping. If you have worked with arrays in IDL, MatLab, or a similar language before this approach should be familiar.
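As a rough illustration of that speed difference, here is a minimal sketch that squares the same set of numbers with a python loop and with a single numpy operation. The array size and the variable names are arbitrary choices for this example, and the timings you see will depend on your machine.

```python
import time
import numpy as np

n = 200000  # arbitrary size chosen for illustration

# Pure python approach: visit the elements one at a time.
start = time.perf_counter()
loop_result = [i * i for i in range(n)]
loop_seconds = time.perf_counter() - start

# NumPy approach: one operation applied to the whole array at once.
values = np.arange(n, dtype=np.int64)
start = time.perf_counter()
numpy_result = values * values
numpy_seconds = time.perf_counter() - start

print("python loop: %.6f s" % loop_seconds)
print("numpy      : %.6f s" % numpy_seconds)
```

Both approaches produce the same numbers; only the time taken differs.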
Importing NumPy
NumPy is not part of the basic python distribution. If you are not using CASA then you will need to download and install numpy. CASA includes numpy as part of its basic distribution but keeps it as a separate module (similar to the "os" or "math" modules in the basic python distribution).
Once it is correctly installed, you can import numpy and gain access to its functionality in the following way.
import numpy as np
Note that this imports numpy as "np". Had we just typed "import numpy", we would have access to the same functionality but would need to write "numpy" in place of "np" throughout.
Now we can use numpy functions like so:
print( np.arange(10) )
"arange" is analogous to the basic python "range" but generates a numpy array.
Just to see how these arrays work, type:
print( np.arange(10) + 5 )
Notice how the 5 is broadcast to each element in the array. Similarly, we can write a bunch of powers of 2 by:
print( 2**np.arange(10) )
Your First Numpy Array
Before we get going, type np. and hit the tab key. You might be a bit freaked out by the large list (~560 possibilities at time of writing), but it's good to know that's there. Also remember that you can get help on each of these functions. Use ?np.median to bring up the same content as help np.median but with some useful header info.
First, let's make a simple numpy array and figure out how to get its basic properties. We can do this by hand like so:
ra = np.array([[0,1,2],[3,4,5]])
print(ra)
Or use the arange function and the reshape method to make the same array like so:
ra = np.arange(6).reshape(2,3)
print(ra)
We now have an array stored in the variable "ra". Let's get its basic properties. Again it may be useful to type ra. and tap <tab> first.
Get the shape of the array like so:
ra.shape
And its total number of elements like so:
ra.size
Its number of dimensions:
ra.ndim
Notice that we're not using () here: these are attributes of the array, not methods. ndim and size are ints and shape is a tuple, as we can see from:
type(ra.shape)
Now get the data type of our array
ra.dtype
Notice that numpy has a slightly different nomenclature for its data types than baseline python.
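To see that nomenclature in action, the sketch below builds three small arrays and prints their data types. The array names are made up for this example, and the exact integer width reported (for instance int32 or int64) is an assumption that depends on your platform.

```python
import numpy as np

int_ra = np.array([1, 2, 3])        # built from python ints
float_ra = np.array([1.0, 2.0])     # built from python floats
bool_ra = np.array([True, False])   # built from python bools

# The integer width printed here can differ between platforms.
print(int_ra.dtype)
print(float_ra.dtype)
print(bool_ra.dtype)

# np.issubdtype checks the kind of type without assuming a width.
print(np.issubdtype(int_ra.dtype, np.integer))
print(np.issubdtype(float_ra.dtype, np.floating))
```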
Also notice that some of the loosey-goosey casting from basic python is gone. Let's try to shove a float into our integer array.
print(ra)
ra[0,0] = 5.75
print(ra)
Notice that the fractional part was dropped (5.75 became 5) and the value was stored as an int.
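If you do want to keep the fractional part, one option is to convert the array to a float type first. This is a small sketch using the astype method; the names int_ra and float_ra are just for illustration.

```python
import numpy as np

int_ra = np.arange(6).reshape(2, 3)
int_ra[0, 0] = 5.75          # the fractional part is discarded
print(int_ra[0, 0])

# astype returns a new array of the requested type; int_ra is unchanged.
float_ra = int_ra.astype(float)
float_ra[0, 0] = 5.75        # now the full value is stored
print(float_ra[0, 0])
```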
Now let's do something with our array, say square every element
ra**2
Or add it to itself
ra + ra
Notice that these are element-by-element operations. A single number is broadcast across the whole array in the same way, as we saw before:
5 * ra
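Broadcasting is not limited to single numbers. As a hedged sketch, the example below adds a one-dimensional array with three elements to each row of a 2x3 array; the name row is an assumption made for this example.

```python
import numpy as np

ra = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])      # shape (3,), matches the last axis of ra

# row is added to each of the two rows of ra.
total = ra + row
print(total)
```

This works because the last dimension of the two arrays matches. Shapes that cannot be lined up this way raise a ValueError.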
Array Creation
Now that we've seen some array basics, let's look a bit closer at how we can make them. There's a lot here, we'll just look at a couple of approaches.
We already saw the by-hand approach
ra = np.array([[0,1,2],[3,4,5]])
print(ra)
ra = np.array([0,1,2,3,4,5])
print(ra)
See how the square brackets control the shape? If you forget them entirely you'll get an unfortunate error.
We also saw the arange approach, which generates a sequential array of integers. It takes start, stop, and step as arguments like so:
ra = np.arange(5,10,1)
print(ra)
ra = np.arange(5,10,3)
print(ra)
It stops at the last element under 10
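The short sketch below spells out how the arguments are interpreted, including the defaults used when some of them are left out. The variable names are chosen only for this example.

```python
import numpy as np

stepped = np.arange(5, 10, 3)   # start=5, stop=10, step=3
print(stepped)                  # 5 and 8; the next value, 11, is past the stop

stop_only = np.arange(5)        # a single argument is the stop; start is 0
print(stop_only)

no_step = np.arange(5, 10)      # with two arguments the step is 1
print(no_step)
```

In every case the stop value itself is not included.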
ra = np.arange(5,10.1,1.0)
print(ra)
See how the float step made the array a float?
We can also give an explicit type to most array creation functions like so:
ra = np.arange(0,10,1,dtype=np.float32)
print(ra)
Otherwise it would have been an int array.
There are many ways of doing similar things. A more stable float version of arange is linspace:
ra = np.linspace(1.0,2.0,11)
print(ra)
Which covers 1.0 to 2.0 inclusive with 11 elements.
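To make the difference concrete, here is a minimal sketch comparing the two functions over the same range. With linspace you state how many points you want and both endpoints are included. With arange you state a step, and the number of elements is worked out from (stop - start) / step, so floating point rounding can affect whether a value close to the stop appears.

```python
import numpy as np

# linspace: the number of points is given; both endpoints are included.
lin_ra = np.linspace(1.0, 2.0, 11)
print(lin_ra)
print(len(lin_ra), lin_ra[0], lin_ra[-1])

# arange: the step is given; the length is derived from the range and step.
step_ra = np.arange(1.0, 2.0, 0.1)
print(step_ra)
print(len(step_ra))
```

When the number of elements matters, linspace is usually the safer choice for float ranges.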
We can also make arrays full of ones or zeros by just specifying the shape:
ra = np.zeros((3,3))
print(ra)
ra = np.zeros((3,3,3),dtype=bool)
print(ra)
And we can mimic one array with another
ara = np.zeros((3,3))
print(ara)
bra = np.zeros_like(ara)
print(bra)
bra = np.ones_like(ara)
print(bra)
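The "like" functions copy more than the shape: the new array also takes the data type of the one it mimics unless you ask for something else. A small sketch, with names invented for this example:

```python
import numpy as np

template = np.arange(6, dtype=np.int16).reshape(2, 3)

zeros = np.zeros_like(template)   # same shape and dtype as template
ones = np.ones_like(template)
print(zeros.shape, zeros.dtype)
print(ones.shape, ones.dtype)

# The dtype keyword overrides the inherited type.
float_ones = np.ones_like(template, dtype=float)
print(float_ones.dtype)
```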
Numpy will also implicitly create arrays from array math:
ara = np.ones((3,3))
bra = ara + ara
print(bra)
cra = (ara == ara)
print(cra)
Copies and Views: That Mutable/Immutable Stuff
Numpy arrays are objects and as a result you need to be a bit careful copying them. If you simply do this
dra = ara
it does not make a new array.
In detail try this:
test = np.array([1,2,3,4,5])
b = test
print(b)
test[2] = 6
print(b)
Notice how by changing test, we also change b!
In the above examples, dra just points at ara and b just points at test. The two in fact share the same data, so that changing one changes the other. You can check this via:
dra is ara
b is test
This gets a bit complicated but we would have been okay with:
b = test.copy()
or
dra = ara*1.0
So just be aware that referencing issues can arise if you do not explicitly copy an array.
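The sketch below puts the three cases side by side: a second name for the same array, a slice (which numpy returns as a view that shares the original's data), and an explicit copy. The names alias, view, and separate are made up for this example; np.shares_memory is used here as one way to check whether two arrays share data.

```python
import numpy as np

test = np.array([1, 2, 3, 4, 5])

alias = test             # same object, just a second name
view = test[1:4]         # new array object that shares test's data
separate = test.copy()   # new array object with its own data

test[2] = 6              # change the original

print(alias is test, np.shares_memory(alias, test))
print(view is test, np.shares_memory(view, test))
print(separate is test, np.shares_memory(separate, test))

# The alias and the view see the change; the copy does not.
print(alias[2], view[1], separate[2])
```

Note that "is" only catches the first case. A view is a different object, so checking for shared memory is the more reliable test.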