NumpyBasics

From CASA Guides
Revision as of 15:28, 1 November 2011 by Aleroy (talk | contribs)

Back to the PythonOverview.

Introduction to NumPy

The built-in Python collections leave something to be desired for serious astronomical number crunching. Almost all current astronomical data processing inside Python takes advantage of the NumPy library. This is a third-party package bundled with CASA that allows efficient definition and manipulation of matrices (or images or data cubes). This CASA guide will show you how to import this key module, build an array, do basic math, and cleverly access the contents of the array.

If you have worked with arrays in IDL, MATLAB, or a similar language before, the NumPy approach should be familiar.

Importing NumPy

NumPy is not part of the basic Python distribution; like some parts of the standard library, it lives in a separate module that must be imported before use. Assuming that your paths are all set up correctly, you can import numpy and gain access to its functionality in the following way.

import numpy as np

And now we can begin to use its functions like:

print( np.arange(10) )

And just to see how these arrays work, type:

print( np.arange(10) + 5 )

Notice how the 5 is broadcast to each element in the array. Similarly, we can write out the first ten powers of 2 with:

print( 2**np.arange(10) )
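As a small sketch of what broadcasting saves you, the snippet below compares the NumPy expression with an explicit Python loop. The variable names are illustrative only.

```python
import numpy as np

x = np.arange(10)

# NumPy applies the scalar to every element in one expression.
shifted = x + 5

# The equivalent pure-Python version needs an explicit loop.
shifted_loop = [value + 5 for value in range(10)]

print(shifted)
print(shifted_loop)
```

Both versions hold the same numbers; the NumPy one avoids the loop and returns an array rather than a list.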


Your First Numpy Array

Before we get going, type np. and hit the Tab key. You might be a bit freaked out by the long list of possibilities (563 at the time of writing), but it's good to know it's there. Also remember that ?np.median will bring up the same content as help(np.median), but with some useful header info.

Now let's make a simple NumPy array and figure out how to get its basic properties. We can do this by hand like so:

ra = np.array([[0,1,2],[3,4,5]])
print(ra)

Or use the arange function to make the same array

ra = np.arange(6).reshape(2,3)
print(ra)

Let's get some basic stats. Again, it may be useful to type ra. and press Tab first.

The shape of the array

ra.shape

Its total number of elements

ra.size

The number of dimensions

ra.ndim

Notice that we're not using () here: these are attributes of the array, not methods. ndim and size are ints and shape is a tuple, as we can see from:

type(ra.shape)
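The same attributes work for arrays of any dimension. As a minimal sketch, here they are for a small three-dimensional array standing in for a data cube; the name cube and its dimensions are made up for illustration.

```python
import numpy as np

# A hypothetical small "data cube": 4 planes, each 2 rows by 3 columns.
cube = np.zeros((4, 2, 3))

print(cube.shape)  # (4, 2, 3)
print(cube.size)   # 24
print(cube.ndim)   # 3
```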

Now get the data type

ra.dtype

and notice that numpy has a slightly different nomenclature for the data types.
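To get a feel for the naming, the sketch below prints the dtype of a few small arrays. The array names are illustrative, and the default integer width depends on your platform, so no expected output is given for it.

```python
import numpy as np

int_ra = np.array([1, 2, 3])
float_ra = np.array([1.0, 2.0, 3.0])
bool_ra = np.array([True, False])

# The default integer width is platform dependent (often int64).
print(int_ra.dtype)
print(float_ra.dtype)  # float64
print(bool_ra.dtype)   # bool
```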

Also notice that some of the loosey-goosey casting is gone. Let's try to shove a float into our integer array.

print(ra)
ra[0,0] = 5.75
print(ra)

The value was truncated toward zero and stored as an int.
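A negative value makes the conversion rule easier to see: the fractional part is simply dropped. The sketch below also shows one way to avoid the loss, by converting to a float array first with astype. The name fra is illustrative.

```python
import numpy as np

ra = np.arange(6).reshape(2, 3)

# Assigning floats into an integer array discards the fractional part.
ra[0, 0] = 5.75
ra[0, 1] = -5.75
print(ra[0, 0], ra[0, 1])  # 5 -5

# Converting to a float array first preserves the value.
fra = ra.astype(float)
fra[0, 0] = 5.75
print(fra[0, 0])  # 5.75
```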

Now let's do something with our array, say square every element

ra**2

Or add it to itself

ra + ra

Notice that it's doing element-by-element manipulation. The same is true when multiplying by a scalar:

5 * ra
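This is a real difference from the built-in list, where the same operators mean something else. A short sketch of the contrast, with illustrative names:

```python
import numpy as np

values = [0, 1, 2]
ra = np.array(values)

# On a list, + concatenates and * repeats.
print(values + values)  # [0, 1, 2, 0, 1, 2]
print(2 * values)       # [0, 1, 2, 0, 1, 2]

# On an array, the same operators act element by element.
print(ra + ra)          # [0 2 4]
print(2 * ra)           # [0 2 4]
```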

Array Creation

Now that we've seen some array basics, let's look a bit closer at how we can make them. There's a lot here, so we'll just look at a couple of approaches.

We already saw the by-hand approach

ra = np.array([[0,1,2],[3,4,5]])
print(ra)
ra = np.array([0,1,2,3,4,5])
print(ra)

See how the square brackets control the shape? If you forget them entirely you'll get an unfortunate error.
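The sketch below checks the resulting shapes and shows what happens without any brackets. The exact exception type and message vary between NumPy versions, so both TypeError and ValueError are caught here as an assumption.

```python
import numpy as np

flat = np.array([0, 1, 2, 3, 4, 5])        # one pair of brackets: 1-D
nested = np.array([[0, 1, 2], [3, 4, 5]])  # nested brackets: 2-D
print(flat.shape)    # (6,)
print(nested.shape)  # (2, 3)

# Without brackets the numbers are treated as separate arguments.
try:
    np.array(0, 1, 2, 3, 4, 5)
    failed = False
except (TypeError, ValueError) as err:
    failed = True
    print("error:", err)
```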

We also saw the arange approach, which generates a sequential array of integers. It takes start, stop, and step as arguments, like so:

ra = np.arange(5,10,1)
print(ra)
ra = np.arange(5,10,3)
print(ra)

It stops at the last element under 10, because the stop value is excluded.

ra = np.arange(5,10.1,1.0)
print(ra)

See how the float step made the array a float?

We can also give an explicit type to most array creation functions, like so:

ra = np.arange(0,10,1,dtype=np.float32)
print(ra)

Otherwise it would have been an int.

There are many ways of doing similar things. A more stable float version of arange is linspace:

ra = np.linspace(1.0,2.0,11)
print(ra)

This covers 1.0 to 2.0 inclusive with 11 elements.
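The reason linspace is more predictable is that you give it the number of points rather than a step. With arange and a float step, the number of elements depends on floating-point rounding, so it is worth checking rather than assuming. A minimal sketch, with illustrative names:

```python
import numpy as np

# linspace takes the number of points, so the length is guaranteed
# and both end points are included.
grid = np.linspace(1.0, 2.0, 11)
print(len(grid), grid[0], grid[-1])

# arange takes a step; with a float step the length depends on rounding,
# so check it rather than assume it.
stepped = np.arange(1.0, 2.0, 0.1)
print(len(stepped), stepped[-1])
```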

We can also make arrays full of ones or zeros by just specifying the shape:

ra = np.zeros((3,3))
print(ra)
ra = np.zeros((3,3,3),dtype=bool)  # the built-in bool works in every NumPy version; np.bool is missing from some
print(ra)

And we can mimic one array with another

ara = np.zeros((3,3))
print(ara)
bra = np.zeros_like(ara)
print(bra)
bra = np.ones_like(ara)
print(bra)
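The "like" functions copy the data type as well as the shape. A short sketch, using a float32 template so the inherited type is visible; the names template and copycat are illustrative.

```python
import numpy as np

template = np.zeros((2, 3), dtype=np.float32)
copycat = np.ones_like(template)

print(copycat.shape)  # (2, 3)
print(copycat.dtype)  # float32
```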

NumPy will also implicitly create arrays from array math:

ara = np.ones((3,3))
bra = ara + ara
print(bra)
cra = (ara == ara)
print(cra)
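The type of the new array follows from the operation: arithmetic on floats gives floats, while a comparison gives booleans. A quick sketch to confirm this, with illustrative names:

```python
import numpy as np

ara = np.ones((3, 3))
total = ara + ara     # float array holding 2.0 everywhere
same = (ara == ara)   # boolean array holding True everywhere

print(total.dtype, same.dtype)  # float64 bool
print(same.all())               # True
```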

That Mutable Immutable Stuff

You need to be a bit careful copying arrays. If you simply do this

dra = ara

... it does NOT make a new array.

In detail try this:

test = np.array([1,2,3,4,5])
b = test
print(b)
test[2] = 6
print(b)

Notice how by changing test, we also change b!

In the above examples, dra just points at ara and b just points at test. The two in fact share the same data, so that changing one changes the other. You can check this via:

dra is ara
b is test

This gets a bit complicated but we would have been okay with:

b = test.copy()

or

dra = ara*1.0

So just be aware that referencing issues can arise if you do not explicitly copy an array.
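The whole story can be sketched in a few lines. The names alias and clone are illustrative, and np.shares_memory is used here as one way to ask whether two arrays use the same underlying data.

```python
import numpy as np

test = np.array([1, 2, 3, 4, 5])
alias = test          # second name for the same array
clone = test.copy()   # independent array with its own data

test[2] = 6

print(alias[2])  # 6
print(clone[2])  # 3
print(alias is test, clone is test)   # True False
print(np.shares_memory(test, clone))  # False
```

Only the explicit copy is insulated from later changes to test.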

Math With Numpy

Slicing and Iteration With Numpy