NumpyBasics

From CASA Guides

Revision as of 15:52, 1 November 2011

Back to the PythonOverview.

Introduction to NumPy

The built in python collections leave something to be desired for serious astronomical number crunching. Almost all current astronomical data processing inside python takes advantage of the numpy libraries. These are third party routines packaged with CASA that allow efficient definition and manipulation of matrices (or images or data cubes). This CASAguide will show you how to import this key module, build an array, do basic math, and cleverly access the contents of the array.

An overriding concern to keep in mind as you explore numpy is that numpy is fast and python is slow. What we mean is that an operation carried out with one of numpy's built-in (C) functions will typically run orders of magnitude faster than the same operation written as a python loop. If you have worked with arrays in IDL, MATLAB, or a similar language before, this approach should be familiar.
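As a rough illustration of the speed difference, the sketch below sums the same array with a plain python loop and with numpy's built-in sum. The array length, the helper name python_sum, and the repeat count are arbitrary choices made for this example, and the timings will vary from machine to machine.

```python
import timeit

import numpy as np

# 100000 floats: 0.0, 1.0, ..., 99999.0 (size chosen arbitrarily)
values = np.arange(100000, dtype=np.float64)


def python_sum(arr):
    """Sum the elements one at a time with a python loop."""
    total = 0.0
    for x in arr:
        total += x
    return total


# Both approaches give the same answer...
loop_result = python_sum(values)
numpy_result = values.sum()
print(loop_result, numpy_result)

# ...but the built-in numpy method is usually far quicker.
loop_time = timeit.timeit(lambda: python_sum(values), number=5)
numpy_time = timeit.timeit(lambda: values.sum(), number=5)
print("loop: %.4f s, numpy: %.4f s" % (loop_time, numpy_time))
```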

Importing NumPy

NumPy is not part of the basic python distribution. If you are not using CASA then you will need to download and install numpy. CASA includes numpy as part of its basic distribution but keeps it as a separate module (similar to the "os" or "math" modules in the basic python distribution).

Once it is correctly installed, you can import numpy and gain access to its functionality in the following way.

import numpy as np

Note that this imports numpy as "np". Had we just typed "import numpy", we would have access to the same functionality but would write "numpy" instead of "np" throughout.

Now we can use numpy functions like so:

print( np.arange(10) )

"arange" is analogous to the basic python "range" but generates a numpy array.

Just to see how these arrays work, type:

print( np.arange(10) + 5 )

Notice how the 5 is broadcast to each element in the array. Similarly, we can write a bunch of powers of 2 by:

print( 2**np.arange(10) )

Your First Numpy Array

Before we get going, type np. and hit the tab key. You might be a bit freaked out by the large list (~560 possibilities at time of writing), but it's good to know that's there. Also remember that you can get help on each of these functions. Use ?np.median to bring up the same content as help np.median but with some useful header info.

First, let's make a simple numpy array and figure out how to get its basic properties. We can do this by hand like so:

ra = np.array([[0,1,2],[3,4,5]])
print(ra)

Or use the arange function and the reshape method to make the same array like so:

ra = np.arange(6).reshape(2,3)
print(ra)

We now have an array stored in the variable "ra". Let's get its basic properties. Again it may be useful to type ra. and tap <tab> first.

Get the shape of the array like so:

ra.shape

And its total number of elements like so:

ra.size

Its number of dimensions:

ra.ndim

Notice that we're not using () here: these are attributes of the array, not methods. ndim and size are ints and shape is a tuple, as we can see from:

type(ra.shape)
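Collected in one place, the attribute checks above might look like the following sketch. It only uses the array "ra" built earlier; nothing else is assumed.

```python
import numpy as np

ra = np.arange(6).reshape(2, 3)

# Attributes are read without parentheses.
print(ra.shape, type(ra.shape))  # shape is a tuple
print(ra.size, type(ra.size))    # size is an int
print(ra.ndim, type(ra.ndim))    # ndim is an int

# size is the product of the entries in shape, and ndim is the length of shape.
print(ra.size == ra.shape[0] * ra.shape[1])
print(ra.ndim == len(ra.shape))
```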

Now get the data type of our array:

ra.dtype

Notice that numpy has a slightly different nomenclature for its data types than baseline python.

Also notice that some of the loosey-goosey casting from basic python is gone. Let's try to shove a float into our integer array.

print(ra)
ra[0,0] = 5.75
print(ra)

Notice that it got truncated (the fractional part was dropped) and entered as an int.
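A short sketch of what happens to the fractional part, and of one way to keep it. The negative value and the name "fra" are additions for illustration. With an integer array the value is cut toward zero, which matches flooring only for positive numbers; converting the array to float first keeps the full value.

```python
import numpy as np

ra = np.arange(6).reshape(2, 3)

# Assigning floats into an integer array drops the fractional part.
ra[0, 0] = 5.75    # stored as 5
ra[0, 1] = -5.75   # stored as -5, not -6: the value is cut toward zero
print(ra)

# astype returns a new array with the requested type; the float copy
# can hold 5.75 without losing anything.
fra = ra.astype(float)
fra[0, 0] = 5.75
print(fra)
```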

Now let's do something with our array, say square every element:

ra**2

Or add it to itself:

ra + ra

Notice it's doing element-by-element manipulation. Multiplying by a single number uses the aforementioned broadcasting: the number is applied to every element.

5 * ra
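Broadcasting is not limited to single numbers. As a minimal sketch (the array "row" is made up for this example), a one-dimensional array whose length matches the last axis of "ra" is applied to every row:

```python
import numpy as np

ra = np.arange(6).reshape(2, 3)

# A scalar is broadcast to every element.
scaled = 5 * ra
print(scaled)

# A length-3 array is broadcast across both rows of the (2, 3) array.
row = np.array([10, 20, 30])
shifted = ra + row
print(shifted)
```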

Array Creation

Now that we've seen some array basics, let's look a bit closer at how we can make them. There's a lot here, so we'll just look at a couple of approaches.

We already saw the by-hand approach:

ra = np.array([[0,1,2],[3,4,5]])
print(ra)
ra = np.array([0,1,2,3,4,5])
print(ra)

See how the square brackets control the shape? If you forget them entirely you'll get an unfortunate error.

We also saw the arange approach, which generates a sequential array of integers. It takes start, stop, and step as arguments like so:

ra = np.arange(5,10,1)
print(ra)
ra = np.arange(5,10,3)
print(ra)

It stops at the last element under 10: the stop value itself is excluded.

ra = np.arange(5,10.1,1.0)
print(ra)

See how the float step made the array a float?

We can also give an explicit type to most array creation functions like so:

ra = np.arange(0,10,1,dtype=np.float32)
print(ra)

Otherwise it would have been an int.

There are many ways of doing similar things. A more stable float version of arange is linspace:

ra = np.linspace(1.0,2.0,11)
print(ra)

Which covers 1.0 to 2.0 inclusive with 11 elements.
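To see why linspace is the safer choice for float ranges, compare the two in the sketch below. The particular arange arguments are chosen only for illustration. With a float step, the number of elements arange returns depends on floating-point rounding, so the last element may or may not be the one you expect; linspace is told the number of points directly.

```python
import numpy as np

# arange works out its length from (stop - start) / step, so rounding in
# that division can add or drop an element when the step is a float.
stepped = np.arange(1.0, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the number of points instead, so the length is fixed and
# both endpoints are included by default.
spaced = np.linspace(1.0, 2.0, 11)
print(len(spaced), spaced)

# retstep=True also returns the spacing between points.
spaced, step = np.linspace(1.0, 2.0, 11, retstep=True)
print(step)
```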

We can also make arrays full of ones or zeros by just specifying the shape:

ra = np.zeros((3,3))
print(ra)
ra = np.zeros((3,3,3),dtype=bool)
print(ra)

And we can mimic one array with another:

ara = np.zeros((3,3))
print(ara)
bra = np.zeros_like(ara)
print(bra)
bra = np.ones_like(ara)
print(bra)
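The "_like" functions copy more than the shape. As a small sketch (the name "template" and the int16 type are chosen just for this example), the new array also takes on the data type of the array it mimics:

```python
import numpy as np

# A 2x3 array of 16-bit integers to use as the template.
template = np.arange(6, dtype=np.int16).reshape(2, 3)

zeros = np.zeros_like(template)
ones = np.ones_like(template)

# Both new arrays match the template in shape and in data type.
print(zeros.shape, zeros.dtype)
print(ones.shape, ones.dtype)
```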

Numpy will also implicitly create arrays from array math:

ara = np.ones((3,3))
bra = ara + ara
print(bra)
cra = (ara == ara)
print(cra)

That Mutable Immutable Stuff

You need to be a bit careful copying arrays. If you simply do this:

dra = ara

... it does NOT make a new array.

In detail try this:

test = np.array([1,2,3,4,5])
b = test
print(b)
test[2] = 6
print(b)

Notice how by changing test, we also change b!

In the above examples, dra just points at ara and b just points at test. The two in fact share the same data, so that changing one changes the other. You can check this via:

dra is ara
b is test

This gets a bit complicated but we would have been okay with:

b = test.copy()

or

dra = ara*1.0

So just be aware that plain assignment only creates a second name for the same array. If you do not explicitly copy an array, changes made through one name will show up under the other.
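The difference between a second reference and a real copy can be checked directly. The sketch below uses made-up names ("alias" and "dup") and relies on np.shares_memory, which is available in reasonably recent numpy releases but may be missing from very old ones.

```python
import numpy as np

test = np.array([1, 2, 3, 4, 5])

alias = test        # a second name for the same array
dup = test.copy()   # an independent copy with its own data

test[2] = 6

print(alias)  # shows the change made through "test"
print(dup)    # still holds the original values

# "is" checks whether two names refer to the same object;
# shares_memory checks whether two arrays use the same data.
print(alias is test, np.shares_memory(alias, test))
print(dup is test, np.shares_memory(dup, test))
```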

Math With Numpy

Slicing and Iteration With Numpy