Numpy

Authors: Tom Dunham
Date: 2009-04-05

numpy == Numerical Python. A library that provides the "n-dimensional array":

N-dimensional array

>>> from numpy import array
>>> b = array([[1,2,3], [4,5,6], [7,8,9], [10,11,12]])
>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

Shape

Arrays have a shape:

>>> b.shape
(4, 3)

This array is four units long in the 1st dimension (rows), and three units long in the 2nd dimension (columns).

If the array had more dimensions, the tuple would be longer: a shape of (2,4,3) would be 2 "deep", 4 "tall" and 3 "long".

An array with that shape:

>>> import numpy as np
>>> ex = np.arange(1, 25).reshape((2,4,3))
>>> ex
array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18],
        [19, 20, 21],
        [22, 23, 24]]])
>>> ex.shape
(2, 4, 3)
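
As a small sketch of how the shape tuple relates to the rest of the array, the snippet below reads the ndim and size attributes alongside shape. It rebuilds the same ex array as above; nothing else is assumed.

```python
import numpy as np

# Same array as the example above: 2 blocks, each 4 rows by 3 columns.
ex = np.arange(1, 25).reshape((2, 4, 3))

shape = ex.shape   # one entry per dimension
ndim = ex.ndim     # number of dimensions, equal to len(ex.shape)
size = ex.size     # total number of elements: 2 * 4 * 3

print(shape, ndim, size)
```

The size is always the product of the entries in the shape tuple.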

The arange function creates an array holding a range of numbers (the stop value itself is not included):

>>> np.arange(1, 25)
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,
        14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
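
A few more arange calls, as an illustrative sketch: the variable names are made up for this example, and the calls follow the same start, stop, step pattern as Python's built-in range.

```python
import numpy as np

# The stop value is excluded, as with Python's built-in range().
one_to_five = np.arange(1, 6)      # 1, 2, 3, 4, 5

# With a single argument the range starts at zero.
zero_to_three = np.arange(4)       # 0, 1, 2, 3

# An optional third argument sets the step between values.
evens = np.arange(0, 10, 2)        # 0, 2, 4, 6, 8

print(one_to_five, zero_to_three, evens)
```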

N-dimensional array

Multi-dimensional arrays need an index in each dimension to specify a single value:

>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
>>> b[1][1]
5
>>> b[1,1]
5
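
The two spellings above give the same value but get there differently. A minimal sketch, using the same b array (the names row, two_step and one_step are only for illustration):

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

row = b[1]           # a single index selects a whole row: [4, 5, 6]
two_step = b[1][1]   # index the row, then index into that row
one_step = b[1, 1]   # index both dimensions at once

print(row, two_step, one_step)
```

The b[1,1] form is the usual choice with arrays, since it also accepts a slice in each position.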

Slicing

>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
>>> b[1,1]
5
>>> b[1:3,1]
array([5, 8])

Slicing

>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
>>> b[1:3,1:3]
array([[5, 6],
       [8, 9]])
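
A bare colon in a slice means "everything in this dimension", which gives a short way to pull out whole rows or columns. A small sketch with the same b array (variable names are illustrative only):

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

second_column = b[:, 1]   # every row, column 1
second_row = b[1, :]      # row 1, every column
first_two_rows = b[:2]    # rows 0 and 1, all columns

print(second_column)      # [ 2  5  8 11]
print(second_row)         # [4 5 6]
print(first_two_rows)
```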

Creating arrays

You can create arrays in several ways:

List->Array

>>> array([[1,2,3], [4,5,6]])
array([[1, 2, 3],
       [4, 5, 6]])

Arrays are homogeneous:

>>> array([[1.0,2,3], [4,5,6]])
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
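
One way to see what "homogeneous" means is to look at the dtype attribute, which records the single element type shared by the whole array. A short sketch (the names ints and mixed are made up for this example; the exact integer type printed depends on the platform):

```python
import numpy as np

ints = np.array([[1, 2, 3], [4, 5, 6]])
mixed = np.array([[1.0, 2, 3], [4, 5, 6]])

# One float in the input is enough to make every element a float.
print(ints.dtype)    # an integer type; the width is platform dependent
print(mixed.dtype)   # float64

all_ints = np.issubdtype(ints.dtype, np.integer)
promoted = mixed.dtype == np.float64
print(all_ints, promoted)
```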

Arange

>>> from numpy import arange
>>> arange(0,12)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> arange(0,12).reshape(4, 3)
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

Notice that every item in the range must appear in the final array. The shape of this array is 4 by 3, 4*3 == 12, and arange(0,12) produces exactly 12 items (0 to 11), so it fits. A range that doesn't "fit" is an error:

>>> arange(1,12).reshape(4, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
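
The "fit" rule can be checked before calling reshape. The sketch below uses a hypothetical helper called fits (not part of numpy) that compares the number of items with the product of the target shape:

```python
import numpy as np

def fits(values, shape):
    """Hypothetical helper: True if `values` has exactly as many
    items as an array of the given shape needs."""
    return values.size == int(np.prod(shape))

twelve = np.arange(0, 12)   # 12 items: 0 to 11
eleven = np.arange(1, 12)   # 11 items: 1 to 11

print(fits(twelve, (4, 3)))   # True
print(fits(eleven, (4, 3)))   # False

# reshape raises ValueError when the sizes do not match.
try:
    eleven.reshape(4, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)         # True
```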

Zeros

You can create an array of any shape containing all zeros:

>>> from numpy import zeros
>>> zeros((3,4))
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

Zeros

By default these zeros are represented as floating point numbers. To change the type:

>>> zeros((3,4), dtype="int")
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
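
The type matters once you start storing values in the array. As a minimal sketch (the names float_grid and int_grid are made up here), the same value stored in a float array and in an integer array comes back differently:

```python
import numpy as np

float_grid = np.zeros((2, 2))              # default: floating point
int_grid = np.zeros((2, 2), dtype="int")   # integers, as in the example above

# Store the same value in both arrays.
float_grid[0, 0] = 2.7
int_grid[0, 0] = 2.7    # converted to an integer on assignment

print(float_grid[0, 0])   # 2.7
print(int_grid[0, 0])     # 2
```

The fractional part is dropped without any warning, so choose the type with the values you plan to store in mind.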

Exercise

See handout

  1. Create a (one-dimensional) array (called ar) containing the values 10,20,30,40,50,60.

    • What is the array's shape?
    • What is the value of ar[:3]?
    • What is the value of ar[3:]?
    • Use a for loop to double every value in ar in-place.
  2. Create a two-dimensional array that's 20 rows by 10 columns, where the elements count up row-wise. The top and bottom rows will look like this (you fill in the middle rows):

    array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
           [ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
           ...
           [180, 181, 182, 183, 184, 185, 186, 187, 188, 189],
           [190, 191, 192, 193, 194, 195, 196, 197, 198, 199]])
    
  3. Times-tables. Create and print an array containing the times tables from one to twelve. Hint: you can do this by initializing the array and using nested loops to assign every element, or by creating the array from nested lists. If you have time, try both methods (it is possible to do this in one line using list comprehensions).

Slicing

So far, we have done slicing of the form:

myseq[start:stop]

e.g.:

>>> seq = 'AGGAGCAAACTGATGCCCTG'
>>> seq
'AGGAGCAAACTGATGCCCTG'
>>> seq[3:9]
'AGCAAA'

Extended Slicing

Extended slicing adds the step parameter:

myseq[start:stop:step]

e.g.:

>>> seq
'AGGAGCAAACTGATGCCCTG'
>>> seq[3:9:3]
'AA'
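
One way to read an extended slice is as "take the items at the positions range(start, stop, step) would produce". A short sketch checking that against the example above (by_hand is just an illustrative name):

```python
seq = 'AGGAGCAAACTGATGCCCTG'

sliced = seq[3:9:3]

# The same positions, spelled out with range(): 3 and 6.
positions = list(range(3, 9, 3))
by_hand = ''.join(seq[i] for i in positions)

print(positions)          # [3, 6]
print(sliced, by_hand)    # AA AA
```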

Extended slicing

Steps can also go backwards:

>>> seq
'AGGAGCAAACTGATGCCCTG'
>>> seq[::-1]
'GTCCCGTAGTCAAACGAGGA'
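
A negative step can be combined with start and stop as well; the start must then be the larger index, and the stop is still excluded. A minimal sketch (variable names are illustrative only):

```python
seq = 'AGGAGCAAACTGATGCCCTG'

backwards = seq[::-1]   # the whole sequence, reversed

# Walk from index 8 down to index 3 (index 2 is excluded).
part = seq[8:2:-1]

print(part)             # AAACGA
print(seq[3:9][::-1])   # the same letters: slice forwards, then reverse
```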

Multidimensional

Numpy supports n-dimensional extended slicing:

>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
>>> b[1:4, 1:3]
array([[ 5,  6],
       [ 8,  9],
       [11, 12]])

Multidimensional

>>> b
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
>>> b[1:4:2, 1:3:2]
array([[ 5],
       [11]])
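
Each dimension is sliced independently, so the result above can be built in two stages: first pick the rows, then pick the columns. A small sketch with the same b array (rows and same are made-up names):

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

stepped = b[1:4:2, 1:3:2]   # rows 1 and 3, column 1

rows = b[1:4:2]             # rows 1 and 3: [[4, 5, 6], [10, 11, 12]]
same = rows[:, 1:3:2]       # then column 1 of those rows

print(stepped)
print(same)
```

Both results have shape (2, 1): two rows were selected, and one column.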