Array Indexing with NumPy

Indexing is used to retrieve or change elements of a an array
Slice syntax (start:stop:step) gets a range of elements
Integer and boolean arrays can get an arbitrary set of elements

Introduction to Array Indexing

Indexing is an important feature that allows us to retrieve and reassign specific parts on an array. You probably already know the basics of indexing from Python lists and tuples. You can index into NumPy arrays the same way you index into those sequences but NumPy indexing comes with many extra features we will learn about here. First, lets look at single value indexing.

Single Value Indexing

We can use indexing to get single (scalar) values from an array. Indexing is always done with square brackets and we always start counting at 0.

import numpy as np
arr = np.arange(10,15)
arr
array([10, 11, 12, 13, 14])
arr[0]
10
arr[4]
14

Note that single value indexing does not return an array with a single entry but rather a numpy integer. To get a single value from a multi dimensional array we need to use multiple indices that are separated by commas.

arr = np.arange(20)
arr = arr.reshape((2,2,5))
arr
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]],

       [[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]])
arr[0,0,1]
1
arr[1,0,4]
14

I recommend this way of indexing but you can also use multiple square brackets like you would for Python sequences.

arr
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]],

       [[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]])
arr[0][0][0]
0
arr[1][0][4]
14

We can also use indexing to reassign elements of an array.

arr = np.arange(10,15)
arr
# array([10, 11, 12, 13, 14])
arr[1] = 20
arr
# array([10, 20, 12, 13, 14])

Slice Indexing

To retrieve a single value, our indices need to resolve all dimensions of the array and arrive at a single value. Whenever one dimension remains unspecified, we get an array (array view technically).

arr = np.array([[[ 0,  1,  2,  3,  4],
                 [ 5,  6,  7,  8,  9]],
                [[10, 11, 12, 13, 14],
                 [15, 16, 17, 18, 19]]])
arr[0, 1]
array([5, 6, 7, 8, 9])
arr[1]
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

To take an entire dimension we can use the colon.

arr = np.array([[[ 0,  1,  2,  3,  4],
                 [ 5,  6,  7,  8,  9]],
                [[10, 11, 12, 13, 14],
                 [15, 16, 17, 18, 19]]])
arr[0, :, 0]
array([0, 5])
arr[:, 0, 0]
array([ 0, 10])

The colon is very useful for indexing in general, because it allows us to take a slice of values instead of a single value. The syntax of the slice follows start:stop:step. If we leave out start, the slice starts at 0. If we leave out stop, it goes to the end of the dimension. If we leave out step, the step defaults to 1.

arr = np.array([[[ 0,  1,  2,  3,  4],
                 [ 5,  6,  7,  8,  9]],
                [[10, 11, 12, 13, 14],
                 [15, 16, 17, 18, 19]]])
arr[0, 0, 1:5:2]
# array([1, 3])
arr[0, 0, 1:4]
# array([1, 2, 3])
arr[0, 0, 1:]
# array([1, 2, 3, 4])
arr[0, 0, :3]
# array([0, 1, 2])
arr[0, 0, :]
# array([0, 1, 2, 3, 4])

Index Array

So far we learned that we can use integers and slices for indexing. Now we learn that we can also use arrays to index into an array. When we use an array to index, that array has to either contain integers or boolean values. Lets take a look at integer array indexing first.

arr = np.arange(10,50,3)
idc = np.arange(5)
idc.dtype
dtype('int32')
arr[idc]
array([10, 13, 16, 19, 22])
idc = np.arange(5,8)
arr[idc]
array([25, 28, 31])
idc = np.array([1,2,4])
arr[idc]
array([13, 16, 22])

Note that in the examples where we generate index arrays with arange, we could achieve the same result with a slice as shown above and save one line of code. Integer arrays are most useful when they are generated by a process that is more complicated than the arange method. One example is the np.argwhere method we will learn more about in a later post.

Boolean Array

Boolean arrays also deserve at least one post of their own but here I will give you a teaser. We only want to retrieve those values, that satisfy a larger than condition.

arr = np.array([[[ 0,  1,  2,  3,  4],
                 [10, 11, 12, 13, 14]],
                 [[5,  6,  7,  8,  9],
                 [15, 16, 17, 18, 19]]])
boolean_idc = arr > 10
boolean_idc
array([[[False, False, False, False, False],
        [False,  True,  True,  True,  True]],

       [[False, False, False, False, False],
        [ True,  True,  True,  True,  True]]])
arr[boolean_idc]
array([11, 12, 13, 14, 15, 16, 17, 18, 19])

Summary

We learned that indexing is useful to retrieve values and reassign parts of an array. There are several ways to index. First, we can use single integers to get to an element of a certain dimension. We can also use slices with the colon syntax start:stop:step to get at a sequence of elements. Furthermore, there are two advances indexing techniques, where we can use arrays containing integers or booleans to find an arbitrary collection of elements.

Simulation-Based