- Broadcasting is triggered when an arithmetic operation is performed on two arrays of different shapes
- The goal of broadcasting is to make both arrays the same shape by performing transformations on the shape of the array with fewer dimensions
- Once the arrays have the same shape, the operation is applied element-wise
- If the arrays cannot be broadcast, an error is raised
When we add two arrays together with the plus operator, addition is performed element-wise. That means each element is added to the corresponding element in the other array. However, this only works directly when both arrays have the same shape. If two arrays have different shapes, a process called broadcasting tries to resolve the difference by performing a series of transformations on the shape of the array with lower dimensionality. To understand broadcasting, we need to understand the steps it performs. Let's look at a quick example.

    import numpy as np

    arr_one = np.array([[4, 3, 2, 5, 6, 2],
                        [30, 34, 1, 50, 60, 56],
                        [22, 34, 32, 21, 12, 6]])
    arr_two = np.array([1, 10, 20, 30, 40, 50])

    arr_one.shape
    # (3, 6)
    arr_two.shape
    # (6,)

    arrs_plus = arr_one + arr_two
    arrs_plus
    # array([[  5,  13,  22,  35,  46,  52],
    #        [ 31,  44,  21,  80, 100, 106],
    #        [ 23,  44,  52,  51,  52,  56]])

This one works even though the two arrays have different shapes and even different numbers of elements. The next one does not work under seemingly similar circumstances.

    arr_one = np.array([[4, 3, 2, 5, 6, 2],
                        [30, 34, 1, 50, 60, 56],
                        [22, 34, 32, 21, 12, 6]])
    arr_two = np.array([4, 40, 20])

    arr_one.shape
    # (3, 6)
    arr_two.shape
    # (3,)

    arrs_plus = arr_one + arr_two
    # ValueError: operands could not be broadcast together with shapes (3,6) (3,)

What happened here? In the first example we add an array of shape (6,) to an array of shape (3, 6) and it works. In the second example we add an array of shape (3,) to a (3, 6) array and get an error. In both examples the arrays have different shapes, so broadcasting is triggered. To understand what happens, we need to understand the broadcasting sequence. Let's first work through the working example.

    # Broadcasting rules in order
    """
    Rule #1: The array with fewer dimensions is broadcast to match

    Rule #2: Array shapes are aligned to the right.
        (3, 6)
        (   6,)

    Rule #3: Each pair of aligned dimensions must be equal, or one of them must be 1.
        Otherwise broadcasting fails with a ValueError.
        Here 6 is equal to 6, so we don't get an error and continue.

    Rule #4: Array dimensions are expanded in the leftward direction.
        (3, 6)
        (1, 6)

    Rule #5: Array dimensions of size 1 are duplicated to match.
        (3, 6)
        (3, 6)

    Done. The array operation can now be executed element-wise.
    """

These are the five broadcasting rules, applied in order. They explain why an operation between a (3,) array and a (3, 6) array fails. After aligning both shapes to the right, we compare the trailing dimensions, 3 and 6. They are not equal and neither of them is 1, so rule #3 is violated and we get the ValueError. There are several ways to make this operation work. However, it is best to first understand array indexing before delving into those. We will learn about array indexing in the next post.
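The rules above can also be written down as a small function that works on shape tuples only. This is a minimal sketch for illustration, not NumPy's actual implementation: `broadcast_shape` is a hypothetical helper name, and it pads the shorter shape with 1s before checking the dimensions, which gives the same result as checking first. The comparison against `np.broadcast_shapes` assumes NumPy 1.20 or newer.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shape tuples."""
    # Right-align the shapes by padding the shorter one with 1s on the left.
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        # Each pair of dimensions must be equal, or one of them must be 1.
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"cannot broadcast shapes {shape_a} and {shape_b}")
        # A dimension of size 1 is stretched to the size of its partner.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)


# The working example: (3, 6) and (6,)
working = broadcast_shape((3, 6), (6,))
print(working)
print(working == np.broadcast_shapes((3, 6), (6,)))

# The failing example: (3, 6) and (3,)
try:
    broadcast_shape((3, 6), (3,))
    failed = False
except ValueError as err:
    failed = True
    print(err)
```

For the first pair the helper returns (3, 6), matching NumPy. For the second pair it raises a ValueError, because the aligned dimensions 6 and 3 differ and neither is 1.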
If it weren’t for broadcasting, we would have to manually convert the shapes of arrays so that they are equal before we could perform arithmetic operations on them. Luckily, broadcasting happens automatically whenever we perform an operation on two arrays of different shapes. It tries to resolve the difference in shape by performing a series of steps on the array with fewer dimensions. When broadcasting finishes successfully, the operation is performed element-wise. If broadcasting fails, an error is raised.
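To get a feel for what that manual conversion might look like, here is a rough sketch using the arrays from the first example. It uses `np.tile` to copy `arr_two` into a (3, 6) array by hand and compares the result with plain broadcasting. The variable names `arr_two_tiled`, `manual_sum` and `broadcast_sum` are only chosen for this illustration.

```python
import numpy as np

arr_one = np.array([[4, 3, 2, 5, 6, 2],
                    [30, 34, 1, 50, 60, 56],
                    [22, 34, 32, 21, 12, 6]])
arr_two = np.array([1, 10, 20, 30, 40, 50])

# Manual route: repeat arr_two three times along a new first axis.
arr_two_tiled = np.tile(arr_two, (3, 1))
manual_sum = arr_one + arr_two_tiled

# Broadcasting route: NumPy resolves the shapes for us.
broadcast_sum = arr_one + arr_two

print(arr_two_tiled.shape)
print(np.array_equal(manual_sum, broadcast_sum))
```

Both routes give the same (3, 6) result. The difference is that the manual route builds a full copy of the repeated data, while broadcasting does the stretching without us having to write any of it.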