Smoothing Data by Rolling Average with NumPy

Time series data often comes with some amount of noise. One of the easiest ways to get rid of noise is to smooth the data with a simple uniform kernel, also called a rolling average. The title image shows data and their smoothed version. The data is the second discrete derivative from the recording of a neuronal action potential. Derivatives are notoriously noisy. We can get the result shown in the title image with np.convolve

import numpy as np

data = np.load("example_data.npy")
kernel_size = 10
kernel = np.ones(kernel_size) / kernel_size
data_convolved = np.convolve(data, kernel, mode='same')

Convolution is a mathematical operation that combines two arrays. One of those arrays is our data and we convolve it with the kernel array. During convolution we center the kernel at a data point. We multiple each data point in the kernel with each corresponding data point, sum up all the results and that is the new data point at the center. Let’s look at an example.

data = [2, 3, 1, 4, 1]
kernel = [1, 2, 3, 4]
np.convolve(data, kernel)
# array([ 2,  7, 13, 23, 24, 18, 19,  4])

For this result to make sense you must know, that np.convolve flips the kernel around. So step by step the calculations go as follows:

[4, 3, 2, 1]  # The flipped kernel
          x
         [2, 3, 1, 4, 1]  # The data
          2= 2
         [2]

   [4, 3, 2, 1]
          x  x
         [2, 3, 1, 4, 1]
          4+ 3= 7
         [2, 7]

      [4, 3, 2, 1]
          x  x  x
         [2, 3, 1, 4, 1]
          6+ 6+ 1=13
         [2, 7, 13]

         [4, 3, 2, 1]
          x  x  x  x
         [2, 3, 1, 4, 1]
          8+ 9+ 2+ 4= 23
         [2, 7,13,23]
# This continues until the arrays stop touching. You get the idea.

One thing you’ll notice is that the edges are problematic. There is really no good way to avoid that. Data points at the edges only see part of the kernel but the mode parameter defines what should happen at the edges. I prefer the 'same' mode because it means that the new array will have the same shape as the original data, which makes plotting easier. However, if you start to use more complicated kernels, the edges might become virtually useless. In that case, mode should be 'valid'. Then, the values at the edges that did not see the entire kernel are discarded. The output array is smaller in shape than the input array.

data = [2, 3, 1, 4, 1]
kernel = [1, 2, 3, 4]
np.convolve(data, kernel, mode='valid')
array([23, 24])

The default behavior you saw above is called 'full'. It keeps all data points, so the output array is larger in shape than the input array. You might also have noticed that the size of the kernel is very important. Actually, we need to divide the array of ones by its length. Can you guess what would happen if we forgot about dividing it?

If you guessed that the signal would become larger in magnitude you guessed right. We would be summing up all data points in the kernel. By dividing it we ensure that we take the average of the data points. But the kernel size is even more important. If we make the kernel larger the outcome changes dramatically.

kernel_size = 10
kernel = np.ones(kernel_size) / kernel_size
data_convolved_10 = np.convolve(data, kernel, mode='same')

kernel_size = 20
kernel = np.ones(kernel_size) / kernel_size
data_convolved_20 = np.convolve(data, kernel, mode='same')

plt.plot(data_convolved_20)
plt.plot(data_convolved_10)
plt.legend(("Kernel Size 10", "Kernel Size 20"))

The larger we make the kernel, the smaller sharp peaks become. The peaks are also shifted in time. To be specific, a rolling mean is a low-pass filter. This means that is leaves low frequency signals alone, while making high frequency signals smaller. Sharp increases in the data have a high frequency. If we make the kernel larger, the filter attenuates high frequency signals more. This is exactly how the rolling average works. It gets rid of high frequency noise. It also means that we must be careful not to distort the signal too much with the rolling average filter.