An image says more than a thousand words but histograms are also very important. Digital images are made of pixels and each of them has a value. A histogram tells us how many pixels of the image have a certain value. The title plot shows Chelsea the cat and the histograms for each color channel. Here is the code that generated the figure.
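import numpy as np
import skimage.data  # import the data submodule explicitly so skimage.data is available
import matplotlib.pyplot as plt

# Load the example image and split it into its three color channels
image = skimage.data.chelsea()
image_red, image_green, image_blue = image[:,:,0], image[:,:,1], image[:,:,2]

# Top row: each channel shown as a grayscale image
fig, ax = plt.subplots(2,3)
ax[0,0].imshow(image_red, cmap='gray')
ax[0,1].imshow(image_green, cmap='gray')
ax[0,2].imshow(image_blue, cmap='gray')

# Bottom row: one histogram per channel, with one bin centered on each integer value 0..255
bins = np.arange(-0.5, 255+1,1)
ax[1,0].hist(image_red.flatten(), bins=bins, color='r')
ax[1,1].hist(image_green.flatten(), bins=bins, color='g')
ax[1,2].hist(image_blue.flatten(), bins=bins, color='b')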
Because Chelsea is part of the scikit-image example data, we can simply load it with skimage.data.chelsea(). With image.shape we can find out that our image has three dimensions. The first two are the y and x coordinates, whereas the third one holds the color channels red, green and blue (RGB). We split the channels into their own variables, show each of them as a grayscale image, and plot its histogram below it. Here is a shorter version of the above code that uses some slightly more advanced Python features.
fig, ax = plt.subplots(2,3)
bins = np.arange(-0.5, 255+1,1)
for ci, c in enumerate('rgb'):
    ax[0,ci].imshow(image[:,:,ci], cmap='gray')
    ax[1,ci].hist(image[:,:,ci].flatten(), bins=bins, color=c)
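To make the (y, x, c) layout concrete, here is a minimal sketch of how the shape and the channel indexing behave. It uses a small random array as a stand-in for a real photo, so the name toy and its size are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical stand-in for an RGB image: 4 rows (y), 6 columns (x), 3 channels.
# This is NOT the Chelsea image, just random 8-bit values.
rng = np.random.default_rng(0)
toy = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

height, width, n_channels = toy.shape
print(toy.shape)  # (4, 6, 3)

# Indexing the last axis with 0 selects the red channel as a 2D array.
red = toy[:, :, 0]
print(red.shape)  # (4, 6)

# The ellipsis form selects the same data and works for any number of leading axes.
print(np.array_equal(red, toy[..., 0]))  # True
```

Each channel is a plain 2D array, which is why imshow can display it as a grayscale image.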
We can see from the histograms and the grayscale images that Chelsea is slightly more red than blue or green. But how can we get more quantitative information out of the histogram? We can use np.histogram and the usual numpy functions to learn more about the properties of our histograms.
hist_red = np.histogram(image_red.flatten(), bins=bins)
hist_red[0].argmax()  # 156
The np.histogram function gives us a tuple, where the first entry holds the counts and the second entry holds the bin edges. This is the reason we have to index into hist_red, as in hist_red[0], to call .argmax() on the correct array.
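The tuple that np.histogram returns can also be unpacked into two names, which avoids the index. The sketch below uses a handful of made-up values instead of the image, so values, counts and edges are illustrative names only.

```python
import numpy as np

# Made-up intensity values, not taken from the Chelsea image.
values = np.array([0, 1, 1, 2, 2, 2, 255])

# Same bin definition as in the tutorial: edges at -0.5, 0.5, ..., 255.5.
bins = np.arange(-0.5, 255 + 1, 1)

# Unpack the returned tuple into the counts and the bin edges.
counts, edges = np.histogram(values, bins=bins)

# There is always one more edge than there are bins.
print(len(counts), len(edges))  # 256 257

# The value 2 occurs three times, more often than any other value.
print(counts.argmax())  # 2
```

Because every bin is centered on an integer, the index returned by argmax() is also the intensity value of the peak.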
.argmax() tells us that the peak of the histogram is at bin 156. Because our bins are centered on the integers 0 to 255, bin 156 corresponds to an intensity value of 156, so 156 is the most common intensity in the red channel. The peak can be deceiving, especially when the distribution is skewed or multi-modal, but for this tutorial we will accept it as a first pass. Let’s see how the other channels look.
hist_red = np.histogram(image_red.flatten(), bins=bins)
hist_green = np.histogram(image_green.flatten(), bins=bins)
hist_blue = np.histogram(image_blue.flatten(), bins=bins)
# Index [0] selects the counts array of each (counts, bin_edges) tuple
print(hist_red[0].argmax(), hist_green[0].argmax(), hist_blue[0].argmax())  # 156 116 97
As our eyes suspected, the green and blue channels have peaks at smaller intensity values than the red channel. This supports our suspicion that Chelsea probably is a red cat. I hope this tutorial has been helpful to get you started with scikit-image. We learned that RGB images come in an array of shape (y, x, c), where c is the color channel. We can use plt.hist() to calculate and plot the histogram and np.histogram() to calculate the histogram without plotting.
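As a follow-up to the caveat that a peak can be deceiving for multi-modal distributions, here is a rough sketch that compares the peak with the mean and the median, all computed from the histogram counts. The data are synthetic and chosen to be bimodal on purpose; they are an assumption for illustration and say nothing about the Chelsea image.

```python
import numpy as np

# Synthetic bimodal data: one tall narrow peak at 50, and a broader
# group around 200 that contains more values in total.
values = np.concatenate([
    np.full(300, 50),
    np.full(250, 200),
    np.full(250, 201),
])

bins = np.arange(-0.5, 255 + 1, 1)
counts, edges = np.histogram(values, bins=bins)

# Bin centers are the integers 0..255 (as floats).
centers = (edges[:-1] + edges[1:]) / 2

# Peak: center of the bin with the highest count.
peak = centers[counts.argmax()]

# Mean: bin centers weighted by their counts.
mean = np.average(centers, weights=counts)

# Median: first bin where the cumulative fraction reaches 0.5.
cdf = np.cumsum(counts) / counts.sum()
median = centers[np.searchsorted(cdf, 0.5)]

print(peak, median)  # 50.0 200.0
print(mean)
```

Here the peak sits at 50 even though 500 of the 800 values lie around 200, which is why it can be worth looking at the median or mean next to argmax().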