Image Processing II

(c) 2022 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This document was prepared at Caltech with financial support from the Donna and Benjamin M. Rosen Bioengineering Center.

This tutorial was generated from a Jupyter notebook. You can download the notebook here.


Segmentation

Segmentation is the process by which we separate regions of an image according to their identity for easier analysis. E.g., if we have an image of bacteria and we want to determine what is "bacteria" and what is "not bacteria," we would do some segmentation. We will use bacterial test images for this purpose.

For our segmentation, we will use the phase and CFP images of bacteria from the first tutorial. To remind ourselves of the images, let's take a look at the phase image.

Histograms

As we begin segmentation, remember that viewing an image is just a way of plotting the digital image data. We can also plot a histogram. This helps us see some patterns in the pixel values and is often an important first step toward segmentation.

The histogram of an image is simply a list of counts of pixel values. When we plot the histogram, we can often readily see breaks in which pixel values are most frequently encountered. There are many ways of looking at histograms. I'll show you my preferred way. A function for making this plot is included in the bi1x module as bi1x.viz.im_hist(), but I show how to make the plot here so you can see how to use scikit-image to compute the histogram.
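As a minimal sketch of the histogram computation (not the bi1x implementation), the snippet below uses skimage.exposure.histogram() on a small synthetic 8-bit image. The array im and its pixel values are made up for illustration and stand in for the real phase image.

```python
import numpy as np
import skimage.exposure

# Synthetic stand-in for the phase image (an assumption, not the real data):
# bright background (200) with a darker square "bacterium" (100)
im = np.full((50, 50), 200, dtype=np.uint8)
im[20:30, 20:30] = 100

# For integer images, we get one bin per integer value between
# the minimum and maximum pixel values in the image
hist, bin_centers = skimage.exposure.histogram(im)

# The most frequently encountered pixel value
most_common = bin_centers[np.argmax(hist)]
print(most_common)
```

Plotting hist against bin_centers with your plotting library of choice then gives the histogram.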

We see that there are two peaks to the histogram of the phase image. The peak to the right is brighter, so likely represents the background. Therefore, if we can find where the valley between the two peaks is, we may take pixels with intensity below that value to be bacteria and those above to be background. Eyeballing it, I think this critical pixel value is about 182.

Thresholding

The process of taking pixels above or below a certain value is called thresholding. It is one of the simplest ways to segment an image. We call every pixel with a value below 182 part of a bacterium and everything above not part of a bacterium.
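Thresholding amounts to a single comparison on the image array. Here is a sketch on a synthetic image; the array im is an illustrative stand-in for the phase image, and 182 is the eyeballed threshold from above.

```python
import numpy as np

# Synthetic stand-in for the phase image (an assumption, not the real data):
# bright background (200) with a darker square "bacterium" (100)
im = np.full((50, 50), 200, dtype=np.uint8)
im[20:30, 20:30] = 100

# Threshold value eyeballed from the histogram
thresh = 182

# Boolean image: True where the pixel is darker than the threshold,
# which we take to mean "bacterium"
im_bw = im < thresh
```

The result is a Boolean array of the same shape as the image, which can be displayed like any other image.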

We can overlay these images to get a good view. To do this, we will make an RGB image, and saturate the green channel where the thresholded image is white.
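One way to build such an overlay is sketched below with NumPy only. The synthetic image, the scaling by 255, and the variable names are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for the phase image and its thresholded version
im = np.full((50, 50), 200, dtype=np.uint8)
im[20:30, 20:30] = 100
im_bw = im < 182

# Scale the 8-bit image to floats between 0 and 1
im_float = im / 255

# Stack three copies to make a grayscale image in RGB form
rgb = np.dstack((im_float, im_float, im_float))

# Saturate the green channel wherever the thresholded image is True
rgb[im_bw, 1] = 1.0
```

Pixels outside the thresholded region keep their gray value, so the original image remains visible underneath the green labels.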

We see that we did a decent job finding bacteria, but we do not effectively label the bacteria in the middle of colonies. This is because of the "halo" of high intensity signal near boundaries of the bacteria that we get from using phase contrast microscopy.

Using the CFP channel

One way around this issue is to use bacteria that constitutively express a fluorescent protein and to segment using the fluorescence channel. Let's try the same procedure with the CFP channel. First, let's look at the image.

We see that the bacteria are typically brighter than the background, so this might help us in segmentation.

Filtering noise: the median filter

It is strange that there do not appear to be any yellow (indicating high intensity) pixels in the display of the CFP image with the Viridis LUT. This is because there are some "hot" pixels in the image, resulting from noise or some other error in the detector. We can see this if we zoom in on one of the bad pixels.

We see a single bright pixel. This will throw off our lookup table. We can remove this noise by using a median filter. The concept is simple. We take a shape of pixels, called a structuring element, and pass it over the image. The value of the pixel at the center of the structuring element is replaced by the median value of all pixels within the structuring element. To do this, we first need to construct the structuring element as a mask. This is done using the skimage.morphology module. The filtering is then done using skimage.filters.median().

Let's try it with a 3$\times$3 square mask.
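Here is a sketch of the median filter acting on a tiny synthetic image with one hot pixel. The helper functions in skimage.morphology for building masks have different names across scikit-image versions, so as an assumption for portability this sketch builds the 3×3 square mask directly as a NumPy array of ones. The image values are made up.

```python
import numpy as np
import skimage.filters

# Synthetic image (an assumption): uniform value of 10 with one "hot" pixel
im = np.full((5, 5), 10, dtype=np.uint8)
im[2, 2] = 255

# 3x3 square mask (structuring element), built by hand as an array of ones
footprint = np.ones((3, 3), dtype=np.uint8)

# Replace each pixel by the median of its 3x3 neighborhood
im_filt = skimage.filters.median(im, footprint)
```

Because the hot pixel is a single outlier within every 3×3 neighborhood that contains it, the median ignores it and the filtered image is uniform.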

Having dealt with the noisy pixels, we can see more clearly that some cells are very bright (shown in yellow) compared with others. With the noise eliminated, the image also makes more sense.

It is important to note that several cameras in the Bi 1x lab, especially the Flir cameras, have many hot pixels, and median filtering is an important step in processing those images.

Thresholding in the CFP channel

We'll proceed by plotting the histogram and finding the threshold value. Eyeballing it, I get a threshold value of 196.

Now let's try thresholding this image.

Looks like we're doing much better! Let's try overlaying the images now.

Much better, though we see that we overcount the spaces between bacteria.

Otsu's method for thresholding

There are many automated ways to find the threshold value, as opposed to eyeballing it like we have been doing. Otsu's method is one of them. It chooses the threshold that minimizes the variance of pixel values within each of the two resulting classes.
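A sketch of automated thresholding with skimage.filters.threshold_otsu() follows. The image is synthetic, with one dark and one bright population of pixels drawn from made-up Normal distributions, so it is a clean bimodal case rather than real data.

```python
import numpy as np
import skimage.filters

# Synthetic bimodal image (an assumption, not the real data):
# 2000 dark pixels near 60 and 2000 bright pixels near 190
rng = np.random.default_rng(seed=0)
dark = rng.normal(60, 5, size=2000)
bright = rng.normal(190, 5, size=2000)
im = np.clip(np.concatenate((dark, bright)), 0, 255)
im = im.astype(np.uint8).reshape(50, 80)

# Compute the Otsu threshold
thresh = skimage.filters.threshold_otsu(im)

# In a fluorescence image, bacteria are brighter than background
im_bw = im > thresh
```

For a phase image, where bacteria are darker than the background, the comparison would be im < thresh instead.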

We see that for the CFP channel, the Otsu method did very well. However, for phase, we see a big difference. This is because the Otsu method assumes a bimodal distribution of pixels. If we look at the histograms on a log scale, we see more clearly that the phase image has a long tail, which will trip up the Otsu algorithm. The moral of the story is that you can use automated thresholding, but you should always do sanity checks to make sure it is working as expected.

Determining the bacterial area

Now that we have a thresholded image, we can determine the total area taken up by bacteria. It's as simple as summing up the pixel values of the thresholded image!

For growth curves, we really only need the pixel values, since we are trying to get a time constant, which is independent of the unit of measure for bacterial numbers. Nonetheless, if we want to get the total area that is bacterial in units of square µm, we could use the interpixel distances to get the area represented by each pixel. For this setup, the interpixel distance is 0.0636 µm. We can then compute the bacterial area as follows.
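The area calculation can be sketched as follows. The Boolean image here is synthetic and stands in for the real thresholded image; the interpixel distance is the value quoted above.

```python
import numpy as np

# Synthetic thresholded image (an assumption): a 10x10 block of "bacteria"
im_bw = np.zeros((50, 50), dtype=bool)
im_bw[20:30, 20:30] = True

# Interpixel distance in µm, as given in the text
interpixel_distance = 0.0636

# Bacterial area in pixels: True counts as 1 when summing
n_pixels = im_bw.sum()

# Each pixel represents a square of side interpixel_distance,
# so this is the bacterial area in square µm
area = n_pixels * interpixel_distance**2
```

For a growth curve, n_pixels alone suffices, since the conversion factor only rescales the curve and does not change the time constant.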

Computing environment