Image Processing I¶

(c) 2024 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This document was prepared at Caltech with support financial support from the Donna and Benjamin M. Rosen Bioengineering Center.

No description has been provided for this image

This tutorial was generated from an Jupyter notebook. You can download the notebook here.


In this tutorial, we will learn some basic techniques for image processing using scikit-image with Python. As usual, we will begin by importing the modules we will use.

In [1]:
import numpy as np

import skimage.io
import skimage.exposure
import skimage.filters
import skimage.morphology

import bokeh.io
import bokeh.plotting

import colorcet

import bi1x

bokeh.io.output_notebook()
Loading BokehJS ...

Image processing tools for Python¶

There are many image processing tools available for Python. Some of them, such as ITK and OpenCV are mature image processing packages that have bindings for Python, allowing easy use of their functionality. Others were developed specifically for Python. Some of the many packages are

  • scikit-image
  • scipy.ndimage
  • Open CV (extensive computer vision package)
  • Cell Profiler (Broad Institute at MIT)
  • Insight Segmentation and Registration Toolkit (ITK, used in medical imaging, supported by the NIH)
  • Fiji and ImageJ support Jython scripting
  • ilastik using machine learning-based approaches to cell segmentation.
  • DeepCell is a deep learning-based package for segmenting images of various cell types developed by the Van Valen lab at Caltech.

The first two packages are standard with Anaconda. They provide a set of basic image processing tools, with more sophisticated packages such as ITK and Fiji supplying many more bells and whistles. If in the future you have more demanding image processing requirements, the other packages can prove very useful.

We will almost exclusively use scikit-image along with the standard tools from NumPy. We will call is skimage for short (which is how the package is imported anyhow). A potential annoyance with skimage is that the main package has minimal functionality, and you must import subpackages as needed. For example, to load and view images, you will need to import skimage.io. Importantly, skimage is well-documented, and you can access the documentation at http://scikit-image.org/.

We will explore skimage’s capabilities and some basic image processing techniques through example. We will start with examining a graticule (stage micrometer), then put together an multi-color image, and finally analyze fluorescence and phase contrast images of growing bacteria.

Calibrating an objective using a graticule¶

You should have downloaded the file 2014-03-24_graticule.tif. We will use this as our graticule sample image. Note that I have given it a descriptive name. A trick I find useful is to have the first 10 characters in the file name be the date. (I often include the time as well.) That way, images you are analyzing appear in chronological order when viewed on your computer. Following the date is a descriptive title for the image. I kept the name short for the purposes of this tutorial, but for my own files, I would name the file

2014-03-24-1456_graticule_10micron_increments_10x_wheels.tif

which specifies that I took this image at 2:56 PM on March 24, 2014 using the 10$\times$ objective on the microscope Wheels, and that the gradations on the graticule are ten microns apart. This is important to know going forward: we are calibrating a 10$\times$ objective using a graticule with 10 µm increments.

Loading and viewing the image¶

We load the image using the skimage.io.imread(). The image is stored as a NumPy array. Each entry in the array is a pixel value. This is an important point: a digital image is data. It is a set of numbers with spatial positions.

In [2]:
# Load images as NumPy array
grat_im = skimage.io.imread('2014-03-24_graticule.tif')

# Show that grat_im is a NumPy array
print('grat_im is a', type(grat_im))

# Show that the data type is unsigned int
print('Data type is', grat_im.dtype)

# Let's look at grat_im (only works in the IPython console)
grat_im
grat_im is a <class 'numpy.ndarray'>
Data type is uint8
Out[2]:
array([[198, 201, 201, ..., 196, 195, 198],
       [200, 198, 200, ..., 197, 197, 197],
       [200, 198, 195, ..., 195, 196, 198],
       ...,
       [199, 194, 197, ..., 193, 192, 193],
       [197, 198, 195, ..., 193, 194, 193],
       [199, 197, 197, ..., 193, 194, 194]], dtype=uint8)

An image is nothing more than a 2D array, or a matrix. Any operation we can do to a matrix, we can do to an image. That includes slicing, assignment, any mathematical operations; anything. Note also that the data type of grat_im is uint8, or unsigned 8 bit integer. This is the assumed bit depth of the image. Bit depth is the number of bits used to indicate the intensity of a single pixel in an image. For an 8-bit image, the pixel values for from $0$ to $2^8 − 1$, or from $0$ to $255$. (In general, low pixel values are dark and high pixel values are light.) skimage evaluates the pixel values in the image and then chooses which of its allowed data types are commensurate with the pixel values. skimage allows 8, 16, or 32 bit images, in addition to float images (with pixel values between 0 and 1). We will see the consequences of this later. For now, it suffices to know that our graticule image is 8 bit.

We have looked at the image as a set of numbers. As is often the case, it is useful to "plot" data. A common way to "plot" an image is to display it as we're used to seeing images. Bokeh allows us to do this with convenient zooming and scrolling. However, the interface is a bit low-level, so I wrote a higher level interface in the bi1x module.

In [3]:
bokeh.io.show(bi1x.viz.imshow(grat_im))

This image has what looks to be a strange color. This is because the bi1x.viz.imshow() function uses the viridis colormap by default, which is useful for visualizing intensity images (which is what gray scale images are). A colormap specifies how the pixel values are interpreted as they are displayed. For a viridis colormap, high pixel values are yellow and low pixel values are blue, with green in between. For a grayscale colormap, high pixel values are more white, and low pixel values are more black. If we want to look at the image in black and white, we need to specify the colormapper. The colorcet package has lots of great perceptual colormaps available, including grayscale.

In [4]:
# For this image, we'll use gray scale
gray_mapper = bokeh.models.LinearColorMapper(colorcet.gray)

# Show the image
bokeh.io.show(bi1x.viz.imshow(grat_im, color_mapper=gray_mapper))

Finally, I note that the axes are useful to have on the plot, but we can easily get rid of them.

Calibration with mouse clicks¶

To calibrate the distances in the digital images, we would like to be able to click the lines of the graticule and memorialize the positions of those clicks. The bi1x.viz.record_clicks() function allows this. The positions of the clicks are displayed below the plot of the image. Prior to making the clicks, you need to select the point clicker icon from the tool bar to the left of the image.

(Note that this interactivity is only possible in a running notebook, not in the HTML rendering of this tutorial.)

In [5]:
points_source = bi1x.viz.record_clicks(grat_im)

The result is a Bokeh ColumnDataSource, which you can convert to a Pandas DataFrame using the to_df() method.

In [6]:
click_locations = points_source.to_df()

# Take a look
click_locations
Out[6]:
x y
0 91.056849 632.499985
1 1444.903003 635.499985

We can get the interpixel distance by computing the distance between the x points by dividing 1000 µm by that distance.

In [7]:
ip_distance = 1000 / (click_locations['x'][1] - click_locations['x'][0])
ip_distance
Out[7]:
0.7386363636363636

So, the interpixel distance is 0.74 µm/pixel. When we display the image, we should have our axes in units of µm, not pixels. We can specify this with keyword arguments for bi1x.viz.imshow().

In [8]:
bokeh.io.show(
    bi1x.viz.imshow(
        grat_im,
        color_mapper=gray_mapper,
        interpixel_distance=ip_distance,
        length_units="µm",
    )
)

Traditionally, people burn scale bars into images. In this case, it would be a black bar of set length. This is akin to geologists putting their pick ax in photos to give a sense of scale. It is absolutely essential that the length scale is clear in any image you display for research purposes. So long as we display the image with the axes labeled with real length units, we do not need to burn a scale bar. This is preferred, because when zooming in on an image, you can lose the scale bar, but the axes remain.

Viewing images and lookup tables¶

We will now load and view the test images we will use for segmentation (described below). You should have downloaded the images ecoli_phase.tif and ecoli_cfp.tif. We'll start with an image taken using phase contrast microscopy of E. coli and view it in gray scale. For the bacterial image, the interpixel distance is 64 nm.

In [9]:
# Load image
im_phase = skimage.io.imread("ecoli_phase.tif")

# Update interpixel distance for these bacteria
ip_distance = 0.064

# Display the image
bokeh.io.show(
    bi1x.viz.imshow(
        im_phase,
        color_mapper=gray_mapper,
        interpixel_distance=ip_distance,
        length_units="µm",
    )
)

Lookup tables¶

I mentioned colormaps earlier. In image processing, a colormap is called a lookup table (LUT). A LUT is a mapping of pixel values to a color. This sometimes helps visualize images, especially when we use false coloring. Remember, a digital image is data, and false coloring an image is not manipulation of data (and really, there is nothing false about it). It is simply a different way of plotting it.

Let's look at this image with a few different colormaps.

In [10]:
# Choose lookup tables
mappers = [
    gray_mapper,
    bokeh.models.LinearColorMapper(bokeh.palettes.viridis(256)),
    bokeh.models.LinearColorMapper(colorcet.coolwarm),
    bokeh.models.LinearColorMapper(bokeh.palettes.magma(256)),
]

# Show the image for each LUT
plots = [
    bi1x.viz.imshow(
        im_phase,
        color_mapper=mapper,
        interpixel_distance=ip_distance,
        length_units="µm",
        frame_height=250,
    )
    for mapper in mappers
]

# Link the panning
for i in range(1, 4):
    plots[i].x_range = plots[0].x_range
    plots[i].y_range = plots[0].y_range

# Display
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))

We did a few new things here. First, we specified colormaps we wanted, starting with gray and viridis, the two we have already seen. We used colorcet's coolwarm colormap to give us a colormap that runs from blue to red going through white. This is called a diverging colormap, and can be useful for images like these because the middle-range intesities, which show the edges of the bacteria, are in white. We then used a list comprehension, which is kind of a fancy for loop for generating lists, to make a list of plots. We then used the bokeh.layouts() function to generate subplots.

As we just saw, we specify a lookup table with a colormap. The data visualization community seems to universally reject using rainbow colormaps. See, e.g., D. Borland and R. M. Taylor, Rainbow Color Map (Still) Considered Harmful, IEEE Computer Graphics and Applications, 27,14-17, 2007. Viridis is one of the best colormaps available, and was the color map used to display LIGO's groundbreaking "chirp" the verified the existence of gravitational waves. Viridis is therefore the default for bi1x.viz.imshow(), and will dominantly be the one we use in this class.

Importantly, the false coloring helps use see that the intensity of the pixel values near the center of some of the bacteria are not that dissimilar from the background. This will become an issue, as we will see, when segmenting, which will be in our next image processing tutorial.

Computing environment¶

In [11]:
%load_ext watermark
%watermark -v -p numpy,skimage,bi1x,bokeh,colorcet,jupyterlab
Python implementation: CPython
Python version       : 3.12.2
IPython version      : 8.20.0

numpy     : 1.26.4
skimage   : 0.22.0
bi1x      : 0.0.14
bokeh     : 3.4.0
colorcet  : 3.1.0
jupyterlab: 4.0.11