EPITA 2021 MLRF practice_01-03_image-manipulations v2021-05-17_160644 by Joseph CHAZALON
Make sure you read and understand everything, and complete all the required actions.
Required actions are preceded by the following sign:
Perform a couple checks…
# deactivate buggy jupyter completion
%config Completer.use_jedi = False
Import the modules we already know
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
OpenCV is an open-source library originally developed at Intel. Its documentation is available here: https://docs.opencv.org/. At the time of writing, its current version is 4.3.0. It is written in C++ and has bindings for Java, Python and JavaScript. It contains a huge number of algorithms and techniques for many computer vision problems.
We will use the Python wrapper and a pre-packaged Python package available here: https://pypi.org/project/opencv-python/. If necessary, it can be installed using pip:
pip install --user opencv-contrib-python-headless
OpenCV used to have pretty readable documentation for the 2.x series, but the latest versions made it harder to develop Python code. However, its processing speed is usually fair and it is a good prototyping tool. The Python samples are usually a great source of inspiration; they are available under the samples/python source tree.
scikit-image is another open-source library, mostly written in Python. It is more recent, but it is gaining power and it complements OpenCV well for some problems, even if it is often slower than OpenCV. Its API is also simpler to use in Python and its documentation (available here: http://scikit-image.org/docs/stable/) is very pleasant to read. If necessary, it can be installed using pip:
pip install --user scikit-image
We will go through the basic image manipulation tasks using both OpenCV and scikit-image. Later, you may need to use both of them so make sure to have a look at both. The good thing is that they both heavily rely on NumPy to represent image data, so an important part of what follows will actually be NumPy code!
Import new modules we will now use.
import cv2
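import skimage
import skimage.io  # explicit import: `import skimage` alone may not load the io submodule in some releases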
If the previous imports failed, you may now install the packages you need.
# TODO do you need to install packages?
OpenCV usually expects 8-bit unsigned integers (uint8 or CV_8U for most operations) but sometimes accepts uint16 / CV_16U or float32 / CV_32F values. Expect strange error messages from the wrapper complaining about incompatible types. scikit-image, on the other hand, tends to work better with floats, though it supports other NumPy types well.
Many OpenCV functions take a size parameter (to resize an image, to create a structuring element, etc.) and this size is usually the reversed shape of the resulting NumPy object: size is (width, height), while shape is (rows, columns). Just remember to be extra careful with the size parameter of OpenCV functions.
Let's get started!
You can read and write images using the skimage.io module.
# TODO read image and display its shape and its dtype
im_ski = None
#prof
im_ski = skimage.io.imread("img/practice_01/sample_img.png")
im_ski.shape, im_ski.dtype
((213, 320, 3), dtype('uint8'))
# TODO display image (hint: use matplotlib's imshow function)
plt.figure()
# plt.???
plt.show()
<Figure size 432x288 with 0 Axes>
#prof
plt.figure()
plt.imshow(im_ski)
plt.show()
# TODO
# ...
# prof
skimage.io.imsave("/tmp/test.jpg", im_ski)
!identify "/tmp/test.jpg"
/tmp/test.jpg JPEG 320x213 320x213+0+0 8-bit sRGB 9.81KB 0.000u 0:00.000
OpenCV exposes imread and imwrite directly in the main namespace (cv2.imread and cv2.imwrite).
# TODO load image
im_ocv = None
# ...
# prof
im_ocv = cv2.imread("img/practice_01/sample_img.png")
im_ocv.shape, im_ocv.dtype # <- same shape as skimage
((213, 320, 3), dtype('uint8'))
# TODO display
# plt...
# prof
plt.figure()
plt.imshow(im_ocv)
plt.show()
Matplotlib expects images with channels in the ??? order, while OpenCV channels are stored in ... order.
Matplotlib expects images with channels in the RGB order, while OpenCV channels are stored in BGR order.
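Since both libraries store images as NumPy arrays, the channel order can be checked and swapped with plain indexing. The sketch below uses a hypothetical two-pixel BGR image; reversing the last axis turns BGR into RGB.

```python
import numpy as np

# Hypothetical 1x2 image in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels; G stays in place.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue, expressed as (R, G, B)
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red, expressed as (R, G, B)
```

Note that the slice is a view on the original data, not a copy, so writing into rgb also modifies bgr.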
OpenCV makes it easy to convert colors between many formats. The cv2.cvtColor function is the one you are looking for. Conversion codes start with cv2.COLOR_*.
# TODO new plot
# prof
plt.figure()
plt.imshow(cv2.cvtColor(im_ocv, cv2.COLOR_BGR2RGB))
plt.show()
You can select image areas using any NumPy indexing, slicing or masking method you like.
# TODO find the right selection code to erase the duck's head
im_test = im_ski.copy()
im_test[0,0] = 0
plt.figure()
plt.imshow(im_test)
plt.title("head-less duck")
plt.show()
# prof
im_test = im_ski.copy()
im_test[20:60,100:150] = 0
plt.figure()
plt.imshow(im_test)
plt.title("head-less duck")
plt.show()
Masking is a very powerful tool, allowing you to threshold images in a single line of code.
This shows you how to select regions of interest (ROIs) easily.
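As a minimal illustration of masking, the sketch below thresholds a tiny grayscale array. The pixel values and the threshold of 128 are arbitrary choices for the example.

```python
import numpy as np

# Hypothetical 2x3 grayscale image.
gray = np.array([[10, 200, 90],
                 [130, 40, 250]], dtype=np.uint8)

# A comparison yields a boolean mask with the same shape as the image.
mask = gray > 128

# Binarize in a single line: True -> 255, False -> 0.
binary = np.where(mask, 255, 0).astype(np.uint8)

# The same mask can also be used to overwrite pixels of a copy.
out = gray.copy()
out[~mask] = 0

print(binary.tolist())  # [[0, 255, 0], [255, 0, 255]]
print(out.tolist())     # [[0, 200, 0], [130, 0, 250]]
```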
#TODO display duck head
# prof
plt.figure()
plt.imshow(im_ski[20:60,100:150])
plt.axis("off")
plt.title("duck head")
plt.show()
This will be a very naive segmentation, making good use of the properties of the image!
Tip 1: You can either choose to keep the foreground, or to remove the background.
Tip 2: use np.logical_and or & to combine boolean masks.
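The sketch below compares three equivalent ways to build a per-channel range mask, on a random image generated only for this comparison. The bounds in lo and hi are illustrative values, not a solution to the exercise.

```python
from functools import reduce

import numpy as np

# Hypothetical random RGB image, used only to compare the three formulations.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

lo = (0, 64, 0)       # illustrative lower bounds per channel
hi = (160, 255, 200)  # illustrative upper bounds per channel

# 1. Chained `&`: parentheses are required because `&` binds tighter
#    than comparison operators such as `>=`.
m_and = ((img[..., 0] >= lo[0]) & (img[..., 0] <= hi[0])
         & (img[..., 1] >= lo[1]) & (img[..., 1] <= hi[1])
         & (img[..., 2] >= lo[2]) & (img[..., 2] <= hi[2]))

# 2. np.logical_and combines two arrays at a time, so reduce folds it over a list.
m_reduce = reduce(np.logical_and, [
    img[..., 0] >= lo[0], img[..., 0] <= hi[0],
    img[..., 1] >= lo[1], img[..., 1] <= hi[1],
    img[..., 2] >= lo[2], img[..., 2] <= hi[2]])

# 3. Broadcasting: compare all channels at once, then require all three to match.
m_all = np.all((img >= lo) & (img <= hi), axis=-1)

print(np.array_equal(m_and, m_reduce), np.array_equal(m_and, m_all))
```

All three masks have shape (4, 5), one boolean per pixel, so any of them can be used to index the image directly.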
# TODO erase much of the background to segment the duck
from functools import reduce
im_test = im_ski.copy()
plt.figure(figsize=(12,8))
mask = ((im_test[...,0] >= 64) & (im_test[...,0] <= 255) # FIXME edit those lines
& (im_test[...,1] >= 64) & (im_test[...,1] <= 255)
& (im_test[...,2] >= 64) & (im_test[...,2] <= 255))
im_test[mask] = 0 # The mask is True for background pixels here, we erase selected pixels
plt.subplot(1, 2, 1)
plt.imshow(im_test)
plt.axis('off')
plt.title("background-less duck")
plt.subplot(1, 2, 2)
plt.axis('off')
plt.title("mask")
plt.imshow(mask, cmap='gray')
<matplotlib.image.AxesImage at 0x7f175c0664a8>
# prof
from functools import reduce
im_test = im_ski.copy()
plt.figure(figsize=(12,8))
# most duck pixels have R intensity > 160
# mask = reduce(np.logical_and, [ # reduce trick because logical_and only takes 2 array arguments
# im_test[..., 0] >= 0, im_test[..., 0] <= 160,
# im_test[..., 1] >= 0, im_test[..., 1] <= 255,
# im_test[..., 2] >= 0, im_test[..., 2] <= 255])
# mask = ((im_test[...,0] >= 0) & (im_test[...,0] <= 160)
# & (im_test[...,1] >= 0) & (im_test[...,1] <= 255)
# & (im_test[...,2] >= 0) & (im_test[...,2] <= 255))
mask = np.all((im_test >= (0,0,0)) & (im_test <= (160, 255, 255)), axis=-1)
im_test[mask] = 0 # The mask is True for background pixels here, we erase selected pixels
plt.subplot(1, 2, 1)
plt.imshow(im_test)
plt.axis('off')
plt.title("background-less duck")
plt.subplot(1, 2, 2)
plt.axis('off')
plt.title("mask")
plt.imshow(mask, cmap='gray')
<matplotlib.image.AxesImage at 0x7f1756942b38>
It is (maybe) your first optimisation problem for image processing!
What you did was try to find good parameters (thresholds) in the color space.
Could you automate that process? Could you compute the optimal parameters in some way?
TODO throw some quick ideas here.
(prof) An idea: compute the color distributions of the background and the foreground, and find a separation which divides the ambiguous region safely. Histograms would help a lot. Using another color space would probably also help.
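One possible way to turn the histogram idea into code is Otsu's method, which picks the threshold that maximizes the between-class variance of a single channel. This is a technique swapped in for illustration, not the method prescribed by this practice session. The function name otsu_threshold and the synthetic "red channel" data are assumptions made for the example.

```python
import numpy as np

def otsu_threshold(channel):
    """Return the level t maximizing between-class variance of a uint8 channel.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = np.bincount(channel.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                 # weight of the "low" class for each t
    w1 = 1.0 - w0                        # weight of the "high" class
    cum_mean = np.cumsum(prob * levels)  # cumulative first moment
    total_mean = cum_mean[-1]

    # Between-class variance, left at 0 where one of the classes is empty.
    numerator = (total_mean * w0 - cum_mean) ** 2
    denominator = w0 * w1
    valid = denominator > 1e-12
    between = np.zeros(256)
    between[valid] = numerator[valid] / denominator[valid]
    return int(np.argmax(between))

# Synthetic, clearly bimodal data standing in for one color channel:
# a dark "background" population and a bright "foreground" population.
rng = np.random.default_rng(0)
background = rng.normal(80, 10, size=5000)
foreground = rng.normal(200, 10, size=1000)
red = np.clip(np.concatenate([background, foreground]), 0, 255).astype(np.uint8)

t = otsu_threshold(red)
mask = red > t  # True for the bright population
print(t, int(mask.sum()))
```

On a real image the two distributions overlap, so a single-channel threshold is only a starting point; applying the same procedure in another color space, as suggested above, may separate the classes better.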
We will learn new image processing functions on the go in the next stage: Twin it! part 1.