EPITA 2023 MLRF practice_02-01_detection-description v2023-05-24_134912 by Joseph CHAZALON
This work is licensed under a Creative Commons Attribution 4.0 International License.
Our goal here is to build a framework which will compare pre-selected bubbles (using color histograms) in a way more robust than template matching (which is skipped this year).
The issue with template matching is that we need to slide the template over the test image, which can produce spurious correlations (over flat regions, for instance).
What we will do instead is to find stable points within texture patterns, and compare the region centered around those points.
This part contains the following steps:
We will implement a Harris corner detector now.
A corner is an image patch which exhibits a high difference with itself when translated in any direction, i.e., where $$(I(x+\Delta_x,y+\Delta_y) - I(x,y))^2$$ is high, $(\Delta_x,\Delta_y)$ being a displacement vector.
In practice we pool the previous indicator function over a small region $R = [-s,s] \times [-s,s]$ and we use a weighting function $w(u,v)$ (with $(u,v) \in R$) to weight the contribution of each displacement to the global sum: $$ S(x,y) = \sum_u \sum_v w(u,v) \, \left( I(x+u+\Delta_x,y+v+\Delta_y) - I(x+u,y+v)\right)^2 $$ where $u$ and $v$ both take values in $[-s,+s]$ if $s$ is the size of the neighborhood we consider. Usually $w(u,v)$ is either a box filter or a Gaussian.
$I(x+\Delta_x,y+\Delta_y)$ can be approximated by a first-order Taylor expansion: $$ I(x+\Delta_x,y+\Delta_y) \approx I(x,y) + \Delta_x \frac{\partial I(x,y)}{\partial x} + \Delta_y \frac{\partial I(x,y)}{\partial y} + \cdots $$
This allows us to "simplify" the original equation and, more importantly, makes it faster to compute, because the derivatives can be computed once for the whole image: $$ S(x,y) \approx \sum_u \sum_v w(u,v) \left( \Delta_x \frac{\partial I(x+u,y+v)}{\partial x} + \Delta_y \frac{\partial I(x+u,y+v)}{\partial y} \right)^2 $$
which can be rewritten as: $$ S(x,y) \approx \sum_u \sum_v w(u,v) \left( \left(\Delta_x \frac{\partial I(x+u,y+v)}{\partial x}\right)^2 + \left(\Delta_y \frac{\partial I(x+u,y+v)}{\partial y}\right)^2 + 2 \Delta_x \Delta_y \frac{\partial I(x+u,y+v)}{\partial x} \frac{\partial I(x+u,y+v)}{\partial y} \right) $$
or, using the matrix form which is usually given: $$ S(x,y) \approx \begin{pmatrix} \Delta_x & \Delta_y \end{pmatrix} A(x,y) \begin{pmatrix} \Delta_x \\ \Delta_y \end{pmatrix}, $$
where $A(x,y)$ is the structure tensor: $$ A = \sum_u \sum_v w(u,v) \begin{bmatrix} \left(\frac{\partial I(x+u,y+v)}{\partial x}\right)^2 & \frac{\partial I(x+u,y+v)}{\partial x} \frac{\partial I(x+u,y+v)}{\partial y} \\ \frac{\partial I(x+u,y+v)}{\partial x} \frac{\partial I(x+u,y+v)}{\partial y} & \left(\frac{\partial I(x+u,y+v)}{\partial y}\right)^2 \end{bmatrix} = \begin{bmatrix} \langle I_x^2 \rangle & \langle I_x I_y \rangle\\ \langle I_x I_y \rangle & \langle I_y^2 \rangle \end{bmatrix} $$
where $I_x$ and $I_y$ denote the partial derivatives of $I$ with respect to $x$ and $y$, and $\langle \cdot \rangle$ denotes the weighted sum $\sum_u \sum_v w(u,v)\,(\cdot)$ over the window centered at $(x,y)$.
This trick is useful because $I_x$ and $I_y$ can be precomputed very simply.
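A corner (or in general an interest point) is characterized by a large variation of $S$ in all directions of the displacement vector $\begin{pmatrix} \Delta_x & \Delta_y \end{pmatrix}$. By analyzing the eigenvalues $\lambda_1$ and $\lambda_2$ of $A$, this characterization can be expressed in the following way: $A$ should have two "large" eigenvalues for an interest point. Based on the magnitudes of the eigenvalues, the following inferences can be made: if $\lambda_1 \approx 0$ and $\lambda_2 \approx 0$, the region is flat and contains no feature of interest; if one eigenvalue is close to 0 and the other is large, the region contains an edge; if both eigenvalues are large, the region contains a corner.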
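The flat / edge / corner distinction can be checked numerically on a toy image. The sketch below is only an illustration under simple assumptions: the synthetic image, the helper name `structure_tensor`, the box window ($w(u,v)=1$) and the use of `np.gradient` for the derivatives are all choices made here, not part of the exercise.

```python
import numpy as np

# Toy image (an assumption for illustration): a bright quadrant whose corner
# sits at row 8, column 8 of a dark 20x20 image.
img = np.zeros((20, 20), dtype=float)
img[8:, 8:] = 1.0

# np.gradient returns the derivative along axis 0 (rows, y) then axis 1 (columns, x)
Iy, Ix = np.gradient(img)

def structure_tensor(r: int, c: int, s: int = 2) -> np.ndarray:
    """Structure tensor at (r, c) using a box window w(u, v) = 1 of half-size s."""
    win = (slice(r - s, r + s + 1), slice(c - s, c + s + 1))
    ixx = np.sum(Ix[win] ** 2)
    ixy = np.sum(Ix[win] * Iy[win])
    iyy = np.sum(Iy[win] ** 2)
    return np.array([[ixx, ixy], [ixy, iyy]])

# One location in a flat area, one on the vertical edge, one on the corner
locations = {"flat": (3, 3), "edge": (14, 8), "corner": (8, 8)}
eigenvalues = {name: np.linalg.eigvalsh(structure_tensor(r, c))
               for name, (r, c) in locations.items()}

for name, lam in eigenvalues.items():
    print(f"{name:>6}: eigenvalues = {np.round(lam, 2)}")
```

Both eigenvalues vanish in the flat area, only one is significantly larger than zero on the edge, and both are large at the corner.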
To avoid the computation of the eigenvalues, which used to be expensive, Harris and Stephens instead suggest the following function $M_c$, where $\kappa$ is a tunable sensitivity parameter:
$$ M_c = \lambda_1 \lambda_2 - \kappa \, (\lambda_1 + \lambda_2)^2 = \operatorname{det}(A) - \kappa \, \operatorname{trace}^2(A) $$
Or, if we prefer to avoid setting the parameter $\kappa$, we can use Noble's corner measure $M_c'$ which amounts to the harmonic mean of the eigenvalues: $$ M_c' = 2 \frac{\operatorname{det}(A)}{\operatorname{trace}(A) + \epsilon}, $$ $\epsilon$ being a small positive constant.
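$A$ being a 2x2 matrix, we have the following relations: $$ \operatorname{det}(A) = \lambda_1 \lambda_2 \quad \text{and} \quad \operatorname{trace}(A) = \lambda_1 + \lambda_2. $$
Using previous definitions, we obtain: $$ \operatorname{det}(A) = \langle I_x^2 \rangle \langle I_y^2 \rangle - \langle I_x I_y \rangle^2 \quad \text{and} \quad \operatorname{trace}(A) = \langle I_x^2 \rangle + \langle I_y^2 \rangle. $$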
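As a quick sanity check, the determinant and trace of a symmetric 2x2 matrix can be compared with the product and sum of its eigenvalues, and both corner measures can then be evaluated without any eigen-decomposition. This is a minimal sketch: the matrix values, `kappa = 0.05` and `epsilon = 1e-6` are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Example structure tensor (values chosen for illustration only)
A = np.array([[1.5, 0.25],
              [0.25, 1.5]])

# Eigenvalues of the symmetric matrix A (only needed here for the comparison)
lam1, lam2 = np.linalg.eigvalsh(A)

# Determinant and trace computed directly from the entries of A
det_A = A[0, 0] * A[1, 1] - A[0, 1] ** 2
trace_A = A[0, 0] + A[1, 1]

kappa = 0.05     # assumed sensitivity value, to be tuned
epsilon = 1e-6   # assumed small positive constant

harris = det_A - kappa * trace_A ** 2      # Harris and Stephens measure M_c
noble = 2 * det_A / (trace_A + epsilon)    # Noble measure M_c'

print("det  :", det_A, "vs lambda_1 * lambda_2 =", lam1 * lam2)
print("trace:", trace_A, "vs lambda_1 + lambda_2 =", lam1 + lam2)
print("Harris:", harris, " Noble:", noble)
```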
A naive way of computing an approximation of $I_x$ (resp. $I_y$) would be to simply compute the difference between neighboring pixel values in the horizontal (resp. vertical) direction. np.gradient would then compute something like:
$$I_x(x,y) = I(x+1,y) - I(x,y),$$
which produces spiky results and leads to poor performance.
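To make this concrete, here is a small sketch on a 1D row of pixel values (the values are arbitrary). It compares a plain forward difference with `np.gradient`, which, with its default settings, uses central differences in the interior and one-sided differences at the two ends.

```python
import numpy as np

# One row of pixel values (arbitrary example)
row = np.array([0.0, 0.0, 1.0, 1.0, 4.0])

# Forward difference: I(x+1) - I(x), one value fewer than the input
forward = row[1:] - row[:-1]

# np.gradient: (I(x+1) - I(x-1)) / 2 in the interior, one-sided at the borders
central = np.gradient(row)

print("forward difference:", forward)
print("np.gradient       :", central)
```

Neither variant smooths the signal, so any pixel noise is passed straight into the derivative.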
So, in practice we smooth the image before computing the gradients. A common practice is to perform a small Gaussian blur before applying the Sobel operator in both directions, but we can be even more efficient by convolving our image with $G_x$ and $G_y$, two Gaussian derivative kernels with respect to $x$ and $y$!
We can relate that to algebraic properties of convolutions:
$$\begin{array}{rl} \textrm{Derivative}_x * \textrm{Gaussian} * \textrm{Image} &= \textrm{Gaussian} * \textrm{Derivative}_x * \textrm{Image}\\ &= (\textrm{Gaussian} * \textrm{Derivative}_x) * \textrm{Image}\\ &= (\textrm{Sobel}_x) * \textrm{Image} \end{array}$$
So, to compute $I_x$ or $I_y$, one only needs to convolve $I$ with a Gaussian derivative kernel with respect to $x$ or $y$.
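The associativity argument can be verified numerically. The sketch below works in 1D with `np.convolve` (full convolution) to keep things short; the random signal, the 7-tap Gaussian and the central-difference kernel are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
signal_1d = rng.random(50)                   # stand-in for one image row

x = np.arange(-3, 4)
gaussian = np.exp(-x**2 / 2.0)
gaussian /= gaussian.sum()                   # normalized smoothing kernel
derivative = np.array([1.0, 0.0, -1.0]) / 2  # central difference kernel

# Two passes: smooth first, then differentiate
two_pass = np.convolve(np.convolve(signal_1d, gaussian), derivative)

# One pass: pre-combine the two kernels, then convolve once
combined_kernel = np.convolve(gaussian, derivative)
one_pass = np.convolve(signal_1d, combined_kernel)

print("max abs difference:", np.abs(two_pass - one_pass).max())
```

The two results agree up to floating-point rounding, which is why a single convolution with a precomputed derivative kernel is enough.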
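In summary, given an image, we can compute the Harris corner response image by simply computing: the derivatives $I_x$ and $I_y$ (convolutions with Gaussian derivative kernels); the products $I_x^2$, $I_x I_y$ and $I_y^2$; their weighted sums $\langle I_x^2 \rangle$, $\langle I_x I_y \rangle$ and $\langle I_y^2 \rangle$ (convolutions with $w$); $\operatorname{det}(A)$ and $\operatorname{trace}(A)$; and finally the response $M_c$ or $M_c'$.
Enough math, let's code! We will guide you through the process, so don't worry.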
After some setup, we will implement the computation of the response image progressively.
# deactivate buggy jupyter completion
%config Completer.use_jedi = False
import numpy as np
import cv2
import matplotlib.pyplot as plt
#%matplotlib inline
import os
# TODO
PATH_TO_RESOURCES = "." # FIXME set this to the path of the twinit resource directory
# prof
PATH_TO_RESOURCES = "/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it"
A Gaussian function is a function of the form: $$ f(x) = a e^{-(x-b)^2/(2c^2)} $$ The graph of a Gaussian is a characteristic symmetric "bell curve" shape. The parameter $a$ is the height of the curve's peak, $b$ is the position of the center of the peak and $c$ (the standard deviation) controls the width of the "bell".
In our case, we have a two-dimensional signal, therefore our function has the form: $$f(x,y) = a \exp\left(- \left(\frac{(x-x_0)^2}{2\sigma_X^2} + \frac{(y-y_0)^2}{2\sigma_Y^2} \right)\right).$$
Here the coefficient $a$ is the amplitude, $(x_0,y_0)$ is the center and $\sigma_X,\sigma_Y$ are the $x$ and $y$ spreads of the blob.
In our case, we will define our Gaussian kernel with respect to the size $s$ of the window $w$. We want to have:
The derivatives are easy to compute in our case.
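For reference, here is the derivative spelled out, assuming a centered, unit-amplitude Gaussian $g(x,y) = \exp\left(-\left(\frac{x^2}{2\sigma_X^2} + \frac{y^2}{2\sigma_Y^2}\right)\right)$ (the name $g$ and the choices $a = 1$, $x_0 = y_0 = 0$ are assumptions made for this sketch):

```latex
\begin{aligned}
\frac{\partial g}{\partial x}(x,y) &= -\frac{x}{\sigma_X^2}\, g(x,y), \\
\frac{\partial g}{\partial y}(x,y) &= -\frac{y}{\sigma_Y^2}\, g(x,y).
\end{aligned}
```

Up to the constant factors $1/\sigma_X^2$ and $1/\sigma_Y^2$, the derivative kernels are therefore $-x\,g(x,y)$ and $-y\,g(x,y)$.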
Complete the function below to generate a Gaussian kernel.
Tips:
np.mgrid, which generates a grid of coordinates.
# Run me!
np.mgrid[-1:1+1,-1:1+1]
array([[[-1, -1, -1], [ 0, 0, 0], [ 1, 1, 1]], [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]])
Tip: you have to implement this:
$$g(x,y) = \exp\left(- \left(\frac{x^2}{2\sigma_X^2} + \frac{y^2}{2\sigma_Y^2} \right)\right)$$
with $$\sigma_X = \sigma_Y = \frac{1}{3} \text{size}$$
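A quick arithmetic check shows why one third of the size is a convenient choice: at the border of the window ($x = \pm\text{size}$), the exponent is $-\text{size}^2 / (2\,(\text{size}/3)^2) = -4.5$ whatever the size, so the kernel always decays to about 1% of its peak. The sketch below checks this on a 1D slice; the variable names and `size = 5` are only illustrative.

```python
import numpy as np

size = 5
sigma = size / 3
x = np.arange(-size, size + 1)

# 1D slice of the Gaussian through the center of the window (y = 0)
g1d = np.exp(-x**2 / (2 * sigma**2))

center_value = g1d[size]   # x = 0
border_value = g1d[0]      # x = -size

print("center:", center_value)
print("border:", border_value, "expected exp(-4.5) =", np.exp(-4.5))
```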
# TODO complete the code of this function
def gauss_kernel(size: int) -> np.array:
    """
    Returns a 2D Gaussian kernel for convolutions.
    Parameters
    ----------
    size: int
        Half-size of the kernel to build (the kernel spans [-size, size])
    Returns
    -------
    kernel: np.array of shape (2*size+1, 2*size+1) and floating-point dtype
        Resulting Gaussian kernel where kernel[i,j] = Gaussian(i, j, mu=(0,0), sigma=(size/3, size/3))
    """
    size = int(size)
    y, x = np.mgrid[-size:size+1, -size:size+1]
    # x and y coefficients of a 2D gaussian with standard dev one third of size
    # (ignore scale factor)
    # FIXME this is a box filter, adapt it to be a Gaussian
    # (this would also work, but would lead to poorer performance)
    g = np.ones((size*2+1, size*2+1))
    return g
# prof
def gauss_kernel(size: int, sizey: int=None) -> np.array:
    """
    Returns a 2D Gaussian kernel for convolutions.
    Parameters
    ----------
    size: int
        Half-size of the kernel along x (the kernel spans [-size, size])
    sizey: int, optional
        Half-size of the kernel along y (defaults to size)
    Returns
    -------
    kernel: np.array of shape (2*sizey+1, 2*size+1) and floating-point dtype
        Resulting Gaussian kernel where kernel[i,j] = Gaussian(i, j, mu=(0,0), sigma=(size/3, size/3))
    """
    size = int(size)
    sizey = int(sizey) if sizey is not None else size
    # rows follow y (half-size sizey), columns follow x (half-size size)
    y, x = np.mgrid[-sizey:sizey+1, -size:size+1]
    # x and y coefficients of a 2D gaussian with standard dev about one third of size
    # (ignore scale factor)
    g = np.exp(-(x**2/(2*(0.33*size)**2)+y**2/(2*(0.33*sizey)**2)))
    return g
Let us quickly display this kernel to check its shape.
We are looking for something which looks like this, i.e., a maximal response at the center of the window, then a smooth decrease toward the borders of the window where we have values close to 0:
# Run this cell
from mpl_toolkits.mplot3d import Axes3D
# Run this cell
wsize = 5
Y,X = np.mgrid[-wsize:wsize+1,-wsize:wsize+1]
Z = gauss_kernel(wsize)
print(X.shape, Y.shape, Z.shape)
# print(Z)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z);
(11, 11) (11, 11) (11, 11)
Complete the function below to generate two Gaussian derivative kernels, then plot them.
Tips:
We are looking for Gaussian derivatives which look like this:
from typing import Tuple
# TODO complete this function
def gauss_derivative_kernels(size: int) -> Tuple[np.array, np.array]:
    """
    Returns two 2D Gaussian derivative kernels (x and y) for convolutions.
    Parameters
    ----------
    size: int
        Half-size of the kernels to build (the kernels span [-size, size])
    Returns
    -------
    (gx, gy): tuple of (np.array, np.array), each of shape (2*size+1, 2*size+1) and floating-point dtype
        Resulting Gaussian kernels where kernel[i,j] = Gaussian_z(i, j, mu=(0,0), sigma=(size/3, size/3))
        where Gaussian_z is either the x or the y Gaussian derivative.
    """
    size = int(size)
    y, x = np.mgrid[-size:size+1, -size:size+1]
    # x and y derivatives of a 2D gaussian with standard dev one third of size
    # (ignore scale factor)
    gx = np.ones((size*2+1, size*2+1)) # FIXME
    gy = np.ones((size*2+1, size*2+1)) # FIXME
    return gx,gy
# prof
def gauss_derivative_kernels(size: int, sizey: int=None) -> Tuple[np.array, np.array]:
    """
    Returns two 2D Gaussian derivative kernels (x and y) for convolutions.
    Parameters
    ----------
    size: int
        Half-size of the kernels along x (the kernels span [-size, size])
    sizey: int, optional
        Half-size of the kernels along y (defaults to size)
    Returns
    -------
    (gx, gy): tuple of (np.array, np.array), each of shape (2*sizey+1, 2*size+1) and floating-point dtype
        Resulting Gaussian kernels where kernel[i,j] = Gaussian_z(i, j, mu=(0,0), sigma=(size/3, size/3))
        where Gaussian_z is either the x or the y Gaussian derivative.
    """
    size = int(size)
    sizey = int(sizey) if sizey is not None else size
    # rows follow y (half-size sizey), columns follow x (half-size size)
    y, x = np.mgrid[-sizey:sizey+1, -size:size+1]
    # x and y derivatives of a 2D gaussian with standard dev about one third of size
    # (ignore scale factor)
    g = np.exp(-(x**2/(2*(0.33*size)**2)+y**2/(2*(0.33*sizey)**2)))
    gx = - x * g
    gy = - y * g
    return gx,gy
# Run this cell
# Plot gx and gy
wsize = 5
Y,X = np.mgrid[-wsize:wsize+1,-wsize:wsize+1]
Gx, Gy = gauss_derivative_kernels(wsize)
fig = plt.figure(figsize=(10,5))
ax = fig.add_subplot(121, projection='3d')
ax.plot_surface(X, Y, Gx)
ax.set_title("Gx")
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax = fig.add_subplot(122, projection='3d')
ax.plot_surface(X, Y, Gy)
ax.set_title("Gy")
ax.set_xlabel("X")
ax.set_ylabel("Y");
Complete the function below to compute $I_x$ and $I_y$.
Tips:
scipy.signal.convolve, to apply the kernels.
from scipy import signal
signal.convolve?
# TODO complete this function
def gauss_derivatives(im: np.array, size: int) -> Tuple[np.array, np.array]:
    """
    Returns x and y gaussian derivatives for a given image.
    Parameters
    ----------
    im: np.array of shape (rows, cols)
        Input image
    size: int
        Half-size of the kernels to use
    Returns
    -------
    (Ix, Iy): tuple of (np.array, np.array), each of shape (rows, cols)
        Derivatives (x and y) of the image computed using Gaussian derivatives (with kernel of half-size `size`).
    """
    gx,gy = gauss_derivative_kernels(size)
    imx = np.zeros_like(im) # FIXME
    imy = np.zeros_like(im) # FIXME
    return imx,imy
# prof
def gauss_derivatives(im: np.array, size: int, sizey: int=None) -> Tuple[np.array, np.array]:
    """
    Returns x and y gaussian derivatives for a given image.
    Parameters
    ----------
    im: np.array of shape (rows, cols)
        Input image
    size: int
        Half-size of the kernels to use
    sizey: int, optional
        Half-size of the kernels along y (defaults to size)
    Returns
    -------
    (Ix, Iy): tuple of (np.array, np.array), each of shape (rows, cols)
        Derivatives (x and y) of the image computed using Gaussian derivatives (with kernel of half-size `size`).
    """
    gx,gy = gauss_derivative_kernels(size, sizey=sizey)
    imx = signal.convolve(im, gx, mode='same')
    imy = signal.convolve(im, gy, mode='same')
    return imx,imy
Load some bubble images and show their derivatives.
Tips:
# Some Jupyter magic to help you
# This creates a SORTED list of files to process.
bubble_files = !ls $PATH_TO_RESOURCES/bubbles_200dpi/b*.png | sort
bubble_files[:3]
['/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b001.png', '/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b002.png', '/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b003.png']
# load all the bubbles
bubbles = [cv2.imread(ff) for ff in bubble_files]
# list of bubbles (np.array) in grayscale
bubbles_gray = [cv2.cvtColor(bb, cv2.COLOR_BGR2GRAY) for bb in bubbles]
# (just run this cell)
# Display some bubbles and their derivatives
# Let us save time
def imbgr2rgb(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

num_bb = 5
plt.figure(figsize=(8,10))
for bb_id in range(num_bb):
    bb = bubbles[bb_id]
    bb_gray = bubbles_gray[bb_id]
    bb_x, bb_y = gauss_derivatives(bb_gray, 3)
    plt.subplot(num_bb,3,1+3*bb_id)
    plt.imshow(imbgr2rgb(bb))
    plt.axis("off")
    plt.title("$I$: bb_%d" % bb_id)
    plt.subplot(num_bb,3,2+3*bb_id)
    plt.imshow(bb_x)
    plt.colorbar(shrink=0.5)
    plt.axis("off")
    plt.title("$I_x$")
    plt.subplot(num_bb,3,3+3*bb_id)
    plt.imshow(bb_y)
    plt.colorbar(shrink=0.75)
    plt.axis("off")
    plt.title("$I_y$")
We can now compute the Harris response of an image.
Compute and show the Harris response for some bubbles.
Tips:
Expected result:
# TODO complete this function
def compute_harris_response(image: np.array) -> np.array:
    """
    Returns the Harris cornerness response of a given image.
    Parameters
    ----------
    image: np.array of shape (rows, cols)
        Input image
    Returns
    -------
    response: np.array of shape (rows, cols) and floating-point dtype
        Harris cornerness response image.
    """
    DERIVATIVE_KERNEL_SIZE = 1 # FIXME try different values here
    OPENING_SIZE = 1 # FIXME try different values here
    # derivatives
    imx,imy = gauss_derivatives(image, DERIVATIVE_KERNEL_SIZE)
    # kernel for weighted sum
    gauss = gauss_kernel(OPENING_SIZE) # opening param
    # compute components of the structure tensor
    Wxx = np.zeros_like(image) # FIXME
    Wxy = np.zeros_like(image) # FIXME
    Wyy = np.zeros_like(image) # FIXME
    # determinant and trace
    Wdet = np.zeros_like(image) # FIXME
    Wtr = np.zeros_like(image) # FIXME
    # return Wdet - k * Wtr**2 # k is hard to tune
    # return Wdet / Wtr # we would need to filter NaNs
    return Wdet / (Wtr + 1) # 1 seems to be a reasonable value for epsilon
# prof
def compute_harris_response(image): #, k=0.05):
    """
    Returns the Harris cornerness response of a given image.
    Parameters
    ----------
    image: np.array of shape (rows, cols)
        Input image
    Returns
    -------
    response: np.array of shape (rows, cols) and floating-point dtype
        Harris cornerness response image.
    """
    DERIVATIVE_KERNEL_SIZE = 3
    OPENING_SIZE = 3
    # derivatives
    imx,imy = gauss_derivatives(image, DERIVATIVE_KERNEL_SIZE)
    # kernel for weighted sum
    gauss = gauss_kernel(OPENING_SIZE) # opening param
    # compute components of the structure tensor
    Wxx = signal.convolve(imx*imx,gauss, mode='same')
    Wxy = signal.convolve(imx*imy,gauss, mode='same')
    Wyy = signal.convolve(imy*imy,gauss, mode='same')
    # determinant and trace
    Wdet = Wxx*Wyy - Wxy**2
    Wtr = Wxx + Wyy
    # print(Wdet.min(), Wdet.max(), Wdet.mean())
    # print(Wtr.min(), Wtr.max(), Wtr.mean())
    # return Wdet - k * Wtr**2 # k is hard to tune
    # return Wdet / Wtr # we would need to filter NaNs
    return Wdet / (Wtr + 1) # 1 seems to be a reasonable value for epsilon
# display some bubbles and their Harris response, showing the scale of values
num_bb = 5
plt.figure(figsize=(12,10))
for bb_id in range(num_bb):
    bb = bubbles[bb_id]
    bb_gray = bubbles_gray[bb_id]
    bb_x, bb_y = gauss_derivatives(bb_gray, 4)
    bb_h = compute_harris_response(bb_gray)
    plt.subplot(num_bb,4,1+4*bb_id)
    plt.imshow(imbgr2rgb(bb))
    plt.axis("off")
    plt.title("$I$: bb_%d" % bb_id)
    plt.subplot(num_bb,4,2+4*bb_id)
    plt.imshow(bb_x)
    plt.colorbar(shrink=0.5)
    plt.axis("off")
    plt.title("$I_x$")
    plt.subplot(num_bb,4,3+4*bb_id)
    plt.imshow(bb_y)
    plt.colorbar(shrink=0.75)
    plt.axis("off")
    plt.title("$I_y$")
    plt.subplot(num_bb,4,4+4*bb_id)
    plt.imshow(bb_h)
    plt.colorbar(shrink=0.75)
    plt.axis("off")
    plt.title("Harris resp.")
Now that we have computed the "cornerness map" of an image, we have to select the best corner candidates. The full process for selecting keypoints (i.e., coordinates of corners) can be decomposed into: masking the background and borders, thresholding the response, applying non-maximal suppression, then sorting the remaining candidates and keeping the best ones.
The masking step is a bit special in our case because our bubbles have an artificial boundary inside the image which can cause keypoints to be detected.
We want to reject keypoints which are too close to the bubble boundary to avoid this.
To save you time, we provide you with a simple function bubble2maskeroded which takes a bubble and returns a mask which removes the border of the bubble.
# RUN ME
# mathematical morphology magic: this returns an eroded (shrunk) mask
def bubble2maskeroded(img_gray: np.array, border: int=10) -> np.array:
    """
    Returns the eroded mask of a given image, to remove pixels which are close to the border.
    Parameters
    ----------
    img_gray: np.array of shape (rows, cols)
        Input image
    border: int, default=10
        Radius (in pixels) of the erosion, i.e. width of the border to remove
    Returns
    -------
    mask: np.array of shape (rows, cols) and dtype bool
        Image mask.
    """
    if img_gray.ndim > 2:
        raise ValueError(
            """bubble2maskeroded: img_gray must be a grayscale image.
            The image you passed has %d dimensions instead of 2.
            Try to convert it to grayscale before passing it to bubble2maskeroded.
            """ % (img_gray.ndim, ))
    mask = img_gray > 0
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (border*2,border*2))
    # new: added a little closing below because some bubbles have some black pixels inside
    mask_closed = cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE, np.ones((3,3), np.uint8))
    # erode the closed mask (not the raw one), otherwise the closing has no effect
    mask_er = cv2.erode(mask_closed,
                        kernel,
                        borderType=cv2.BORDER_CONSTANT,
                        borderValue=0)
    return mask_er > 0
# RUN ME
# bubble2maskeroded demo
bb_id = 5
plt.figure(figsize=(10,4))
plt.subplot(141)
plt.imshow(bubbles_gray[bb_id], cmap='gray')
plt.axis('on'); plt.title("Bubble");
plt.subplot(142)
plt.imshow(bubbles_gray[bb_id] > 0)
plt.axis('on'); plt.title("Mask");
plt.subplot(143)
plt.imshow(bubble2maskeroded(bubbles_gray[bb_id]))
plt.axis('on'); plt.title("Eroded mask");
plt.subplot(144)
plt.imshow((bubbles_gray[bb_id] > 0) & np.logical_not(bubble2maskeroded(bubbles_gray[bb_id])))
plt.axis('on'); plt.title("Removed border");
Complete the function below to build a complete Harris corner detector.
Tips:
np.nonzero, and try it separately.
np.argsort, to sort values.
# prof
# What follows is a breakdown of the different parts of the detection, given some Harris response map
# prof
# Masking
image_gray = bubbles_gray[bb_id]
harrisim = compute_harris_response(image_gray)
# 2. Masking
#find top corner candidates above a threshold
threshold=0.1
corner_threshold = harrisim.max() * threshold
harrisim_mask = harrisim > corner_threshold
# apply the mask to ignore the bubble contours and image borders
min_distance=10
harrisim_mask &= bubble2maskeroded(image_gray, border=min_distance)
# prof
plt.figure(figsize=(12,8))
plt.subplot(131)
plt.imshow(image_gray, cmap="gray")
plt.subplot(132)
plt.imshow(harrisim)
plt.subplot(133)
plt.imshow(harrisim_mask)
# prof
# responses overlaid on the image
display_image = np.dstack((image_gray, np.uint8(image_gray+np.abs(harrisim)/harrisim.max()*255), image_gray))
plt.imshow(display_image)
# prof
# illustration of the dilation used for NMS (non-maximal suppression):
# we propagate the value of a local maximum to its neighborhood,
# as we will later keep only the single maximal point in each of these neighborhoods
dil = cv2.dilate(harrisim, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (20, 20)))
# Pro tip: if we have several maxima with exactly the same value, we can add a small noise to make one of them win
plt.imshow(dil)
# prof
detect = (np.isclose(dil, harrisim) # keep only local maxima
& (harrisim > harrisim.min()+0.5*(harrisim.max()-harrisim.min())) # response threshold
& bubble2maskeroded(image_gray, border=min_distance)) # bg and border removal
plt.imshow(detect)
# prof
# this transposition is just a way to get the coordinates in the right format
filtered_coords = np.transpose(detect.nonzero())
filtered_coords[:10]
array([[ 33, 160], [ 55, 105], [ 55, 165], [ 55, 227], [ 61, 253], [ 72, 74], [ 79, 152], [ 93, 174], [106, 218], [109, 90]])
# prof
# and so we get these nice detections
plt.figure(figsize=(12,8))
plt.subplot(131)
plt.imshow(image_gray, cmap="gray")
plt.axis("off")
plt.title("Original")
plt.subplot(132)
plt.imshow(harrisim)
plt.axis("off")
plt.title("Harris response")
plt.subplot(133)
plt.imshow(image_gray, cmap='gray')
plt.plot(filtered_coords[:,1], filtered_coords[:,0], 'x', c='r')
plt.axis("off")
plt.title("Corners")
Text(0.5, 1.0, 'Corners')
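The dilation-based non-maximal suppression used in the breakdown above can be reproduced on a toy response map without OpenCV. The sketch below is an illustration under stated assumptions: `local_max_filter` is a hypothetical numpy-only stand-in for `cv2.dilate` that uses a square window instead of an elliptical structuring element, and the response values are made up.

```python
import numpy as np

def local_max_filter(resp: np.ndarray, radius: int) -> np.ndarray:
    """Grey-level dilation with a (2*radius+1) square window (numpy-only sketch)."""
    rows, cols = resp.shape
    padded = np.pad(resp, radius, mode="constant", constant_values=-np.inf)
    out = np.full(resp.shape, -np.inf)
    # Take the maximum over all shifted copies of the image
    for dr in range(2 * radius + 1):
        for dc in range(2 * radius + 1):
            out = np.maximum(out, padded[dr:dr + rows, dc:dc + cols])
    return out

# Toy response map: two strong peaks and one weaker neighbor
resp = np.zeros((7, 7))
resp[1, 1] = 5.0
resp[2, 2] = 3.0   # too close to the stronger peak at (1, 1): should be suppressed
resp[5, 4] = 4.0

dil = local_max_filter(resp, radius=2)

# A pixel is kept if it equals the maximum of its neighborhood and is strong enough
is_peak = np.isclose(dil, resp) & (resp > 0.5 * resp.max())

coords = np.transpose(is_peak.nonzero())   # (row, col) of each surviving pixel
values = resp[is_peak]
order = np.argsort(values)[::-1]           # np.argsort is ascending: reverse it
best = coords[order][:2]                   # keep the 2 strongest keypoints
print(best)
```

Note that `np.argsort` returns indices in increasing order of value, so the order has to be reversed before truncating to the strongest candidates.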
# TODO complete this function
def detect_harris_points(image_gray: np.array, max_keypoints: int=30,
                         min_distance: int=25, threshold: float=0.1) -> np.array:
    """
    Detects and returns a sorted list of coordinates for each corner keypoint detected in an image.
    Parameters
    ----------
    image_gray: np.array
        Input image
    max_keypoints: int, default=30
        Number of keypoints to return, at most (we may have fewer keypoints)
    min_distance: int, default=25
        Minimum distance between two keypoints
    threshold: float, default=0.1
        For each keypoint k_i, we ensure that its response h_i will verify
        $h_i > min(response) + threshold * (max(response) - min(response))$
    Returns
    -------
    corner_coord: np.array of shape (N, 2) and dtype int
        Array of corner keypoint 2D coordinates, with N <= max_keypoints
    """
    # 1. Compute Harris corner response
    harris_resp = compute_harris_response(image_gray)
    # 2. Filtering
    # 2.0 Mask init: all our filtering is performed using a mask
    detect_mask = np.ones(harris_resp.shape, dtype=bool)
    # 2.1 Background and border removal
    detect_mask &= bubble2maskeroded(image_gray, border=min_distance)
    # 2.2 Response threshold
    detect_mask &= True # FIXME <------------------------ # remove low response elements
    # 2.3 Non-maximal suppression
    # dil is an image where each local maximum value is propagated to its neighborhood (display it!)
    dil = cv2.dilate(harris_resp, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (min_distance, min_distance)))
    # we want to keep only elements which are local maxima in their neighborhood
    detect_mask &= True # FIXME <------------ # keep only local maxima by comparing dil and harris_resp
    # 3. Select, sort and filter candidates
    # get coordinates of candidates
    candidates_coords = np.transpose(detect_mask.nonzero())
    # ...and their values
    candidate_values = harris_resp[detect_mask]
    # sort candidates
    sorted_indices = None # FIXME <----------------------
    # keep only the best ones
    best_corners_coordinates = candidates_coords # FIXME <-----------------------
    return best_corners_coordinates
# prof
def detect_harris_points(image_gray: np.array, max_keypoints: int=30,
                         min_distance: int=25, threshold: float=0.1) -> np.array:
    """
    Detects and returns a sorted list of coordinates for each corner keypoint detected in an image.
    Parameters
    ----------
    image_gray: np.array
        Input image
    max_keypoints: int, default=30
        Number of keypoints to return, at most (we may have fewer keypoints)
    min_distance: int, default=25
        Minimum distance between two keypoints
    threshold: float, default=0.1
        For each keypoint k_i, we ensure that its response h_i will verify
        $h_i > min(response) + threshold * (max(response) - min(response))$
    Returns
    -------
    corner_coord: np.array of shape (N, 2) and dtype int
        Array of corner keypoint 2D coordinates, with N <= max_keypoints
    """
    # 1. Compute Harris corner response
    harris_resp = compute_harris_response(image_gray)
    # 2. Filtering
    # 2.0 Mask init: all our filtering is performed using a mask
    detect_mask = np.ones(harris_resp.shape, dtype=bool)
    # 2.1 Background and border removal
    detect_mask &= bubble2maskeroded(image_gray, border=min_distance)
    # 2.2 Response threshold
    detect_mask &= harris_resp > harris_resp.min()+threshold*(harris_resp.max()-harris_resp.min())
    # 2.3 Non-maximal suppression
    dil = cv2.dilate(harris_resp, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (min_distance, min_distance)))
    detect_mask &= np.isclose(dil, harris_resp) # keep only local maxima
    # 3. Select, sort and filter candidates
    # get coordinates of candidates
    candidates_coords = np.transpose(detect_mask.nonzero())
    # ...and their values
    candidate_values = harris_resp[detect_mask]
    # sort candidates by decreasing response (np.argsort sorts in increasing order)
    sorted_indices = np.argsort(candidate_values)[::-1]
    # keep only the best ones
    best_corners_coordinates = candidates_coords[sorted_indices][:max_keypoints]
    return best_corners_coordinates
# prof (old, pre-2022 version, without the elegant NMS – don't use it)
# def detect_harris_points(image_gray, min_distance=25, threshold=0.1):
# """ detect and return corners from a grayscale image
# min_distance is the minimum nbr of pixels separating
# corners and image boundary
# returns a list of (x,y) coordinates of corners.
# """
# # 1. Compute Harris corner response
# harrisim = compute_harris_response(image_gray)
# # 2. Masking
# #find top corner candidates above a threshold
# corner_threshold = harrisim.max() * threshold
# harrisim_mask = harrisim > corner_threshold
# # apply the mask to ignore the bubble contours and image borders
# harrisim_mask &= bubble2maskeroded(image_gray, border=min_distance)
# # filter the borders (useless thanks to previous line)
# # harrisim_mask[:min_distance,:] = False # Top border
# # harrisim_mask[-min_distance:,:] = False # Bottom border
# # harrisim_mask[:,:min_distance] = False # Left border
# # harrisim_mask[:,-min_distance:] = False # Right border
# # 3. Candidate pre-selection and sorting
# #get coordinates of candidates
# candidates = harrisim_mask.nonzero()
# coords = np.transpose(candidates)
# #...and their values
# candidate_values = harrisim[candidates]
# #sort candidates
# index = np.argsort(candidate_values)
# # 4. Max filter over neighborhood
# #select the best points taking min_distance into account
# filtered_coords = []
# for i in index:
# if harrisim_mask[coords[i][0]][coords[i][1]]:
# filtered_coords.append(coords[i])
# # update the mask to prevent neighbors from being selected
# harrisim_mask[max(0, coords[i][0]-min_distance):
# min(coords[i][0]+min_distance, harrisim_mask.shape[0]),
# max(0, coords[i][1]-min_distance):
# max(coords[i][1]+min_distance, harrisim_mask.shape[1])] = False
# return np.array(filtered_coords)
# (just run this cell)
# Display some bubbles and the detected keypoints
num_bb = 6
min_distance=25
plt.figure(figsize=(8,16))
for bb_id in range(num_bb):
    bb = bubbles[bb_id]
    bb_gray = bubbles_gray[bb_id]
    bb_h = compute_harris_response(bb_gray)
    filtered_coords = detect_harris_points(bb_gray,
                                           min_distance=min_distance)
    plt.subplot(num_bb,3,1+3*bb_id)
    plt.imshow(imbgr2rgb(bb))
    plt.axis("off")
    plt.title("$I$: bb_%d" % bb_id)
    plt.subplot(num_bb,3,2+3*bb_id)
    plt.imshow(bb_h)
    plt.axis("off")
    plt.title("Harris resp.")
    plt.subplot(num_bb,3,3+3*bb_id)
    plt.imshow(bb_gray, cmap='gray')
    plt.plot(filtered_coords[:,1], filtered_coords[:,0], 'x', c='r')
    plt.axis("off")
    plt.title("Corners")
This step is actually quite simple here: we will simply extract the raw BGR pixel values around a given keypoint and store them as a flat vector. This is the simplest descriptor we could use. It will let us reuse, in the next part, the image comparison technique we saw during session 1.
We will extract small patches as in the following example:
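As a minimal sketch of the idea (the toy image, keypoint positions and variable names below are made up for illustration), a patch of half-width `wid` is a plain array slice around the keypoint, flattened into a vector. Near the image border the slice is truncated, which is why patches with an unexpected size must be discarded.

```python
import numpy as np

# Toy "BGR image" of 6 rows x 8 columns x 3 channels (values are arbitrary)
image = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
wid = 1                                  # half-width: patches are (2*wid+1) pixels wide
expected_len = 3 * (2 * wid + 1) ** 2    # 27 values for a full 3x3x3 patch

# Keypoint far enough from the border: full patch
r, c = 2, 3
patch = image[r - wid:r + wid + 1, c - wid:c + wid + 1]
descriptor = patch.flatten()

# Keypoint on the image corner: the slice is clipped, so the patch is too small
r, c = 0, 0
clipped = image[max(0, r - wid):r + wid + 1, max(0, c - wid):c + wid + 1].flatten()

print(descriptor.shape, len(descriptor) == expected_len)
print(clipped.shape, len(clipped) == expected_len)
```

Clamping the start index with `max(0, ...)` matters: a negative start index would be interpreted by numpy as counting from the end of the axis.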
Complete the function below to compute a list of descriptors centered around each keypoint.
Tips:
# TODO complete the function below
def compute_descriptors(image_color, filtered_coords, wid=24):
    """
    For each point return pixel values around the point
    using a neighbourhood of width 2*wid+1.
    Discard any patch which does not have a shape of (2*wid+1, 2*wid+1, 3).
    return a list of descriptors (list of np.array).
    """
    desc = []
    for coords in filtered_coords:
        pass # FIXME
    return desc
# prof
def compute_descriptors(image_color, filtered_coords, wid=24):
    """
    For each point return pixel values around the point
    using a neighbourhood of width 2*wid+1.
    Discard any patch which does not have a shape of (2*wid+1, 2*wid+1, 3).
    return a list of descriptors (list of np.array).
    """
    desc = []
    for coords in filtered_coords:
        # rows are clipped to shape[0], columns to shape[1]
        patch = image_color[max(0, coords[0]-wid):
                            min(coords[0]+wid+1, image_color.shape[0]),
                            max(0, coords[1]-wid):
                            min(coords[1]+wid+1, image_color.shape[1])].flatten()
        if len(patch) == 3*(2*wid+1)**2:
            desc.append(patch)
    return desc
# Here is some display to check the results
bb_id = 3
num_descr = 5
patch_half_size = 12
kpts = detect_harris_points(bubbles_gray[bb_id], min_distance=patch_half_size+1)
desc = compute_descriptors(bubbles[bb_id], kpts, wid=patch_half_size)
plt.figure(figsize=(10,4))
plt.subplot(1,num_descr+1,1)
plt.imshow(bubbles_gray[bb_id], cmap='gray')
plt.plot(kpts[:,1], kpts[:,0], 'x', c='r')
plt.axis("off")
plt.title("Corners")
for ii in range(min(num_descr, len(desc))):
    plt.subplot(1,num_descr+1,2+ii)
    dside = int(np.sqrt(desc[ii].shape[0]/3)) # recover the patch size
    plt.imshow(cv2.cvtColor(desc[ii].reshape((dside, dside, 3)), cv2.COLOR_BGR2RGB))
    plt.plot([dside//2], [dside//2], '*', c='r')
    # plt.axis('off')
    plt.title("Descr. %d" % ii)
This will be useful for the next part, to accelerate the computation of the matching.
Pre-compute all the keypoints and descriptors for all bubbles.
Tips:
# TODO precompute keypoints and descriptors for all bubbles using `bubbles` and `bubbles_gray`
patch_half_size = 1 # FIXME
# list of list of keypoints for all bubbles: keypoints[i][j]: (np.array) keypoint j for bubbles_gray[i]
keypoints = []
# list of list of descriptors for all bubbles: descriptors[i][j]: (np.array) desc j for bubbles[i]
descriptors = []
# FIXME
print(len(keypoints), len(descriptors))
0 0
# prof
patch_half_size = 25
keypoints = []
descriptors = []
for bb_color, bb_gray in zip(bubbles, bubbles_gray):
    # kpts = detect_harris_points(bb_gray, min_distance=patch_half_size+1)
    kpts = detect_harris_points(bb_gray, min_distance=10, threshold=0.1)
    # desc = compute_descriptors(bb_color, kpts, wid=patch_half_size)
    desc = compute_descriptors(bb_color, kpts, wid=12)
    keypoints.append(kpts)
    descriptors.append(desc)
print(len(keypoints), len(descriptors))
391 391
# Sanity check: do we have bubbles without keypoints?
for ii, ki in enumerate(keypoints):
    if len(ki) == 0:
        plt.figure()
        plt.imshow(bubbles_gray[ii], cmap='gray')
        plt.axis('off')
        plt.title("b%03d.png" % (ii+1))
Save the pre-computed keypoints and descriptors: we will use them in the next part.
Tips:
np.savez_compressed, or any similar function to save objects.
# save keypoints and descriptors somewhere
# TODO adapt this code to your actual parameters and variable names
# NOTE: we want to keep lists as we have unequal numbers of keypoints and descriptors for each bubble
# FIXME (Joseph): using joblib would be much better here
np.savez_compressed("results/kpts_descr_harris_25pxcolor_mdist10.npz",
descriptors=descriptors,
keypoints=keypoints)
# Check output
!ls -lh results/kpts_descr_harris_25pxcolor_mdist10.npz
/home/jchazalo/.virtualenvs/mlrf21-py3.8/lib/python3.8/site-packages/numpy/core/_asarray.py:171: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray. return array(a, dtype, copy=False, order=order, subok=True)
-rw-r--r-- 1 jchazalo jchazalo 14M avril 5 12:13 results/kpts_descr_harris_25pxcolor_mdist10.npz
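The deprecation warning above comes from passing ragged lists directly to NumPy. One possible way around it, sketched below under assumptions (the helper name `to_object_array`, the toy keypoints and the temporary file name are made up for illustration), is to wrap the lists explicitly in 1D arrays of dtype `object` before saving, and to pass `allow_pickle=True` when loading.

```python
import os
import tempfile
import numpy as np

# Toy data: two bubbles with different numbers of keypoints (ragged list)
keypoints = [np.array([[10, 12], [30, 7]]), np.array([[5, 5]])]

def to_object_array(items):
    """Wrap a list of arrays in a 1D object array, one element per item."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out

path = os.path.join(tempfile.mkdtemp(), "kpts_demo.npz")
np.savez_compressed(path, keypoints=to_object_array(keypoints))

# Object arrays are stored with pickle, so loading requires allow_pickle=True
loaded = np.load(path, allow_pickle=True)["keypoints"]
restored = list(loaded)
print(len(restored), [k.shape for k in restored])
```

Because object arrays rely on pickle, such files should only be loaded when they come from a trusted source.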
Now you're ready to move on to the next stage: Match keypoints and solve Twin it!.