EPITA 2023 MLRF practice_02-02_twinit-part2-matching v2023-05-24_134912 by Joseph CHAZALON

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Practice 02 part 02: Match keypoints and solve Twin it!

In this part we will reuse pre-computed elements from the previous parts:

  • the distance matrix between bubbles computed from color histograms;
  • keypoints and descriptors for each bubble.

The idea is to select pairs of bubbles which are close according to the color histogram, then to compare the descriptors extracted from each of them. Based on the number of near-identical matches, we will return a much more compact list of twin candidates.

This last part is decomposed into 3 steps:

  1. Prepare a matching framework to compare sets of descriptors.
  2. Reload all the pre-computed elements from the previous parts.
  3. Solve Twin it!.

1. Prepare a matching framework

Given two lists of descriptors, $D_1$ and $D_2$ (which are actually flattened color image patches), we want to identify the matching pairs.

Instead of using a distance like the sum of squared differences, we will use a scoring approach: the higher the score, the better the match.

This will be performed in three steps:

  1. Find for each element $d_i \in D_1$ its best match $\hat{d_j} \in D_2$, i.e. build the set
$$ \{ (d_i, \hat{d_j}) \mid \hat{d_j} = \underset{d_j \in D_2}{\mathrm{argmax}} \operatorname{score}(d_i, d_j) \}, $$
with the constraint that the matching score of two elements is above a minimal threshold: $$\operatorname{score}(d_i, d_j) > T.$$ In practice we only store the indices of the matching pairs.

  2. Perform the reverse operation: find for each element $d_j \in D_2$ its best match $\hat{d_i} \in D_1$.

  3. Keep only the matches which "agree", i.e. pairs that are in both sets.

No need to use large descriptors to test this step: our descriptors are simply 1-dimensional NumPy arrays, so you can check your functions on very small toy vectors (as in the sketch below) before running them on the real descriptors.
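
For instance, once the match() and match_twosided() functions below are implemented, a toy sanity check could look like this (the values are only illustrative):

desc_a = [np.array([1., 2., 3., 4.]), np.array([4., 1., 1., 4.])]
desc_b = [np.array([8., 2., 2., 8.]), np.array([10., 20., 30., 40.])]
# desc_a[0] is an affine transform of desc_b[1], and desc_a[1] of desc_b[0],
# so we expect the index array [1 0] in both directions.
print(match(desc_a, desc_b, threshold=0.5))            # expected: [1 0]
print(match_twosided(desc_a, desc_b, threshold=0.5))   # expected: [1 0]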

In [1]:
# deactivate buggy jupyter completion
%config Completer.use_jedi = False
In [2]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
import os
In [3]:
# TODO
PATH_TO_RESOURCES = "."  # FIXME set this to the path of the twinit resource directory
In [4]:
# prof
PATH_TO_RESOURCES = "/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it"

Normalized cross correlation

The scoring method we will use to compare the descriptors (color image patches) is the normalized cross correlation, given by the following formula, where $d_i$ and $d_j$ are two descriptors of the same size (3*patchside**2): $$ \operatorname{ncc}(d_i, d_j) = \frac{1}{|d_i|} \sum{ \frac{d_i - \bar{d_i}}{\sigma_{d_i} + \epsilon} \times \frac{d_j - \bar{d_j}}{\sigma_{d_j} + \epsilon} } $$ where:

  • $|d_i| = |d_j|$ is the length of the descriptor;
  • $\sum$ is the sum of the components of a vector;
  • $\times$ is the component-wise product of two vectors;
  • $\bar{d_i}$ is the mean value of $d_i$;
  • $\sigma_{d_i}$ is the standard deviation of $d_i$;
  • $\epsilon$ is a very small value ($\ll 1$) to avoid instability when $\sigma_{d_i} = 0$ (this may happen with degenerate patches containing constant values).

This simply compares vectors after shifting their values to be centered around $0$ and scaling them to unit variance.

The result is close to $1$ for vectors which are highly collinear.
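
Here is a tiny self-contained illustration of this invariance (the helper _ncc_sketch is a throwaway name, not the function you will implement below):

def _ncc_sketch(a, b, eps=1e-6):
    a_ = (a - a.mean()) / (a.std() + eps)
    b_ = (b - b.mean()) / (b.std() + eps)
    return np.sum(a_ * b_) / len(a)

v = np.array([1., 4., 2., 8.])
print(_ncc_sketch(v, 3. * v + 10.))   # close to 1: same pattern, different scale and offset
print(_ncc_sketch(v, -v))             # close to -1: anti-correlated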

work

Complete the function below to compute a normalized cross correlation between descriptors.

Tip: Check the NumPy documentation to find useful operations on arrays such as array.mean() or np.sum().

In [5]:
# TODO complete this function
def ncc(v1, v2, epsilon=10e-6):
    '''Computes the normalized cross correlation between two vectors.'''
    n = len(v1)
    if n != len(v2):
        raise ValueError("v1 and v2 must have the same len."
                         "I got len(v1)=%d and len(v2)=%d" % (n, len(v2)))
    ncc_value = -1.  # FIXME
    return ncc_value
In [6]:
# prof
def ncc(v1, v2, epsilon=10e-6):
    '''Computes the normalized cross correlation between two vectors.'''
    n = len(v1)
    if n != len(v2):
        raise ValueError("v1 and v2 must have the same len."
                         "I got len(v1)=%d and len(v2)=%d" % (n, len(v2)))
    v1_ = (v1 - v1.mean()) / (v1.std() + epsilon)
    v2_ = (v2 - v2.mean()) / (v2.std() + epsilon)
    ncc_value = np.sum(v1_ * v2_) / n
    return ncc_value
In [7]:
# RUN ME
# Some tests to help you
print(ncc(np.arange(10), np.arange(10,20)))
print(ncc(np.arange(10), np.arange(20,10,-1)))
# Should print
# 0.9999930369301252
# -0.9999930369301252
0.9999930369301252
-0.9999930369301252
work

Complete the functions below to compute matches between descriptors.

Tips:

  • Both functions return, for each descriptor in desc1, the index of its best match in desc2, or -1 if no suitable match is found.
  • At first, test the matching logic without plugging in your normalized cross correlation.
  • np.argsort(a) gives the indices which sort a.
  • np.nonzero(bool_array) gives the indices where bool_array is True. (Both helpers are illustrated right below.)
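
Here is a tiny illustration of these two helpers (arbitrary values):

scores = np.array([0.3, 0.9, 0.1])
print(np.argsort(scores))            # [2 0 1]: indices that would sort the array
print(np.nonzero(scores > 0.2)[0])   # [0 1]: indices where the condition holds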
In [8]:
# TODO complete this function
def match(desc1, desc2, threshold=0.5):
    """ For each descriptor in the first set, 
        select its best match in the second set
        using normalized cross correlation.
        
        Warning: desc1 and desc2 are two lists of 1D numpy arrays.
        --
        Returns a list of the same size as desc1
        where elements are either an index into desc2
        or -1 otherwise.
        """
    
    if len(desc1) == 0:
        return np.array([])
    if len(desc2) == 0:
        return np.full(len(desc1), -1)
    
    bestmatches = np.full(len(desc1), -1)  # FIXME
    
    return bestmatches
In [9]:
# TODO complete this function
def match_twosided(desc1, desc2, threshold=0.5):
    """ Two-sided symmetric version of match().
        --
        Returns a list of the same size as desc1
        where elements are either an index into desc2
        when a symmetric match is verified,
        or -1 otherwise.
    """
    # Compute the matches
    # FIXME
    # matches_12 = ...
    # matches_21 = ...
    
    # remove matches that are not symmetric
    # FIXME
    
    return np.full(len(desc1), -1)  # FIXME
In [10]:
# prof
def match(desc1, desc2, threshold=0.5):
    """ For each descriptor in the first set, 
        select its best match in the second set
        using normalized cross correlation.
        --
        Returns a list of the same size as desc1
        where elements are either an index into desc2
        or -1 otherwise.
        """
    if len(desc1) == 0:
        return np.array([])
    if len(desc2) == 0:
        return np.full(len(desc1), -1)
    
    bestmatches = []
    for i in range(len(desc1)):
        best_j = -1
        best_val = -1
        for j in range(len(desc2)):
            ncc_value = ncc(desc1[i], desc2[j])
            if ncc_value > threshold and ncc_value > best_val:
                best_j = j
                best_val = ncc_value
        bestmatches.append(best_j)
        
    return np.array(bestmatches)

def match_twosided(desc1, desc2, threshold=0.5):
    """ Two-sided symmetric version of match().
        --
        Returns a list of the same size as desc1
        where elements are either an index into desc2
        when a symmetric match is verified,
        or -1 otherwise.
    """
    
    matches_12 = match(desc1,desc2,threshold)
    matches_21 = match(desc2,desc1,threshold)
    
    # indices of the elements which are actual matches
    ndx_12 = np.nonzero(matches_12 >= 0)[0]
    
    # remove matches that are not symmetric
    for n in ndx_12:
        if matches_21[matches_12[n]] != n:
            matches_12[n] = -1
    
    return matches_12
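
Note that the double Python loop above is easy to read but slow (the full scan in section 3 takes several minutes). As an optional side note, not part of the original exercise, the same matching can be vectorized with NumPy, assuming all descriptors in each list have the same length (which holds for our flattened patches):

def match_vectorized(desc1, desc2, threshold=0.5, epsilon=1e-6):
    """Sketch of a vectorized match(): computes all pairwise NCC scores at once."""
    if len(desc1) == 0:
        return np.array([])
    if len(desc2) == 0:
        return np.full(len(desc1), -1)
    D1 = np.stack(desc1).astype(float)
    D2 = np.stack(desc2).astype(float)
    # standardize each descriptor (row-wise)
    D1 = (D1 - D1.mean(axis=1, keepdims=True)) / (D1.std(axis=1, keepdims=True) + epsilon)
    D2 = (D2 - D2.mean(axis=1, keepdims=True)) / (D2.std(axis=1, keepdims=True) + epsilon)
    scores = D1 @ D2.T / D1.shape[1]     # pairwise NCC values, shape (len(desc1), len(desc2))
    best = np.argmax(scores, axis=1)
    best_scores = scores[np.arange(len(desc1)), best]
    best[best_scores <= threshold] = -1  # same acceptance rule as the loop version
    return best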
In [11]:
# prof
# the version with the normalized sum of squared differences
# bad results

# def match(desc1, desc2, threshold=0.5):
#     """ For each descriptor in the first set, 
#         select its best match in the second set
#         using normalized sum of squared differences.
#         --
#         Returns a list of the same size as desc1
#         where elements are either an indice from descr2
#         or -1 otherwise.
#         """
#     if len(desc1) == 0:
#         return np.array([])
#     if len(desc2) == 0:
#         return np.full(len(desc1), -1)
    
#     bestmatches = []
#     for i in range(len(desc1)):
#         best_j = -1
#         best_val = 1
#         for j in range(len(desc2)):
#             # norm. ssd
#             dist = (np.sum((desc1[i] - desc2[j]))**2 
#                     / np.sqrt((np.sum(desc1[i])**2) * (np.sum(desc2[j])**2)))
#             if dist < threshold and dist < best_val:
#                 best_j = j
#                 best_val = dist
#         bestmatches.append(best_j)
        
#     return np.array(bestmatches)

2. Reload everything and match some bubbles

We are now ready to match descriptors for some bubbles!

We will compare some bubbles using the descriptors we previously computed.

work

Reload everything we need to match some bubbles, and solve the problem.

We need:

  • bubble images (color and grayscale),
  • the distance matrix between bubbles computed using color histograms,
  • the keypoint coordinates and descriptors we computed previously.
In [12]:
# load everything we need
# TODO adapt this code if you want to use other values

bubble_files = !ls $PATH_TO_RESOURCES/bubbles_200dpi/b*.png | sort
print(bubble_files[:3])

bubbles = [cv2.imread(ff) for ff in bubble_files]
bubbles_gray = [cv2.cvtColor(bb, cv2.COLOR_BGR2GRAY) for bb in bubbles]

# previously computed distance matrix
npdata = np.load(PATH_TO_RESOURCES + "/bubble_dist_mat_rgb7-cosine.npz", allow_pickle=True)
dist_mat = npdata["dist_mat"]

# previously computed keypoints and descriptors
npdata = np.load(PATH_TO_RESOURCES + "/kpts_descr_harris_25pxcolor_mdist10.npz", allow_pickle=True)
keypoints = npdata["keypoints"]
descriptors = npdata["descriptors"]

del npdata
len(bubbles), len(bubbles_gray), dist_mat.shape, len(keypoints), len(descriptors)
['/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b001.png', '/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b002.png', '/home/jchazalo/git/jchazalo/cours-mlrf-preparation/resources/twin_it/bubbles_200dpi/b003.png']
Out[12]:
(391, 391, (391, 391), 391, 391)
work

Using the display function provided below, compute and display some matches between a couple of bubbles.

Tips:

  • Bubbles with indices 35 and 219 are good candidates. So are bubbles 49 and 278.
  • Try to find a good value for the threshold.
In [13]:
# Display functions
def appendimages(im1, im2):
    """ Return a new image that appends the two images side-by-side. """
    # find the image with the fewest rows and pad it with enough empty rows
    rows1 = im1.shape[0]    
    rows2 = im2.shape[0]
    if rows1 < rows2:
        im1 = np.concatenate((im1, np.zeros((rows2-rows1,im1.shape[1]))),axis=0)
    elif rows1 > rows2:
        im2 = np.concatenate((im2, np.zeros((rows1-rows2,im2.shape[1]))),axis=0)
    # otherwise both images already have the same number of rows; no padding needed.
    return np.concatenate((im1,im2), axis=1)

def plot_matches(im_gray1, im_gray2, locs1, locs2, matches, show_below=True):
    """ Show a figure with lines joining the accepted matches 
        input: im_gray1,im_gray2 (images as arrays),
        locs1,locs2 (feature locations, aka keypoints), 
        matches (as output from 'match()'), 
        show_below (if images should be shown below matches). """
    if im_gray1.ndim != 2 or im_gray2.ndim != 2:
        raise ValueError("plot_matches takes gray images (ndim == 2) as arguments."
                         " I got im_gray1.ndim = %d and im_gray2.ndim = %d" 
                         % (im_gray1.ndim, im_gray2.ndim))
    im3 = appendimages(im_gray1, im_gray2)
    if show_below:
        im3 = np.vstack((im3,im3))
    plt.figure()
    plt.imshow(im3, cmap='gray')
    cols1 = im_gray1.shape[1]
    for i,m in enumerate(matches):
        if m >= 0:
            plt.plot([locs1[i][1],locs2[m][1]+cols1],[locs1[i][0],locs2[m][0]],'r')
    plt.axis('off')
    plt.show()
In [15]:
# Run me!
thres = 0.9
def compute_plot_matches(bid1, bid2, sym=False):
    match_fun = match_twosided if sym else match
    print("%d %s %d" % (bid1, "⇔" if sym else "→", bid2))
    matches = match_fun(descriptors[bid1], descriptors[bid2], threshold=thres)
    print('%d matches / %d descr.' % (np.count_nonzero(matches >= 0), len(matches)))
    plot_matches(bubbles_gray[bid1], bubbles_gray[bid2],
                 keypoints[bid1], keypoints[bid2],
                 matches)

compute_plot_matches(35, 219)
compute_plot_matches(219, 35)
compute_plot_matches(35, 219, True)
compute_plot_matches(35, 36)
compute_plot_matches(36, 35)
compute_plot_matches(35, 36, True)
compute_plot_matches(49, 278)
compute_plot_matches(278, 49)
compute_plot_matches(49, 278, True)
35 → 219
22 matches / 30 descr.
219 → 35
19 matches / 30 descr.
35 ⇔ 219
10 matches / 30 descr.
35 → 36
0 matches / 30 descr.
36 → 35
0 matches / 30 descr.
35 ⇔ 36
0 matches / 30 descr.
49 → 278
5 matches / 30 descr.
278 → 49
5 matches / 30 descr.
49 ⇔ 278
4 matches / 30 descr.
work

Write down some observations about the previous matchings. What are the limitations of the matching approach we implemented?

TODO

(PROF) There are at least 2 limitations:

  1. We need a threshold (this is hard to get rid of).
  2. The symmetric match filters out too many keypoints: because of the repetitive texture, the second, third, etc. best matches may be acceptable too, but only the best one is kept. It would be better to (see the sketch after this list):
  • look for a maximal coupling,
  • check for geometrical consistency.
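
For illustration only (this goes beyond the exercise; the function name geometric_filter and the (row, col) keypoint layout are assumptions based on the plot_matches() code above), a geometric-consistency check could fit a homography with RANSAC on the matched coordinates and discard the outliers:

def geometric_filter(locs1, locs2, matches, ransac_thresh=5.0):
    """Keep only matches consistent with a single homography estimated with RANSAC."""
    idx1 = np.nonzero(matches >= 0)[0]
    if len(idx1) < 4:                    # cv2.findHomography needs at least 4 pairs
        return matches
    pts1 = np.float32([locs1[i][::-1] for i in idx1])           # (row, col) -> (x, y)
    pts2 = np.float32([locs2[matches[i]][::-1] for i in idx1])
    H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransac_thresh)
    filtered = np.full_like(matches, -1)
    if mask is not None:
        inliers = idx1[mask.ravel().astype(bool)]
        filtered[inliers] = matches[inliers]
    return filtered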

3. Solve Twin it!

At last we can try to filter bubbles more efficiently.

We will first pre-select the bubbles using the distance matrix computed from color histograms, then further filter this pre-selection using descriptor matching. Finally, we will count the number of matches to select the best candidates.

work

Try to display only bubbles with twins. (Try to minimize the amount of human control.)

Tips:

  • For each bubble, display best candidates (if any).
  • Keep only a few (5 or so) candidates using the distance matrix computed on color histograms (see the pre-selection sketch right after these tips).
  • Use a restrictive threshold for descriptor matching (correlation > $0.9$).
  • Use the count of matches to make a decision.
  • Here are a few bubble ids to check if you do not have the time to run all the computation: [0, 1, 35, 36, 43, 44, 49, 50, 91, 92, 105, 106].
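
A minimal sketch of the pre-selection step mentioned in the tips (note that, depending on how dist_mat was built in part 1, each bubble may appear as its own nearest candidate and should then be skipped):

max_res = 5
# for each bubble, the indices of the max_res closest bubbles according to color histograms
candidates = np.argsort(dist_mat, axis=1)[:, :max_res]   # shape: (n_bubbles, max_res)
print(candidates[35])   # candidate twins for bubble 35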
In [16]:
# TODO solve twin it!
In [17]:
%%time
# prof
def imshow_raw(imlist, columns=5):
    plt.figure(figsize=(10,10))
    for ii, image in enumerate(imlist):
        plt.subplot((len(imlist) + columns - 1) // columns, columns, ii+1)
        plt.imshow(image)
    plt.show()

def imshow_bgr(imlist, columns=5):
    return imshow_raw([cv2.cvtColor(image, cv2.COLOR_BGR2RGB) for image in imlist], columns)

max_res = 5
thres = 0.9
# sort the distance matrix to get best candidates
best_matches_idx_perrow = np.argsort(dist_mat, axis=1)
# iterate over bubbles
# for ii in [0, 1, 35, 36, 43, 44, 49, 50, 91, 92, 105, 106]: 
for ii in range(len(bubbles)):
    str_to_print = "%d (%d kpts) - "%(ii, len(keypoints[ii]))
    bb_to_display = [bubbles[ii]]
    
    candidates_id = best_matches_idx_perrow[ii, 0:max_res]
    
    # Symmetric version
    matches = [match_twosided(descriptors[ii], descriptors[jj], threshold=thres) 
               for jj in candidates_id]
    mcounts = np.array([np.count_nonzero(np.array(m) != -1) for m in matches])
    mtotal = np.array([len(m) for m in matches])
    score = np.nan_to_num(mcounts / mtotal)
    
    # Display the best matches in descending order
    order = np.argsort(-score)
#     order = np.argsort(-mcounts)
    for jj in range(max_res):
        bb_idx = candidates_id[order[jj]]
        mc = mcounts[order[jj]]
        mt = mtotal[order[jj]]
        if mc > 0:
            bb_to_display.append(bubbles[bb_idx])
            str_to_print += "%i:%d/%d " % (bb_idx, mc, mt)
       
    if len(bb_to_display) > 1:
        print(str_to_print)
        imshow_bgr(bb_to_display, columns=max_res+1)
2 (29 kpts) - 369:1/29 
8 (8 kpts) - 193:1/8 
17 (30 kpts) - 215:1/30 83:1/30 
35 (30 kpts) - 219:10/30 
43 (22 kpts) - 347:8/12 
45 (30 kpts) - 73:1/30 
48 (30 kpts) - 232:1/29 
49 (30 kpts) - 278:4/30 
51 (21 kpts) - 301:1/21 
58 (12 kpts) - 117:1/12 
60 (30 kpts) - 215:1/30 
69 (30 kpts) - 332:1/24 
73 (28 kpts) - 45:1/14 
79 (30 kpts) - 324:5/30 300:3/30 169:1/30 
83 (30 kpts) - 87:3/30 17:1/30 167:1/30 
85 (30 kpts) - 215:2/30 
87 (30 kpts) - 83:3/30 
91 (23 kpts) - 230:15/23 
92 (30 kpts) - 388:2/29 
105 (17 kpts) - 229:14/17 164:1/17 
117 (28 kpts) - 58:1/28 
153 (30 kpts) - 294:1/18 
155 (30 kpts) - 190:1/27 
164 (10 kpts) - 229:1/10 105:1/10 
167 (30 kpts) - 83:1/30 
169 (30 kpts) - 324:2/28 79:1/28 
172 (30 kpts) - 339:9/30 
176 (30 kpts) - 229:1/25 
187 (10 kpts) - 212:3/10 190:3/10 
190 (30 kpts) - 187:3/30 212:1/30 155:1/30 
191 (30 kpts) - 300:1/23 
193 (30 kpts) - 8:1/30 
198 (15 kpts) - 322:1/15 
212 (30 kpts) - 187:3/29 190:1/29 
215 (30 kpts) - 85:2/30 60:1/30 17:1/30 
219 (30 kpts) - 35:10/30 
<timed exec>:29: RuntimeWarning: invalid value encountered in true_divide
229 (16 kpts) - 105:14/16 164:1/16 
230 (25 kpts) - 91:15/25 344:1/25 
232 (15 kpts) - 48:1/15 
278 (30 kpts) - 49:4/30 
279 (30 kpts) - 300:2/27 290:2/27 
280 (30 kpts) - 215:7/30 290:1/30 
290 (30 kpts) - 280:1/30 
294 (11 kpts) - 153:1/11 
299 (30 kpts) - 339:1/24 
300 (30 kpts) - 324:3/30 191:1/30 
301 (30 kpts) - 51:1/30 
312 (29 kpts) - 51:1/27 
324 (30 kpts) - 79:5/24 300:3/24 169:2/24 
332 (30 kpts) - 69:1/30 
339 (30 kpts) - 172:9/30 299:1/30 
344 (30 kpts) - 45:3/30 230:1/30 91:1/30 
369 (10 kpts) - 2:1/10 
388 (30 kpts) - 92:2/22 
CPU times: user 4min 46s, sys: 98.7 ms, total: 4min 46s
Wall time: 4min 46s

Job done!

You have completed session 2.

You should write down some observations below, such as which parameters we tuned and how.

work

Write some observations below.

Tips:

  • What are the parameters we tuned?
  • Are there other parameters in our method?

TODO write some observations.

We can play with

  • the size of the descriptors
  • their spacing
  • the matching strategy
  • the matching threshold
  • the number of candidates we keep based on color histogram comparison

and probably many other things.

Note that, in the general case, it is hard to recover all the twins without also recovering some noise. In the result above we get an approximate precision of 30%, but a recall of 100%.

Using the color histogram to pre-filter results really speeds things up!