EPITA 2021 MLRF practice_01-04_twinit-part1 v2021-05-17_160644 by Joseph CHAZALON

Creative Commons License This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

Practice session 1 part 4: First steps with Twin it!

Make sure you read and understand everything, and complete all the required actions. Required actions are preceded by the following sign: Back to work!

Perform a couple checks…

Import all the modules we are going to use.

1. About Twin it!

Twin it! is a poster game with many "bubbles". They are all different, except for a few pairs. The goal is to find the pairs.

All artwork is copyrighted by the original author, Thomas Vuarchex.

We won't tell you, at first, how many bubbles there are nor how many matching pairs there are.

Here is a downsampled (compressed — don't use it) version of the original poster. Twin it!

2. Simple template matching

To get started, we will use the simplest available solution:

  1. manually select a pattern (the template),
  2. look for similar patterns within the image,
  3. display the matching areas.

2.1. Load the image

But first, you need the original image. It is available in the twin_it.tar.gz file, at the following path: "twin_it/twin_it_200dpi.png". We also provided a downsampled version: "twin_it/twin_it_50dpi.png".

You may want to resize your base image using `cv2.resize()`.

work **Read the image of your choice and display it.**
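Here is a minimal sketch of what this could look like; the chosen file and the 0.5 scale factor are only assumptions, adapt them to your needs:

import cv2
from matplotlib import pyplot as plt

# Load the poster (OpenCV reads images in BGR order).
img = cv2.imread('twin_it/twin_it_200dpi.png')
# Optional: downscale by an assumed factor of 0.5 to speed up the matching.
img = cv2.resize(img, None, fx=0.5, fy=0.5)
# Convert BGR to RGB before displaying with matplotlib.
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()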

2.2. Select a template

You can manually select a template using NumPy slicing, as you already know.

work **Select a pattern which looks like the following image.** **Display it to check what you have just selected.** template *Tip: You may want to crop a part **inside** the bubble for better results.*
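A possible sketch, assuming the `img` loaded above; the slice coordinates are purely hypothetical and must be adjusted until the crop matches the target bubble:

# Hypothetical coordinates (rows first, then columns); tune them interactively.
template = img[300:380, 200:280]
plt.imshow(cv2.cvtColor(template, cv2.COLOR_BGR2RGB))
plt.show()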

2.3. Template matching

We will look for areas in the image which look like the current template. As we previously said, we will use a very basic technique based on a simple difference computation: for each position in the base image, we compute how much the template differs from the underlying image region.

We will use OpenCV's matchTemplate() function because it provides more variants than scikit-image's equivalent match_template() (which has better padding management, though).

You may want to have a look at the corresponding OpenCV and scikit-image template matching tutorials.

The available methods for OpenCV's matchTemplate() are ($R$ is the result image, $I$ is the base image, $T$ is the template, and $(r,c)$ are coordinates):

  - TM_SQDIFF: $R(r,c) = \sum_{r',c'} \left( T(r',c') - I(r+r',c+c') \right)^2$
  - TM_SQDIFF_NORMED: $R(r,c) = \dfrac{\sum_{r',c'} \left( T(r',c') - I(r+r',c+c') \right)^2}{\sqrt{\sum_{r',c'} T(r',c')^2 \cdot \sum_{r',c'} I(r+r',c+c')^2}}$
  - TM_CCORR: $R(r,c) = \sum_{r',c'} T(r',c') \cdot I(r+r',c+c')$
  - TM_CCORR_NORMED: $R(r,c) = \dfrac{\sum_{r',c'} T(r',c') \cdot I(r+r',c+c')}{\sqrt{\sum_{r',c'} T(r',c')^2 \cdot \sum_{r',c'} I(r+r',c+c')^2}}$
  - TM_CCOEFF: like TM_CCORR, but $T$ and each image window are first centered by subtracting their respective mean values
  - TM_CCOEFF_NORMED: the normalized version of TM_CCOEFF

We will briefly explain those equations during the session.

For now, you just have to note two points.

  1. The resulting image is slightly smaller than the base image: its shape is $(H-h+1, W-w+1)$ for a base image of shape $(H, W)$ and a template of shape $(h, w)$. The value at each pixel indicates how well the template matches the portion of the image covered when we "overlay" the template by aligning its top-left corner with the current pixel.
  2. For the squared difference methods, a lower value indicates a better match, whereas for the correlation-based ones a higher value indicates a better match.

Let us now compare these methods.

work **Compute and display the matching "maps" between the base image and the previously selected template, using each of those methods.**
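A minimal sketch iterating over the six methods, assuming the `img` and `template` variables from the previous steps:

# Compare the six matching methods on the same (image, template) pair.
methods = ['TM_SQDIFF', 'TM_SQDIFF_NORMED', 'TM_CCORR',
           'TM_CCORR_NORMED', 'TM_CCOEFF', 'TM_CCOEFF_NORMED']
plt.figure(figsize=(12, 6))
for i, name in enumerate(methods):
    result = cv2.matchTemplate(img, template, getattr(cv2, name))
    plt.subplot(2, 3, i + 1)
    plt.imshow(result, cmap='gray')
    plt.title(name)
plt.show()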

2.4. Display the best matches

We now need to locate the area with the best match. We can use two methods to get the coordinates of such a pixel: OpenCV's `cv2.minMaxLoc()`, or NumPy's `np.argmin()`/`np.argmax()` combined with `np.unravel_index()`.

We will need to remove the region around the original patch to avoid finding the exact same patch!

work **For each method display the matching area (cropped) in the original image and its position in the corresponding map, with and without removing the region around the original patch in the image.**
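A sketch using TM_SQDIFF_NORMED (lower is better) and cv2.minMaxLoc(); the coordinates used to black out the original patch are the hypothetical ones used above when cropping the template:

h, w = template.shape[:2]

def best_match(image, template):
    # Locate the lowest value in the squared-difference map (best match).
    result = cv2.matchTemplate(image, template, cv2.TM_SQDIFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
    return min_loc  # (x, y) of the top-left corner of the best match

x, y = best_match(img, template)
plt.imshow(cv2.cvtColor(img[y:y+h, x:x+w], cv2.COLOR_BGR2RGB))
plt.show()

# Black out the original patch before searching for another occurrence.
img_masked = img.copy()
img_masked[300:380, 200:280] = 0  # hypothetical template coordinates
x, y = best_match(img_masked, template)
plt.imshow(cv2.cvtColor(img_masked[y:y+h, x:x+w], cv2.COLOR_BGR2RGB))
plt.show()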

3. Time to step back

OK, we have started to work on something. We are at the point where we may have ideas about things to try: use grayscale images instead of color images, iterate over the best matches to look for a relevant match, suppress local maxima close to another local maximum…

But wait.

First, let us "save" where we are.

work **Write down some notes summarizing your first observations: What have you done? What results did you expect? Did the results you obtained match your expectations?** *Note: It is very important to force yourself to keep a journal of your experiments.*

Write some notes here.

(prof)

We tried several pattern matching techniques. We manually selected a pattern and checked whether those methods were capable of finding it in the image. Except for the CCORR method, they all are.

If we remove the area around the patch in the original image, replacing it with black pixels, and then look for similar patterns, we obtain an approximate location of another similar patch.

Criticize the approach

Now that we have kept track of what we have done, we can try to think a bit more about what we are actually doing.

work **Write down some criticisms of our approach: How confident can we be about the results we obtained? Can we draw some solid conclusions about the performance of the methods we are comparing?** *Note: The worst bias in Machine Learning is the designer's optimism. We all want our method to perform well and would be tempted to stop at the first promising results, reporting good news to a customer: "It works!". This is a terrible trap.*

Write some criticisms here.

(prof)

While it was fun to start trials right away, we do not have a sound experimental setup: the task is not defined in a testable way, we have no ground truth to check our results against, and we have no evaluation metric to compare the methods objectively.

PS: for the CCORR approach, we should also subtract the template's mean value from the template itself, to avoid matching areas with higher intensities in the image.

4. A sound(er) approach

We will now build an experimental setup that will be useful for the next session.

4.1. Experiment-driven research

This is what you should always do first to perform experiment-driven research, much like test-driven development:

  1. define the task in a testable way,
  2. ensure you have some data and associated ground truth,
  3. implement the evaluation framework.

To complete this session, we will now start over and make sure we have all the necessary pre-requisites to perform sound experiments.

To facilitate your work, we already segmented all the bubbles and gave them an id. If we have time, we'll discuss how we did that and/or show you the code (in two words: simple thresholding and two connected component labelings). You can now use images like these:

b002 b003 b004 b005 b006

All the files are located here: twin_it/bubbles_200dpi/bNNN.png where NNN is a zero-padded number between 001 and 391.

We will also help you by telling you that those two bubbles (b044 and b092) have twins (there are more twins though!):

b044 b092

and that those two bubbles (b001 and b096) do not have twins:

b001 b096

You will now try to match pairs of isolated bubbles. This removes the risk of matching in-between content and allows for a precise and simple evaluation: for a given test bubble, we should know which bubble is its twin, if there is any.

4.2. Image borders and padding

You will need a little trick to be able to perform pattern matching on images which have approximately the same size: you will have to pad the base image (one of the two bubbles being matched) in order to cope with possible texture translations. You can use the cv2.copyMakeBorder() function. The goal is to ensure that the base image has borders large enough to contain the largest patterns (both horizontally and vertically).

As the image border is an important concept, we will illustrate the possible border values before going further. This section is copied from the original OpenCV documentation.

The function cv2.copyMakeBorder() takes the following arguments:

  - src: the input image
  - top, bottom, left, right: the border widths, in pixels, in the corresponding directions
  - borderType: a flag defining which kind of border to add; it can be cv2.BORDER_CONSTANT (constant colored border), cv2.BORDER_REFLECT (mirrored border: fedcba|abcdefgh|hgfedcb), cv2.BORDER_REFLECT_101 (a.k.a. cv2.BORDER_DEFAULT, a slight variant: gfedcb|abcdefgh|gfedcba), cv2.BORDER_REPLICATE (the edge element is replicated: aaaaaa|abcdefgh|hhhhhhh), or cv2.BORDER_WRAP (periodic: cdefgh|abcdefgh|abcdefg)
  - value: the color of the border, if borderType is cv2.BORDER_CONSTANT

Below is a sample code demonstrating all these border types for better understanding:

import cv2
import numpy as np
from matplotlib import pyplot as plt

BLUE = [255, 0, 0]  # BGR value used for the constant border

img1 = cv2.imread('opencv-logo.png')

replicate = cv2.copyMakeBorder(img1, 10, 10, 10, 10, cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img1, 10, 10, 10, 10, cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img1, 10, 10, 10, 10, cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img1, 10, 10, 10, 10, cv2.BORDER_WRAP)
constant = cv2.copyMakeBorder(img1, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=BLUE)

plt.subplot(231), plt.imshow(img1, 'gray'), plt.title('ORIGINAL')
plt.subplot(232), plt.imshow(replicate, 'gray'), plt.title('REPLICATE')
plt.subplot(233), plt.imshow(reflect, 'gray'), plt.title('REFLECT')
plt.subplot(234), plt.imshow(reflect101, 'gray'), plt.title('REFLECT_101')
plt.subplot(235), plt.imshow(wrap, 'gray'), plt.title('WRAP')
plt.subplot(236), plt.imshow(constant, 'gray'), plt.title('CONSTANT')
plt.show()

This produces the result below (the image is displayed with matplotlib, so the RED and BLUE channels are interchanged): Borders

4.3. Read all the bubble images

We can make use of Jupyter's magic to quickly get a list of all bubble images, and load them all.

work **Read all bubble images. They are located under `twin_it/bubbles_200dpi/b???.png`. You may want to resize them using `cv2.resize()`.**
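A plain-Python alternative to the Jupyter magic, using the glob module (the path is the one given above):

import glob

import cv2

# Sorted so that bubbles[i] corresponds to b{i+1:03d}.png.
paths = sorted(glob.glob('twin_it/bubbles_200dpi/b???.png'))
bubbles = [cv2.imread(p) for p in paths]
print(len(bubbles), 'bubbles loaded')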

4.4. Prepare the test images

It is more efficient to consider the bubble under test as the base image, and to pad it once and for all.

work **Identify the test images (declare a separate list) and prepare them.** *Hint: compute the largest height and width over all the bubbles to pad the images.*
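A minimal sketch, assuming the `bubbles` list from the previous step; the four test ids are the bubbles named above:

# Pad each test bubble so that any other bubble can slide over it entirely.
test_ids = [44, 92, 1, 96]  # b044/b092 have twins, b001/b096 do not
max_h = max(b.shape[0] for b in bubbles)
max_w = max(b.shape[1] for b in bubbles)
test_images = [cv2.copyMakeBorder(bubbles[i - 1], max_h, max_h, max_w, max_w,
                                  cv2.BORDER_CONSTANT, value=[0, 0, 0])
               for i in test_ids]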

4.5. Match test bubbles against the others

We can now compute the matching of each bubble against the others, with the method of our choice.

work **For all the methods, for all 4 test images, compute the matching distance or score and display the query along with the 5 best matches and their associated values.** *Optional: Skip the bubble with the same id.* *Hint: Use `np.argsort` to get the ids of the best matches.* *Hint 2: Crop the template to keep its central area and remove most of the black surroundings (this yields better results).*
work If you do not manage to perform those computations, or if you have time to take a broader look at the results, you can still inspect the file `twin_it/dist_mat_sqdiff_normed.npz` to get the (triangular) matrix of squared differences between **all** patches.
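A sketch of the matching loop with TM_SQDIFF_NORMED only (lower is better), assuming the `bubbles`, `test_ids` and `test_images` variables from the previous steps; the central-crop margin is an assumption to tune:

import numpy as np

def central_crop(image, margin=10):
    # Drop an assumed 10-pixel band on each side to remove most of the
    # black surroundings of the bubble.
    return image[margin:-margin, margin:-margin]

for query_id, query in zip(test_ids, test_images):
    scores = []
    for j, bubble in enumerate(bubbles):
        if j + 1 == query_id:
            scores.append(np.inf)  # optionally skip the query bubble itself
            continue
        result = cv2.matchTemplate(query, central_crop(bubble),
                                   cv2.TM_SQDIFF_NORMED)
        scores.append(result.min())  # best (lowest) difference for this pair
    best = np.argsort(scores)[:5]
    print('b%03d best matches:' % query_id,
          ['b%03d (%.3f)' % (j + 1, scores[j]) for j in best])

If you fall back on the provided matrix, `np.load('twin_it/dist_mat_sqdiff_normed.npz').files` lists the arrays stored in the archive.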

4.6. Analyze the results

We now have enough information to draw some conclusions about this approach.

work **Conclude this first session by writing down some quick answers to the following questions:** - **Can we draw conclusions about the performance of each method?** - **Can we be confident about the results we obtain?** - **Are there interesting phenomena in the results?** - **Is it still possible to solve the current problem?** - **What are the limits of our approach?**

TODO write your answers here

(prof)

Can we draw conclusions about the performance of each method?

First, for each method, the normalized version seems to work better.

Can we be confident about the results we obtain?

While the first result for each bubble with a twin is relevant, it may not be possible to detect the bubbles with twins just by looking at the best value: the value associated with the best match of a bubble without a twin may be better than the one associated with the best match of a bubble with a twin.

Furthermore, we only tested for a few images and have no guarantee about the generalization power of our approach.

Are there interesting phenomena in the results?

Several bubbles are present in the best-matching results for all queries. Such "attractors" are very common with approaches like this one, where elements with average color or texture get matched with many other elements, and pollute the results.

Is it still possible to solve the current problem?

Given our results, we may not directly obtain the matching pairs, but presenting the best matches for each bubble can significantly reduce the human time needed to solve the problem.

What are the limits of our approach?

There are many possible improvements of the current approach.

First, it is common for pattern matching techniques to subtract the average gray value of an image before matching parts of it. In our case, the global average gray value may not match the average gray value of each bubble, and we should also mask the black background when computing it. This would be an important point to test, however.

Also, regarding the values produced by the matching, we could try to normalize the values in the big similarity matrix we can compute over every pair of bubbles. We could also pre-filter bubbles according to their main colors.

Despite those possible improvements, this approach would still be limited to detecting matching textures under very mild challenges: here we can only cope with translation and slight noise and illumination variations. More robust techniques are necessary to tackle more challenging cases, like those with strong illumination changes, rotations, scaling, perspective, noise, etc.

Finally, the current approach is very slow and cannot be used with thousands of images.

Job done!

Congratulations, you just reached the end of this session!