EPITA 2021 MLRF practice_01-02_numpy v2021-05-17_160644 by Joseph CHAZALON

This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

Practice session 1 part 2: NumPy-Fu

Make sure you read and understand everything, and complete all the required actions. Required actions are preceded by the following sign: Back to work!

Preliminary checks

Perform a couple checks…

Import the required modules

Notice the line magic used to configure how matplotlib output is rendered.

NumPy crash course

NumPy allows you to manipulate n-dimensional arrays (representing matrices, tensors, images…) with a very simple syntax.

Array creation (1/3)

Here are some examples of array creation:
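The sketch below illustrates a few common creation routines. The variable names and shapes are illustrative choices, not taken from the original session; the last array is deliberately created with `np.empty`, which allocates memory without initializing it.

```python
import numpy as np

# From a (nested) Python list: the dtype is inferred from the values.
a = np.array([[1, 2, 3], [4, 5, 6]])

# Filled with a constant value.
z = np.zeros((2, 3))             # all 0.0
o = np.ones((2, 3), dtype=int)   # all 1
f = np.full((2, 3), 7.5)         # all 7.5

# Allocated but NOT initialized: the content is whatever was in memory.
e = np.empty((2, 3))
print(e)
```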

stop The previous object was created but contains very strange content…

shape and dtype

Shape and content (data) type are two very important properties to check for arrays.
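A minimal sketch of how these two properties can be inspected (the array used here is an illustrative example):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)   # (2, 3): 2 rows, 3 columns
print(a.ndim)    # 2: number of dimensions
print(a.dtype)   # a platform-dependent integer type, typically int64

# astype() returns a new array converted to the requested type.
b = a.astype(np.float32)
print(b.dtype)   # float32
```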

Array creation (2/3)

work **Now try to create some arrays of different types (integers, floating-point numbers, booleans, complex numbers) and shapes.**

Do not hesitate to check:

work **Now check the [documentation about array creation](https://docs.scipy.org/doc/numpy-1.16.1/reference/routines.array-creation.html) and try some other array creation routines.**

We recommend that you have a look at:

A very important thing to note with NumPy is that native routines make use of optimized C code which is orders of magnitude faster than Python loops.

You should always try to avoid writing Python loops to access NumPy arrays, and you should rather try to find a native routine which does the task you are looking for.
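As an illustration of this advice, the sketch below computes a sum of squares twice: once with a Python loop and once with native routines. The task and the names are illustrative assumptions; both versions give the same result, but the second one runs its loop in C.

```python
import numpy as np

x = np.arange(1000)

# Python loop: every element is accessed and converted one by one.
total_loop = 0
for v in x:
    total_loop += int(v) ** 2

# Native routines: the squaring and the sum both run in compiled code.
total_native = int(np.sum(x ** 2))

print(total_loop, total_native)
```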

work **Benchmark the initialization time of some big array using a native routine vs using a `for` loop.** **Make sure you understand the differences between `%time`, `%%time`, `%timeit` and `%%timeit`.**

Array creation (3/3)

There are other very useful array creation routines to be aware of. Among my favorites are arange and linspace.

work **Use IPython's magic `?` to display the documentation for each of those, and create two small arrays.**

Reshaping

It is easy to change the shape of an array, as long as the new shape is compatible with the original one.
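Here, "compatible" means that the total number of elements stays the same. A small sketch with an illustrative 12-element array:

```python
import numpy as np

a = np.arange(12)          # shape (12,)
b = a.reshape((3, 4))      # OK: 3 * 4 == 12
c = a.reshape((2, -1))     # -1 lets NumPy infer the missing size: (2, 6)

# An incompatible shape raises a ValueError.
try:
    a.reshape((5, 3))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("incompatible shape:", err)
```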

work **Use `reshape(shape)` to give a new shape with 3 dimensions to this array `a`.**

Apply operations on arrays

All the power of NumPy lies in how we apply operations on arrays. We can apply operations in 3 different ways:

  1. First as array methods like this:

    a = np.arange(3)
    a.max()

    This technique is useful for operations which consider only the current array.
  2. Second by calling a NumPy operation on the array like this:

    a = np.linspace(0, 1, 10)
    np.cos(a)

    This second technique is more suitable for mathematical operations which are not directly available as methods, and return an array of the same shape.

  3. Third simply by calling natural operations extended to arrays like this:

    a = np.arange(0, 3)
    b = np.arange(3, 6)
    a + b
    
work **Experiment with a couple of operations on arrays.**

Indexing: access elements

You can also access individual values of arrays using advanced slicing techniques:

We can specify slices for each dimension.
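A short sketch of per-dimension slicing on an illustrative 3x4 array:

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

single = a[1, 2]        # one element: 6
first_row = a[0]        # [0 1 2 3]
second_col = a[:, 1]    # [1 5 9]
block = a[1:, ::2]      # rows 1 and 2, every other column: [[4 6], [8 10]]
tail = a[-1, -2:]       # last row, last two columns: [10 11]
```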

We can select multiple values using sequences of indexes, mixing basic and advanced slicing and indexing.
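The following sketch shows index sequences on the same kind of illustrative 3x4 array, alone and mixed with a basic slice:

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

rows = a[[0, 2]]            # rows 0 and 2: [[0 1 2 3], [8 9 10 11]]
picked = a[[0, 2], [1, 3]]  # elements (0, 1) and (2, 3): [1 11]
mixed = a[1:, [0, 3]]       # rows 1 and 2, columns 0 and 3: [[4 7], [8 11]]
```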

We can even add new axes on the fly:

Note that np.newaxis is actually None, so it is common to use None directly.
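A minimal sketch of both spellings, using an illustrative 3-element vector:

```python
import numpy as np

v = np.arange(3)            # shape (3,)
col = v[:, np.newaxis]      # shape (3, 1): column vector
row = v[None, :]            # shape (1, 3): row vector, using None directly

print(np.newaxis is None)   # True
```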

And you can create masks and apply them. This is very powerful!
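A mask is simply a boolean array of the same shape as the data. The sketch below, on illustrative values, uses one to select elements and another to assign to them:

```python
import numpy as np

a = np.array([3, -1, 4, -1, 5, -9])

mask = a > 0          # [ True False  True False  True False]
positives = a[mask]   # [3 4 5]

# A mask can also be used on the left-hand side of an assignment.
a[a < 0] = 0          # a is now [3 0 4 0 5 0]
```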

work **Try to extract even numbers in the following `a` array.**

Make sure to read at least once in your life (not during this session, though) the page about NumPy indexing.

Broadcasting

Broadcasting is a very powerful concept in NumPy, and maybe its greatest strength. However, it takes time to master, and even then you sometimes get surprised.

According to the official documentation:

The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python.

It is easy to make use of broadcasting:

Let's have a look at some examples now.

First, NumPy operations are usually done element by element, which requires two arrays to have exactly the same shape:
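A sketch of the same-shape case, following the example given in the NumPy broadcasting documentation:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 2.0])

product = a * b   # element by element: [2. 4. 6.]
```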

NumPy’s broadcasting rule relaxes this constraint when the arrays’ shapes meet certain constraints. The simplest broadcasting example occurs when an array and a scalar value are combined in an operation:
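A sketch of the array-and-scalar case, again following the NumPy documentation:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0

product = a * b   # b is broadcast to every element of a: [2. 4. 6.]
```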

The broadcasting applied in the previous example virtually "stretches" b to match a's shape. This can be illustrated by the following figure: numpy broadcasting 1

The rule governing whether two arrays have compatible shapes for broadcasting can be expressed in a single sentence.

The Broadcasting Rule:

In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.
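One way to experiment with this rule is to ask NumPy for the broadcast shape without computing anything. The helper `broadcast_shape` below is a hypothetical convenience function written for this illustration; it relies on `np.broadcast`, which raises a `ValueError` when the shapes are incompatible.

```python
import numpy as np

def broadcast_shape(*arrays):
    """Return the broadcast shape, or None when shapes are incompatible."""
    try:
        return np.broadcast(*arrays).shape
    except ValueError:
        return None

ok_same = broadcast_shape(np.ones((4, 3)), np.ones(3))   # trailing sizes 3 and 3
ok_one = broadcast_shape(np.ones((4, 1)), np.ones(3))    # trailing sizes 1 and 3
bad = broadcast_shape(np.ones((4, 3)), np.ones(4))       # trailing sizes 3 and 4

print(ok_same, ok_one, bad)
```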

Here are more examples (taken from the documentation, again):

A two dimensional array multiplied by a one dimensional array results in broadcasting if number of 1-d array elements matches the number of 2-d array columns. numpy broadcast2
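A sketch of this case, using the values from the NumPy documentation example (a 4x3 array combined with a 3-element array):

```python
import numpy as np

a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
b = np.array([1.0, 2.0, 3.0])

result = a + b   # b is added to every row of a
print(result)
```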

However, when the trailing dimensions of the arrays are unequal, broadcasting fails because it is impossible to align the values in the rows of the 1st array with the elements of the 2nd array for element-by-element addition. numpy broadcast fail
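A sketch of the failing case, with a 4x3 array and a 4-element array as in the NumPy documentation:

```python
import numpy as np

a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

# Trailing sizes are 3 and 4: neither equal nor 1, so this raises.
try:
    a + b
    failed = False
except ValueError as err:
    failed = True
    print(err)
```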

The following example shows an outer addition operation of two 1-d arrays that produces the same result as the previous (working) example. Here the newaxis index operator inserts a new axis into a, making it a two-dimensional 4x1 array.
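A sketch of this outer addition, using the values from the NumPy documentation example:

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# a[:, np.newaxis] is a column; b is stretched along the rows.
result = a[:, np.newaxis] + b
print(result)
```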

The following figure illustrates the stretching of both arrays to produce the desired 4x3 output array. numpy broadcast 4

work **Display the shape of a when we add a new axis to it like in the previous example.**

Apply an operation along an axis

Most aggregation functions allow you to specify the axis along which the computation will be performed. axis=0 means the first axis, axis=i means the $(i+1)$-th axis, axis=-1 means the last axis.

This makes it possible, for example, to compute the warmest month for each city (or the warmest city for each month).
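The sketch below shows how the `axis` parameter changes the result of an aggregation. The temperature values are made up for illustration, with one row per city and one column per month; the layout of the session's actual `data` array may differ.

```python
import numpy as np

# Hypothetical temperatures: 3 cities (rows) x 4 months (columns).
data = np.array([[ 5.0, 12.0, 21.0, 11.0],
                 [15.0, 18.0, 26.0, 20.0],
                 [-3.0,  6.0, 17.0,  4.0]])

max_per_month = data.max(axis=0)    # collapses the city axis: [15. 18. 26. 20.]
max_per_city = data.max(axis=1)     # collapses the month axis: [21. 26. 17.]
mean_per_city = data.mean(axis=-1)  # last axis, same as axis=1 here
```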

work **Display the warmest month for each city. Use the `argmax` operation on `data` with appropriate `axis` parameter.**
work **Display the warmest city for each month. Use the `argmax` operation on `data` with appropriate `axis` parameter.**

Gluing arrays together

You can "glue" arrays together as long as their shapes are compatible.
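A sketch of the most common routines for this, on two illustrative 2x3 arrays. The arrays must agree on every axis except the one along which they are joined.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

rows = np.concatenate([a, b], axis=0)   # shape (4, 3), same as np.vstack([a, b])
cols = np.concatenate([a, b], axis=1)   # shape (2, 6), same as np.hstack([a, b])
stacked = np.stack([a, b])              # new leading axis: shape (2, 2, 3)
```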

Copies vs views

Array indexing does not always copy the memory: it may return a view instead. In this case, changing the view changes the original array. Make sure to make a copy of the original array, or of the view's underlying data, if you do not want two names to refer to the same data!

The simplest case is when a reference is copied (either during assignment or during a function call).
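A sketch of both situations. The function `zero_first` is a hypothetical helper written for this illustration.

```python
import numpy as np

a = np.arange(5)

# Assignment copies the reference, not the data.
b = a
b[0] = 100
print(a[0])        # 100: a and b are the same object

# The same happens when an array is passed to a function.
def zero_first(arr):
    arr[0] = 0     # modifies the caller's array in place

zero_first(a)
print(a[0])        # 0
```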

You can use the copy() method to perform a deep copy of some array.

work **Use `copy()` to copy `a` values into `b`, then update `b` without changing `a`.**

Slicing an array returns a view of it!
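A sketch of this behavior on an illustrative array. For contrast, it also shows that indexing with a sequence of indexes (advanced indexing) returns a copy.

```python
import numpy as np

a = np.arange(6)

view = a[1:4]        # basic slicing: a view on a's memory
view[:] = -1
print(a)             # [ 0 -1 -1 -1  4  5]
print(np.shares_memory(a, view))    # True

fancy = a[[0, 1]]    # advanced indexing: a copy
fancy[0] = 99
print(a[0])          # still 0
print(np.shares_memory(a, fancy))   # False
```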

Linear algebra and other NumPy tools

Just for the record, NumPy also contains many linear algebra and other useful routines for statistics, mathematics, random sampling, etc.

You'll discover them progressively.

Matplotlib survival guide

You can plot data using the simple stateful plt interface. You start by creating a figure with

plt.figure()

then you plot some data (plots are added to the current figure):

plt.plot([0, 1, 2, 3], [1, 3, 5, 7])
plt.plot([0, 1, 2, 3], [2, 4, 6, 8])

and finally you call the rendering function:

plt.show()

Here is a more complete example you will be able to reuse:
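A possible version of such an example, using only the stateful `plt` interface. The plotted functions, labels and figure size are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

plt.figure(figsize=(8, 4))
sin_lines = plt.plot(x, np.sin(x), label="sin(x)")
cos_lines = plt.plot(x, np.cos(x), linestyle="--", label="cos(x)")
plt.title("Sine and cosine")
plt.xlabel("x")
plt.ylabel("value")
plt.legend()
plt.grid(True)
plt.show()
```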

And another one showing two images in two different subfigures.
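A sketch with two synthetic grayscale images, which stand in for real image data here. `plt.subplot(1, 2, k)` selects the k-th cell of a grid with 1 row and 2 columns.

```python
import numpy as np
import matplotlib.pyplot as plt

# Two small synthetic images (placeholders for real data).
gradient = np.tile(np.linspace(0, 1, 64), (64, 1))   # horizontal ramp
rows, cols = np.indices((64, 64))
checker = (rows // 8 + cols // 8) % 2                # 8-pixel checkerboard

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(gradient, cmap="gray")
plt.title("Gradient")
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(checker, cmap="gray")
plt.title("Checkerboard")
plt.axis("off")
plt.show()
```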

Another example with a histogram.
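A sketch using random samples as illustrative data. It assumes NumPy 1.17 or later for `np.random.default_rng`; the sample size and the number of bins are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

plt.figure()
# plt.hist returns the counts per bin, the bin edges, and the drawn patches.
counts, bin_edges, _ = plt.hist(samples, bins=30)
plt.title("Histogram of 1000 normal samples")
plt.xlabel("value")
plt.ylabel("count")
plt.show()
```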

There are many possible graph types, and many options to configure colors, legends, markers, to add annotations, etc. You will discover them by practicing and by looking at examples.

Let's just finish this very quick introduction to Matplotlib by pointing out useful resources:

Job done!

Great! Now you're ready to move on to the next stage: Image manipulations.