Tips and Traps¶

The broadcast concept in numpy is essentially a way to "virtually" duplicate data in a numpy array so that it is "virtually" reshaped to be compatible with another numpy array for a certain operation. Do not confused yourself about it with the broadcast concept in Spark which sends a full copy of a (small) DataFrame to each work node for BroadCastJoin.
numpy.expand_dims expands the shape of an array and returns a veiw (no copy is made). It is very useful to help broadcasting arrays.

numpy.expand_dims ¶

numpy.expand_dims returns a view (no copy is made) to the expanded array. The example below illustrate this.

Create a 2-d array a1.

In [37]:

a1 = np.array([[1, 2, 3], [4, 5, 6]])
a1

Out[37]:

array([[1, 2, 3],
       [4, 5, 6]])

Expand the dimension of a1 to be (2, 3, 1).

In [39]:

a2 = np.expand_dims(a1, axis=2)
a2.shape

Out[39]:

(2, 3, 1)

In [40]:

a2[:, :, 0]

Out[40]:

array([[1, 2, 3],
       [4, 5, 6]])

Update an element of a2.

In [42]:

a2[0, 0, 0] = 1000

Notice that a1 is updated too.

In [43]:

a1

Out[43]:

array([[1000,    2,    3],
       [   4,    5,    6]])

numpy.reshape and numpy.ndarray.reshape ¶

Both functions reshape the dimension of an array without changing the data. A view (instead of a copy) of the original array is returned. The example below illustrates this.

Create a 2-d array a1.

In [44]:

a1 = np.array([[1, 2, 3], [4, 5, 6]])
a1

Out[44]:

array([[1, 2, 3],
       [4, 5, 6]])

Reshape the array a1.

In [45]:

a2 = a1.reshape((2, 3, 1))
a2.shape

Out[45]:

(2, 3, 1)

In [46]:

a2[:, :, 0]

Out[46]:

array([[1, 2, 3],
       [4, 5, 6]])

Update an element of a2.

You can pass the shape parameters as individual parameters instead of passing it as a tuple.

In [50]:

a1.reshape(2, 3, 1)

Out[50]:

array([[[1000],
        [   2],
        [   3]],

       [[   4],
        [   5],
        [   6]]])

In [47]:

a2[0, 0, 0] = 1000

Notice that a1 is updated too.

In [48]:

a1

Out[48]:

array([[1000,    2,    3],
       [   4,    5,    6]])

Use numpy.expand_dims to Help Broadcast Arrays¶

All of numpy.expand_dims, numpy.reshape and numpy.array.reshape can be used to reshape an array to help broadcasting. The below illustrates how to use numpy.expand_dims to help broadcast numpy arrays using an example of manipulating images.

Read in an image.

In [1]:

!wget https://user-images.githubusercontent.com/824507/128439087-0c935d86-bb34-4c2c-8e69-6d78b3022833.png -O 4s.jpg

--2021-08-05 17:48:59--  https://user-images.githubusercontent.com/824507/128439087-0c935d86-bb34-4c2c-8e69-6d78b3022833.png
Resolving user-images.githubusercontent.com (user-images.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
Connecting to user-images.githubusercontent.com (user-images.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4588 (4.5K) [image/png]
Saving to: ‘4s.jpg’

4s.jpg              100%[===================>]   4.48K  --.-KB/s    in 0s      

2021-08-05 17:48:59 (17.2 MB/s) - ‘4s.jpg’ saved [4588/4588]

In [2]:

from PIL import Image
import numpy as np

img = Image.open("4s.jpg")
img

Out[2]:

Convert the image to a numpy array.

In [34]:

arr = np.array(img)
arr.shape

Out[34]:

(54, 37, 3)

Get the sum of channels.

In [12]:

channel_sum = arr.sum(axis=2, dtype=np.float32) + 0.01
channel_sum.shape

Out[12]:

(54, 37)

Now suppose we want to calculate the ratio of each channel to this sum. It won't work if we use arr / channel_sum as the dimensions of the 2 arrays are not compatible for broadcasting. One solution is to expand the dimension of channel_sum to (54, 37, 1) which is compatible for broadcasting with arr. Notice that numpy.expand_dims returns a view (no copy is made) of the dim-expanded array.

In [14]:

np.expand_dims(channel_sum, axis=2).shape

Out[14]:

(54, 37, 1)

In [16]:

ratios = arr / np.expand_dims(channel_sum, axis=2)
ratios.shape

Out[16]:

(54, 37, 3)

If the values of the 3 channes are close enough (by comparing the max/min values of the ratios), make the corresponding pixles white.

In [35]:

ratio_max = ratios.max(axis=2)
ratio_min = ratios.min(axis=2)
mask = (ratio_max - ratio_min) < 0.35
arr[mask, :] = 255

In [36]:

Image.fromarray(arr)

Out[36]:

Notice that the slight shading effect in the original picture is removed.

References¶

Array Broadcasting

In [ ]:

Ben Chuanlong Du's Blog

It is never too late to learn.

Broadcast Arrays in Numpy

Tips and Traps¶

numpy.broadcast_to ¶

numpy.expand_dims ¶

numpy.reshape and numpy.ndarray.reshape ¶

Use numpy.expand_dims to Help Broadcast Arrays¶

References¶

Comments

Tips and Traps¶

numpy.broadcast_to¶

numpy.expand_dims¶

numpy.reshape and numpy.ndarray.reshape¶

Use numpy.expand_dims to Help Broadcast Arrays¶

References¶

Comments

numpy.broadcast_to ¶

numpy.expand_dims ¶

numpy.reshape and numpy.ndarray.reshape ¶