{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "MQQj5AcGHsKF"
},
"source": [
"# **Creating CNN from scratch**\n",
"\n",
"In this assignment, you will implement convolutional and pooling layers in
Python using NumPy, including both forward propagation and backward propagation.\
n",
"\n",
"This notebook contains 7 tasks. For easy navigation, you can use the table
of contents."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Q-UzmoOXH1bl"
},
"source": [
"**After completing this homework, you will be able to:**\n",
"\n",
"* explain how the convolution operation works;\n",
"* identify the components used in a convolutional neural network (kernel,
stride, padding, ...) and their purpose;\n",
"* apply different types of pooling operation;\n",
"* build a main part of the convolutional neural network almost from
scratch.\n",
"\n",
"**Prerequisites:** `NumPy` basics, linear algebra basics, understanding of
the main components of the CNN."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qgZnYvmuCMLi"
},
"source": [
"## **Notebook's table of contents**\n",
"\n",
"* [Reminder of the CNN structure](#0)\n",
"1. [Importing packages](#1)\n",
"2. [Building convolutional layer](#2)\n",
" * [Padding operation](#2.1)\n",
" * [Task 1](#2.1.1)\n",
" * [Task 2](#2.1.2)\n",
" * [Single step convolution](#2.2)\n",
" * [Task 3](#2.2.1)\n",
" * [Convolutional layer - forward pass](#2.3)\n",
" * [Task 4](#2.3.1)\n",
"3. [Activation function](#3)\n",
" * [Task 5](#3.1)\n",
"4. [Building pooling layer](#4)\n",
" * [Task 6](#4.1)\n",
"5. [Backpropogation in convolutional layer](#5)\n",
" * [Task 7](#5.1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "y-AwK4pIRXSk"
},
"source": [
"<a name='0'></a>\n",
"## **Reminder of the CNN structure**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hOG4VPZ7KIBj"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6etI766HKYKS"
},
"source": [
"<a name='1'></a>\n",
"## **1. Importing packages**\n",
"\n",
"In this assignment you will use the following packages:\n",
"* [NumPy](http://numpy.org) package — fundamental package for efficient
scientific computing with Python;\n",
"* [Matplotlib](http://matplotlib.org) package — one of the most popular
libraries to plot graphs in Python.\n",
"\n",
"Please note that `np.random.seed(39)` is used to keep all the random
function calls consistent. This helps to grade your work."
]
},
{
"cell_type": "code",
"metadata": {
"id": "TePQVhEoHSTH"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots\
n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"np.random.seed(39)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "PrXYrrFMOkVB"
},
"source": [
"<a name='2'></a>\n",
"## **2. Building convolutional layer**\n",
"\n",
"In this part, you will build every step of the convolution layer. You will
first implement two helper functions: one for zero padding and the other for
computing the convolution function itself."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xwYcTBfnQSDi"
},
"source": [
"<a name='2.1'></a>\n",
"### **2.1 Padding**\n",
"\n",
"Padding operation adds specified values (we will use zero-padding) around
the border of an image:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4k3ZRsUZfxEU"
},
"source": [
"#### **Prerequisites for using padding:**\n",
"* to prevent shrinking the image at each convolution step (this is
especially critical for deep neural networks with a large number of layers);\n",
"* to prevent throwing away information near the edges of the image
(because with no-padding pixels near the edge are used in convolution much less
than pixels in the middle)."
]
},
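{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is a small illustration of our own (not the graded function): it zero-pads a single 2D \"image\" with `np.pad`. Task 2 below generalizes this to a 4D batch."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustration only: zero-pad one 2D \"image\" by 1 pixel on each side\n",
"img = np.arange(9.0).reshape(3, 3)\n",
"img_pad = np.pad(img, pad_width=1, mode='constant', constant_values=0)\n",
"print(img.shape, '->', img_pad.shape)  # (3, 3) -> (5, 5)\n",
"print(img_pad)"
],
"execution_count": null,
"outputs": []
},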
{
"cell_type": "markdown",
"metadata": {
"id": "xBxZxRrhWP1X"
},
"source": [
"#### **Valid and same convolutions**\n",
"The most common cases in CNN's are valid and same convolutions.\n",
"* `«valid»` convolution means no padding. In this case, the size of the
output image after convolution operation is changed the following way: $[n\\times
n] * [f\\times f] \\rightarrow [(n - f + 1) \\times (n - f + 1)] $, where $n\\times
n$ is the input image size and $f\\times f$ is the size of the convolutional
kernel.\n",
"* `«same»` convolution means using padding so the output size of the image
after convolution is the same as the input size. The appropriate padding $p$ in
this case is selected taking into account the size of the input image $n$ and the
size of the convolution kernel $f$."
]
},
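{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick arithmetic check (illustration only) of the «valid» size formula above:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustration only: output size of a \"valid\" convolution, [n x n] * [f x f]\n",
"n, f = 6, 3\n",
"print('valid output size:', (n - f + 1, n - f + 1))  # (4, 4)"
],
"execution_count": null,
"outputs": []
},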
{
"cell_type": "markdown",
"metadata": {
"id": "upPoaCpGia9r"
},
"source": [
"<a name='2.1.1'></a>\n",
"#### **TASK 1 (answer the folowing questions)**\n",
"\n",
"1. With the specified image and the convolution kernel sizes $n, f$, write
how padding $p$ will be calculated if we want the image size to stay the same after
applying the convolution operation.\n",
"\n",
"2. We have an image of size $128\\times 128$ and we want to perform a
«same» convolution operation with a kernel of size $5\\times 5$. Which pad should
we use?\n",
"\n",
"3. When using the «same» convolution, what restriction is reasonable to
impose on the size of the convolution kernel and why?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MB3XgflOmt-x"
},
"source": [
"#### **ANSWERS TO TASK 1**\n",
"\n",
"Please, specify here your answers to the questions above.\n",
"\n",
"**P.S.** You can write the formula between `$$` symbols in the text cell
below, using LaTEX notation"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_HyCQa1xnRFG"
},
"source": [
"1. $p=some\\_formula$\n",
"2. $p=some\\_value$\n",
"3. Some text"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4iCkVeT3nlRF"
},
"source": [
"<a name='2.1.2'></a>\n",
"#### **TASK 2 (implement zero-padding function)**\n",
"\n",
"Implement the following function, which pads all the images of a batch of
examples X with zeros.\n",
"You can use
[np.pad](https://docs.scipy.org/doc/numpy/reference/generated/numpy.pad.html)
function to do this."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ML51l2i00GN2"
},
"source": [
""
]
},
{
"cell_type": "code",
"metadata": {
"id": "Npwhed8Onpoj"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def zero_padding(X, pad):\n",
" \"\"\"\n",
" Given the batch X of images (in the form of the numpy array), pad with
zeros\n",
" all images of the batch. An example of how the dimensions should
change\n",
" after padding you can see in Figure 3.\n",
"\n",
" Required parameters:\n",
" X -- python numpy array of shape (m, n_H, n_W, n_C) representing a
batch of m images, each of size (n_H, n_W, n_C)\n",
" pad -- integer, amount of padding around each image on vertical and
horizontal dimensions\n",
"\n",
" Returned value:\n",
" X_pad -- padded image of shape (m, n_H + 2 * pad, n_W + 2 * pad, n_C)\
n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" X_pad = None\n",
"\n",
" return X_pad"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "dqk37uUF6wKo"
},
"source": [
"You can check yourself using the code bellow. All assertions must be
fulfilled."
]
},
{
"cell_type": "code",
"metadata": {
"id": "mgT31UjPrc0p"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"x = np.random.randn(4, 5, 5, 3)\n",
"x_pad = zero_padding(x, 2)\n",
"\n",
"print (\"Initial batch size:\", x.shape)\n",
"print (\"Batch size after padding:\", x_pad.shape)\n",
"\n",
"assert type(x_pad) == np.ndarray, \"Output must be a numpy array\"\n",
"assert x_pad.shape == (4, 9, 9, 3), f\"Wrong shape: {x_pad.shape} != (4,
9, 9, 3)\"\n",
"\n",
"assert np.allclose(x_pad[0, 0:1,:, 0], [[0, 0, 0, 0, 0, 0, 0, 0, 0], [0,
0, 0, 0, 0, 0, 0, 0, 0]], 1e-15), \"Rows are not padded with zeros\"\n",
"assert np.allclose(x_pad[0, :, 8:9, 1].transpose(), [[0, 0, 0, 0, 0, 0, 0,
0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0]], 1e-15), \"Columns are not padded with zeros\"\
n",
"assert np.allclose(x_pad[:, 2:7, 2:7, :], x, 1e-15), \"Internal values are
different\"\n",
"\n",
"print(\"All tests passed.\")\n",
"\n",
"fig, axarr = plt.subplots(1, 2)\n",
"axarr[0].set_title('Initial X \\n (img1, channel 1)')\n",
"axarr[0].imshow(x[0, :, :, 0])\n",
"axarr[1].set_title('Padded X \\n (img1, channel 1)')\n",
"axarr[1].imshow(x_pad[0, :, :, 0])\n",
"fig.show()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ccMJp8biQWA6"
},
"source": [
"<a name='2.2'></a>\n",
"### 2.2 Convolutional layer\n",
"\n",
"In this part, you will:\n",
"* implement a function which represents a single step of convolution (in
which you will apply the kernel to a single slice of the input);\n",
"* use this function to construct the whole convolutional unit, which takes
an input volume and applies a kernel at every slice of the input.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mIzcM8GT_eXA"
},
"source": [
"<a name='2.2.1'></a>\n",
"#### **TASK 3 (implement single step convolution function)**\n",
"\n",
"In the single step convolution function you should apply convolutional
kernel on a slice of the input image."
]
},
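{
"cell_type": "markdown",
"metadata": {},
"source": [
"A tiny worked example (our own illustration): for the input slice $\\begin{pmatrix}1 & 2\\\\ 3 & 4\\end{pmatrix}$, the kernel $W=\\begin{pmatrix}0 & 1\\\\ 1 & 0\\end{pmatrix}$ and the bias $b=1$, the single convolution step gives $Z = 1\\cdot 0 + 2\\cdot 1 + 3\\cdot 1 + 4\\cdot 0 + 1 = 6$."
]
},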
{
"cell_type": "code",
"metadata": {
"id": "612c02RzMtOE"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def conv_step(a_prev_slice, W, b):\n",
" \"\"\"\n",
" Apply one convolutional kernel (defined by matrix W) on a single
slice\n",
" of the output activation of the previous layer (a_prev_slice).\n",
"\n",
" Required parameters:\n",
" a_prev_slice -- slice of input data of shape (f, f, n_C_prev)\n",
" W -- weight parameters of kernel - python numpy array of shape (f, f,
n_C_prev)\n",
" b -- bias parameters - python numpy array of shape (1, 1, 1)\n",
"\n",
" Returned value:\n",
" Z -- a scalar value (float), the result of convolving the sliding
window (W, b) on a slice a_prev_slice of the input data\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" # Element-wise product between a_prev_slice and W and the following
summation.\n",
" Z = None\n",
" # Add bias b to Z. Cast b to a float() so that Z results in a scalar
value.\n",
" Z = None\n",
"\n",
" return Z"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "2Z7T-WpzmUuC"
},
"source": [
"You can check yourself using the code bellow. All assertions must be
fulfilled."
]
},
{
"cell_type": "code",
"metadata": {
"id": "2LGZhRwvHi9W"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"np.random.seed(39)\n",
"a_slice_prev = np.random.randn(4, 4, 3)\n",
"W = np.random.randn(4, 4, 3)\n",
"b = np.random.randn(1, 1, 1)\n",
"\n",
"Z = conv_step(a_slice_prev, W, b)\n",
"print(\"Z value:\", Z)\n",
"\n",
"assert (type(Z) == np.float64 or type(Z) == np.float32), \"You must cast
the output to float\"\n",
"assert np.isclose(Z, 3.7943827223933244), \"Wrong value\"\n",
"\n",
"print(\"All tests passed.\")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "XYXZa0FUcGHH"
},
"source": [
"<a name='2.3.1'></a>\n",
"#### **TASK 4 (implement the convolution function - forward pass)**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ea6mGCTLQ7hd"
},
"source": [
"At this step you will take several kernels and convolve them on the 3D
input (output activations of the previous layer). Convolution with each kernel
gives you a 2D matrix output. You will then stack these outputs to get a 3D volume
again."
]
},
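{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a batch of shape $(2, 5, 7, 4)$ convolved with $8$ kernels of shape $(3, 3, 4)$, `pad = 1` and `stride = 1` produces an output of shape $(2, 5, 7, 8)$, which is exactly what the test below checks."
]
},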
{
"cell_type": "code",
"metadata": {
"id": "JxzZOxM-HoBm"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def conv_forward(A_prev, W, b, hparameters):\n",
" \"\"\"\n",
" Implements the forward propagation for a convolution function\n",
"\n",
" Required parameters:\n",
" A_prev -- output activations of the previous layer,\n",
" numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)\n",
" W -- kernels' weights, numpy array of shape (f, f, n_C_prev, n_C)\n",
" b -- biases, numpy array of shape (1, 1, 1, n_C)\n",
" hparameters -- python dictionary containing \"stride\" and \"pad\"\n",
"\n",
" Returned value:\n",
" Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)\n",
" cache -- cache of values needed for the conv_backward() function\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" # Get dimensions of the previous layer's output\n",
" (m, n_H_prev, n_W_prev, n_C_prev) = None\n",
"\n",
" # Get dimensions of kernels' weights\n",
" (f, f, n_C_prev, n_C) = None\n",
"\n",
" # Get information about stride and pad\n",
" stride = None\n",
" pad = None\n",
"\n",
" # Compute the dimensions of the convolution output (for given
n_H_prev, n_W_prev, f, pad and stride)\n",
" n_H = None\n",
" n_W = None\n",
"\n",
" # Initialize the output volume Z with zeros\n",
" Z = None\n",
"\n",
" # Create A_prev_pad by padding A_prev\n",
" A_prev_pad = None\n",
"\n",
" for i in range(m): # loop over the batch of training
examples\n",
" a_prev_pad = None # select i-th training example's padded
activation\n",
" for h in range(n_H):\n",
" # Find the vertical start and end of the current \"slice\"\n",
" vert_start = None\n",
" vert_end = None\n",
"\n",
" for w in range(n_W):\n",
" # Find the horizontal start and end of the
current \"slice\"\n",
" horiz_start = None\n",
" horiz_end = None\n",
"\n",
" for c in range(n_C):\n",
"\n",
" # Define the 3D-slice of a_prev_pad for each kernel\
n",
" a_slice_prev = None\n",
"\n",
" # Convolve the 3D-slice with the current kernel W and
bias b, to get back one output neuron\n",
" weights = None\n",
" biases = None\n",
" Z[i, h, w, c] = None\n",
"\n",
" # Save information in \"cache\" for the backpropogation\n",
" cache = (A_prev, W, b, hparameters)\n",
"\n",
" return Z, cache"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "utR7QVT2mWd2"
},
"source": [
"You can check yourself using the code bellow. All assertions must be
fulfilled."
]
},
{
"cell_type": "code",
"metadata": {
"id": "QxzyLLZaH88n"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"np.random.seed(39)\n",
"A_prev = np.random.randn(2, 5, 7, 4)\n",
"W = np.random.randn(3, 3, 4, 8)\n",
"b = np.random.randn(1, 1, 1, 8)\n",
"hparameters = {\"pad\" : 1,\n",
" \"stride\": 1}\n",
"\n",
"Z, cache_conv = conv_forward(A_prev, W, b, hparameters)\n",
"print(\"Z's mean:\\n\", np.mean(Z))\n",
"print(\"Z's shape:\", Z.shape)\n",
"assert np.isclose(np.mean(Z), 0.13514621895605708), \"Wrong value of Z's
mean\"\n",
"assert Z.shape == (2, 5, 7, 8), \"Wrong value of Z's mean\"\n",
"print(\"All tests passed.\")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "W5k8rCljk1zs"
},
"source": [
"<a name='3.1'></a>\n",
"#### **TASK 5 (implement RELU and sigmoid activation functions)**\n",
"\n",
"Implement RELU and sigmoid activation functions.\n",
"\n",
"**Reminder:**\n",
"\n",
"* $RELU(Z) = \\begin{cases}\n",
" 0 &Z<0 \\\\\n",
" Z &Z\\ge 0 \\\\\n",
" \\end{cases}$\n",
"* $SIGM(Z) = \\frac{1}{1+e^{-Z}}$"
]
},
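{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell is our own illustration (it is independent of Task 5 and simply plots the two curves defined in the reminder above):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustration only: plot the RELU and sigmoid curves from the reminder above\n",
"z = np.linspace(-5, 5, 200)\n",
"plt.plot(z, np.maximum(0, z), label='RELU')\n",
"plt.plot(z, 1 / (1 + np.exp(-z)), label='SIGM')\n",
"plt.legend()\n",
"plt.show()"
],
"execution_count": null,
"outputs": []
},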
{
"cell_type": "code",
"metadata": {
"id": "rhxlKUsxiJN0"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def relu_activation(Z):\n",
" \"\"\"\n",
" Implements the RELU activation function\n",
"\n",
" Required parameters:\n",
" Z -- conv output,\n",
" numpy array of shape (m, n_H, n_W, n_C)\n",
"\n",
" Returned value:\n",
" g -- conv output after applying RELU activation,\n",
" numpy array of shape (m, n_H, n_W, n_C)\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" g = None\n",
" return g\n",
"\n",
"def sigmoid_activation(Z):\n",
" \"\"\"\n",
" Implements the sigmoid activation function\n",
"\n",
" Required parameters:\n",
" Z -- conv output,\n",
" numpy array of shape (m, n_H, n_W, n_C)\n",
"\n",
" Returned value:\n",
" g -- conv output after applying sigmoid activation,\n",
" numpy array of shape (m, n_H, n_W, n_C)\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" g = None\n",
" return g"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "72JmCXoAmZJP"
},
"source": [
"You can check yourself using the code bellow. All assertions must be
fulfilled."
]
},
{
"cell_type": "code",
"metadata": {
"id": "OynTXx94lDzF"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"Z_relu_act = relu_activation(Z)\n",
"Z_sigm_act = sigmoid_activation(Z)\n",
"\n",
"assert np.isclose(Z_relu_act[1, 3].mean(), 2.5478636950538975), \"Error in
RELU function\"\n",
"assert np.isclose(Z_sigm_act[0, 2].mean(), 0.5480779046894169), \"Error in
sigmoid function\"\n",
"\n",
"assert Z_relu_act.shape == Z.shape, \"The output dimension of RELU should
remain unchanged\"\n",
"assert Z_sigm_act.shape == Z.shape, \"The output dimension of RELU should
remain unchanged\"\n",
"\n",
"print(\"All tests passed.\")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ag92oncXTFg0"
},
"source": [
"<a name='4'></a>\n",
"## 3. Building pooling layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "971i2hdZTNJD"
},
"source": [
"The convolutional layer is followed by the pooling layer. It is used to
reduce the height and width of the input which helps to reduce computation and
helps make feature detectors more invariant to the position in the input.\n",
"\n",
"You will implement the two types of pooling in the function
`pool_forward`:\n",
"\n",
"* Max-pooling: slides an $(f,f)$ window over the input and stores the max
value of the window in the output.\n",
"\n",
"* Average-pooling: slides an $(f,f)$ window over the input and stores the
average value of the window in the output."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T2IYCfXPXV-6"
},
"source": [
"Such pooling layers have no parameters for backpropagation to train.
However, they have the window size $f$ hyperparameter, which specifies the height
and width of the $f\\times f$ window you would compute a max or average over."
]
},
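{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of the two pooling modes on a single $4\\times 4$ channel, for the special case `stride = f = 2` (illustration only; `pool_forward` in Task 6 must also handle other strides, batches and channels):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustration only: 2x2 pooling with stride 2 on one 4x4 channel\n",
"a = np.arange(16.0).reshape(4, 4)\n",
"blocks = a.reshape(2, 2, 2, 2)  # axes: (block_row, row_in_block, block_col, col_in_block)\n",
"print('max pool:\\n', blocks.max(axis=(1, 3)))\n",
"print('average pool:\\n', blocks.mean(axis=(1, 3)))"
],
"execution_count": null,
"outputs": []
},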
{
"cell_type": "markdown",
"metadata": {
"id": "YYo2CQ89F4xx"
},
"source": [
"<a name='4.1'></a>\n",
"#### **TASK 6 (implement the pooling function - forward pass)**\n",
"\n",
"The function must support both modes of pooling operations: max and
average."
]
},
{
"cell_type": "code",
"metadata": {
"id": "spq3Qm7sSnzK"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def pool_forward(A_prev, hparameters, mode = \"max\"):\n",
" \"\"\"\n",
" Implements the forward propogation of the pooling function\n",
"\n",
" Required parameters:\n",
" A_prev -- Input data, numpy array of shape (m, n_H_prev, n_W_prev,
n_C_prev)\n",
" hparameters -- python dictionary containing \"f\" and \"stride\"\n",
" mode -- the pooling mode you would like to use, defined as a string
(\"max\" or \"average\")\n",
"\n",
" Returned value:\n",
" A -- output of the pool layer, a numpy array of shape (m, n_H, n_W,
n_C)\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" # Get dimensions from the input shape\n",
" (m, n_H_prev, n_W_prev, n_C_prev) = None\n",
"\n",
" # Get kernel size (f) and stride\n",
" f = None\n",
" stride = None\n",
"\n",
" # Compute the dimensions of the output (for given n_H_prev, n_W_prev,
f and stride)\n",
" n_H = None\n",
" n_W = None\n",
" n_C = None\n",
"\n",
" # Initialize output matrix A with zeros\n",
" A = None\n",
"\n",
" for i in range(m): # loop over the training
examples\n",
" for h in range(n_H):\n",
" # Find the vertical start and end of the current \"slice\"\n",
" vert_start = None\n",
" vert_end = None\n",
"\n",
" for w in range(n_W):\n",
" # Find the vertical start and end of the
current \"slice\"\n",
" horiz_start = None\n",
" horiz_end = None\n",
"\n",
" for c in range (n_C):\n",
"\n",
" # Define the current slice on the i-th training example
of A_prev and channel c\n",
" a_prev_slice = None\n",
"\n",
" # Compute the pooling operation on the slice.\n",
" # Use an if statement to differentiate the modes.\n",
" if mode == None: # for max pool\n",
" A[i, h, w, c] = None\n",
" elif mode == None: # for avg pool\n",
" A[i, h, w, c] = None\n",
"\n",
" # Making sure your output shape is correct\n",
" assert(A.shape == (m, n_H, n_W, n_C))\n",
"\n",
" return A"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "lBdVGlisnwTI"
},
"source": [
"You can check the correctness by comparing your output with the expected
one."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CzCdWb1xn9cE"
},
"source": [
"#### **CASE 1 (stride = 1)**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4R1ah1NLXOAP"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"np.random.seed(39)\n",
"A_prev_relu = Z_relu_act\n",
"hparameters = {\"stride\" : 1, \"f\": 3}\n",
"\n",
"A_relu_maxpool = pool_forward(A_prev_relu, hparameters, mode = \"max\")\
n",
"print(\"Pooling mode: MAX\")\n",
"print(\"A.shape: \" + str(A_relu_maxpool.shape))\n",
"print(\"A[1, 1]: \\n\", A_relu_maxpool[1, 1])\n",
"print()\n",
"A_relu_avgpool = pool_forward(A_prev_relu, hparameters, mode
= \"average\")\n",
"print(\"Pooling mode: AVERAGE\")\n",
"print(\"A.shape: \" + str(A_relu_avgpool.shape))\n",
"print(\"A[1, 1]:\\n\", A_relu_avgpool[1, 1])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "mxoYQYGWXzRI"
},
"source": [
"**Expected output**\n",
"\n",
"```\n",
"Pooling mode: MAX\n",
"A.shape: (2, 3, 5, 8)\n",
"A[1, 1]:\n",
" [[15.07582045 7.71032783 16.12802279 6.19439118 9.64073771
13.09778915\n",
" 11.87773795 4.15039229]\n",
" [15.07582045 7.71032783 8.91557313 12.00545813 7.58041262 13.09778915\
n",
" 9.08938929 6.02040284]\n",
" [10.3252384 7.71032783 9.0414924 12.00545813 9.86936013 11.65323016\
n",
" 15.13310406 6.02040284]\n",
" [10.3252384 7.6128846 10.6533091 12.00545813 9.86936013 11.65323016\
n",
" 15.13310406 6.02040284]\n",
" [10.3252384 5.08554678 10.6533091 3.53708746 9.86936013 9.15017357\
n",
" 15.13310406 5.95552545]]\n",
"\n",
"Pooling mode: AVERAGE\n",
"A.shape: (2, 3, 5, 8)\n",
"A[1, 1]:\n",
" [[5.33127645 2.43695417 2.68658157 1.35760691 1.19512353 2.89397807\n",
" 2.34817186 1.16126572]\n",
" [3.0368633 3.59191064 1.47554868 3.23213141 1.29944134 4.65134338\n",
" 2.1146648 1.61677382]\n",
" [3.36479584 3.02696239 2.79705357 2.67989079 2.39603691 4.78885834\n",
" 3.99893908 1.94500584]\n",
" [4.74357183 1.72001721 3.82466639 1.9916251 2.4754728 5.03831263\n",
" 5.07730718 1.77522889]\n",
" [5.309727 0.56506075 3.58920188 0.50506954 1.47585424 3.32282541\n",
" 3.99106557 1.56488676]]\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7STKZW_xoC9b"
},
"source": [
"#### **CASE 2 (stride = 2)**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "TrnKI480XvBf"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"np.random.seed(39)\n",
"A_prev_sigm = Z_sigm_act\n",
"hparameters = {\"stride\" : 2, \"f\": 3}\n",
"\n",
"A_sigm_maxpool = pool_forward(A_prev_sigm, hparameters, mode = \"max\")\
n",
"print(\"Pooling mode: MAX\")\n",
"print(\"A.shape: \" + str(A_sigm_maxpool.shape))\n",
"print(\"A[1, 1]: \\n\", A_sigm_maxpool[1, 1])\n",
"print()\n",
"A_sigm_avgpool = pool_forward(A_prev_sigm, hparameters, mode
= \"average\")\n",
"print(\"Pooling mode: AVERAGE\")\n",
"print(\"A.shape: \" + str(A_sigm_avgpool.shape))\n",
"print(\"A[1, 1]:\\n\", A_sigm_avgpool[1, 1])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "dz4FvE6fX5Re"
},
"source": [
"**Expected output**\n",
"\n",
"```\n",
"Pooling mode: MAX\n",
"A.shape: (2, 2, 3, 8)\n",
"A[1, 1]:\n",
" [[0.99999918 0.99955203 0.99969061 0.99963987 0.99787162 0.9982779\n",
" 0.99970094 0.99872024]\n",
" [0.99996721 0.99955203 0.99988162 0.99796331 0.99999216 0.99999131\n",
" 0.99999973 0.99922657]\n",
" [0.99996721 0.40556542 0.99997638 0.9717248 0.99994827 0.99989381\n",
" 0.99999973 0.99741522]]\n",
"\n",
"Pooling mode: AVERAGE\n",
"A.shape: (2, 2, 3, 8)\n",
"A[1, 1]:\n",
" [[0.47378084 0.50547534 0.53372083 0.46169357 0.30139256 0.51171529\n",
" 0.35814472 0.60647678]\n",
" [0.45569665 0.4424509 0.55431938 0.58155932 0.6349081 0.72126667\n",
" 0.57716369 0.65838475]\n",
" [0.98039251 0.11772363 0.35284311 0.39943482 0.43790564 0.83034389\n",
" 0.44518352 0.43990833]]\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5JrsFkzuX-hI"
},
"source": [
"<a name='5'></a>\n",
"## 4. Backpropogation in convolutional layer"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oacfM8GAeOcT"
},
"source": [
"This part of the notebook is devoted to backward propagation in
convolutional neural networks. All modern DL frameworks allow you to implement only
forward propogation and take care of the backprop themselves. But if you want to
understand more deeply how backprop in a convolutional network looks like, then
this part is for you."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gt1gxuFBf9NC"
},
"source": [
"### **Theoretical aspects of backpropogation in CNNs**\n",
"#### **Computing dA**\n",
"This is the formula for computing $dA$ with respect to the cost for a
certain kernel $W_c$ and a given training example:\n",
"\n",
"$$dA \\mathrel{+}= \\sum _{h=0} ^{n_H} \\sum_{w=0} ^{n_W} W_c \\times
dZ_{hw}$$\n",
"\n",
"Where $W_c$ is a convolution kernel and $dZ_{hw}$ is a scalar
corresponding to the gradient of the cost with respect to the output of the conv
layer Z at the $h_{th}$ row and $w_{th}$ column.\n",
"\n",
"\n",
"#### **Computing dW**\n",
"This is the formula for computing $dW_c$ ($dW_c$ is the derivative of one
kernel) with respect to the loss:\n",
"\n",
"$$dW_c \\mathrel{+}= \\sum _{h=0} ^{n_H} \\sum_{w=0} ^ {n_W} a_{slice} \\
times dZ_{hw}$$\n",
"\n",
"Where $a_{slice}$ corresponds to the slice which was used to generate the
activation $Z_{hw}$.\n",
"\n",
"#### **Computing db**\n",
"\n",
"This is the formula for computing $db$ with respect to the cost for a
certain kernel $W_c$:\n",
"\n",
"$$db = \\sum _{h=0} ^{n_H} \\sum_{w=0} ^ {n_W} dZ_{hw}$$"
]
},
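{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is a tiny 1D sketch of our own (not part of Task 7) showing that the $dW$ formula above agrees with a numerical derivative. It uses `np.correlate`, which computes exactly the sliding dot product used in CNN convolution, and the loss $L = \\sum_h Z_h$, for which every $dZ_h = 1$:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Illustration only: check the dW formula in 1D against finite differences\n",
"x = np.array([1.0, 2.0, 3.0, 4.0])    # 1D \"activation\"\n",
"w = np.array([0.5, -1.0])             # 1D kernel\n",
"z = np.correlate(x, w, mode='valid')  # forward pass, shape (3,)\n",
"dZ = np.ones_like(z)                  # dL/dZ for L = sum(Z)\n",
"\n",
"# Analytic gradient from the dW formula: accumulate (input slice) * dZ_h\n",
"dw = np.zeros_like(w)\n",
"for h in range(z.size):\n",
"    dw += x[h:h + w.size] * dZ[h]\n",
"\n",
"# Numerical gradient by central finite differences\n",
"eps = 1e-7\n",
"dw_num = np.zeros_like(w)\n",
"for k in range(w.size):\n",
"    w_p, w_m = w.copy(), w.copy()\n",
"    w_p[k] += eps\n",
"    w_m[k] -= eps\n",
"    dw_num[k] = (np.correlate(x, w_p, mode='valid').sum()\n",
"                 - np.correlate(x, w_m, mode='valid').sum()) / (2 * eps)\n",
"\n",
"print(dw, dw_num)  # the two should match closely"
],
"execution_count": null,
"outputs": []
},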
{
"cell_type": "markdown",
"metadata": {
"id": "t8YpQbf2GSCg"
},
"source": [
"<a name='5.1'></a>\n",
"#### **TASK 7 (implement the convolution function - backward pass)**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "qx8Ycl6cGW7h"
},
"source": [
"# YOU SHOULD CHANGE THE CODE OF THIS CELL\n",
"def conv_backward(dZ, cache):\n",
" \"\"\"\n",
" Implement the backward propagation for a convolution function\n",
"\n",
" Required parameters:\n",
" dZ -- gradient of the cost with respect to the output of the conv
layer (Z), numpy array of shape (m, n_H, n_W, n_C)\n",
" cache -- cache of values needed for the conv_backward(), output of
conv_forward()\n",
"\n",
" Returned value:\n",
" dA_prev -- gradient of the cost with respect to the input of the conv
layer (A_prev),\n",
" numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)\n",
" dW -- gradient of the cost with respect to the weights of the conv
layer (W)\n",
" numpy array of shape (f, f, n_C_prev, n_C)\n",
" db -- gradient of the cost with respect to the biases of the conv
layer (b)\n",
" numpy array of shape (1, 1, 1, n_C)\n",
" \"\"\"\n",
"\n",
" # WRITE YOUR CODE HERE\n",
" # Get information from \"cache\"\n",
" (A_prev, W, b, hparameters) = None\n",
"\n",
" # Get dimensions from A_prev's shape\n",
" (m, n_H_prev, n_W_prev, n_C_prev) = None\n",
"\n",
" # Get dimensions from W's shape\n",
" (f, f, n_C_prev, n_C) = None\n",
"\n",
" # Get stride and pad\n",
" stride = None\n",
" pad = None\n",
"\n",
" # Get dimensions from dZ's shape\n",
" (m, n_H, n_W, n_C) = Z.shape\n",
"\n",
" # Initialize dA_prev, dW, db with the correct shapes\n",
" dA_prev = None\n",
" dW = None\n",
" db = None\n",
"\n",
" # Pad A_prev and dA_prev\n",
" A_prev_pad = None\n",
" dA_prev_pad = None\n",
"\n",
" for i in range(m): # loop over the training
examples\n",
"\n",
" # Select ith training example from A_prev_pad and dA_prev_pad\n",
" a_prev_pad = None\n",
" da_prev_pad = None\n",
"\n",
" for h in range(n_H):\n",
" for w in range(n_W):\n",
" for c in range(n_C):\n",
"\n",
" # Find the corners of the current \"slice\"\n",
" vert_start = None\n",
" vert_end = None\n",
" horiz_start = None\n",
" horiz_end = None\n",
"\n",
" # Define the slice from a_prev_pad\n",
" a_slice = None\n",
"\n",
" # Update gradients for the window and the kernel's
parameters (according to formulas given above)\n",
" da_prev_pad[vert_start:vert_end,
horiz_start:horiz_end, :] += None\n",
" dW[:,:,:,c] += None\n",
" db[:,:,:,c] += None\n",
"\n",
"\n",
" # Set the ith training example's dA_prev to the unpadded
da_prev_pad\n",
" # Hint: you can use array slicing for that\n",
" dA_prev[i, :, :, :] = None\n",
"\n",
" # Making sure your output shape is correct\n",
" assert(dA_prev.shape == (m, n_H_prev, n_W_prev, n_C_prev))\n",
"\n",
" return dA_prev, dW, db"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZQCsn8BS1OVr"
},
"source": [
"You can check yourself using the code bellow. All assertions must be
fulfilled."
]
},
{
"cell_type": "code",
"metadata": {
"id": "uooe0WK8owQR"
},
"source": [
"# DO NOT CHANGE THE CODE OF THIS CELL\n",
"np.random.seed(39)\n",
"A_prev = A_relu_maxpool\n",
"W = np.random.randn(2, 2, 8, 8)\n",
"b = np.random.randn(1, 1, 1, 8)\n",
"hparameters = {\"pad\" : 2,\n",
" \"stride\": 1}\n",
"Z, cache_conv = conv_forward(A_prev, W, b, hparameters)\n",
"\n",
"# Test conv_backward\n",
"dA, dW, db = conv_backward(Z, cache_conv)\n",
"\n",
"print(\"dA_mean =\", np.mean(dA))\n",
"print(\"dW_mean =\", np.mean(dW))\n",
"print(\"db_mean =\", np.mean(db))\n",
"\n",
"assert type(dA) == np.ndarray, \"Output must be a np.ndarray\"\n",
"assert type(dW) == np.ndarray, \"Output must be a np.ndarray\"\n",
"assert type(db) == np.ndarray, \"Output must be a np.ndarray\"\n",
"\n",
"assert dA.shape == (2, 3, 5, 8), f\"Wrong shape for dA {dA.shape} != (2,
3, 5, 8)\"\n",
"assert dW.shape == (2, 2, 8, 8), f\"Wrong shape for dW {dW.shape} != (2,
2, 8, 8)\"\n",
"assert db.shape == (1, 1, 1, 8), f\"Wrong shape for db {db.shape} != (1,
1, 1, 8)\"\n",
"\n",
"assert np.isclose(np.mean(dA), 431.18801309512406), \"Wrong values for
dA\"\n",
"assert np.isclose(np.mean(dW), -630.649747716205), \"Wrong values for
dW\"\n",
"assert np.isclose(np.mean(db), -127.35125113717262), \"Wrong values for
db\"\n",
"\n",
"print(\"All tests passed.\")"
],
"execution_count": null,
"outputs": []
},
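{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an extra, ungraded sanity check (our own addition): once `conv_forward` and `conv_backward` are implemented, you can compare `db` against a numerical gradient of the scalar loss $L = \\sum Z$, for which $dZ$ is an array of ones:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Ungraded sanity check: numerical gradient of db for L = sum(Z)\n",
"np.random.seed(39)\n",
"A_prev_chk = np.random.randn(1, 4, 4, 2)\n",
"W_chk = np.random.randn(2, 2, 2, 3)\n",
"b_chk = np.random.randn(1, 1, 1, 3)\n",
"hparams_chk = {\"pad\": 1, \"stride\": 1}\n",
"\n",
"Z_chk, cache_chk = conv_forward(A_prev_chk, W_chk, b_chk, hparams_chk)\n",
"_, _, db_chk = conv_backward(np.ones_like(Z_chk), cache_chk)\n",
"\n",
"eps = 1e-7\n",
"db_num = np.zeros_like(b_chk)\n",
"for c in range(b_chk.shape[3]):\n",
"    b_p, b_m = b_chk.copy(), b_chk.copy()\n",
"    b_p[0, 0, 0, c] += eps\n",
"    b_m[0, 0, 0, c] -= eps\n",
"    Z_p, _ = conv_forward(A_prev_chk, W_chk, b_p, hparams_chk)\n",
"    Z_m, _ = conv_forward(A_prev_chk, W_chk, b_m, hparams_chk)\n",
"    db_num[0, 0, 0, c] = (np.sum(Z_p) - np.sum(Z_m)) / (2 * eps)\n",
"\n",
"print('max |db - db_num|:', np.abs(db_chk - db_num).max())"
],
"execution_count": null,
"outputs": []
},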
{
"cell_type": "markdown",
"metadata": {
"id": "biDxChgYqzE1"
},
"source": [
"### **Postscript**\n",
"\n",
"Structure and tasks of the notebook are inspired by the Andrew Ng's
course: \"Convolutional Neural Networks\"."
]
}
]
}