{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5a7abb98-5b7e-45fa-9c9a-dd7640ecf9a8",
   "metadata": {},
   "source": [
    "# Getting started with numpy\n",
    "\n",
    "In this lab exercise we present the different functionnalities of numpy, particularly its algebra library. This exercise objectives are to train you how to access, multiply and apply other operations on numpy matrix.\n",
    "\n",
    "\n",
    "## Create a numpy array\n",
    "First, the base type we will use in the remaining is the `numpy.array`object which is a multi-dimensional array (or a tensor). You can create an array giving a python list in arguments but also initialise it given specific shape, or instanciate it with zeros or ones.\n",
    "\n",
    "* `np.array([[0,1,2], [3,4,5]])` create a 2-dimensional array having two lines and 3 columns\n",
    "* `np.ndarray((5,3,4))` create a 3-dimensional array with shape 5, 3 and 4 (5 lines, 3 columns, 4 entries)\n",
    "* `np.zeros(5,3,4)` create a 3-dimensional array with shape 5, 3 and 4 (5 lines, 3 columns, 4 entries) where each componenent is set two zeros\n",
    "\n",
    "You also can fill the array with a specific value using the inplace operation `.fill(value)`. To access information on the different dimensions (size) you can use the array attribute `shape`.\n",
    "\n",
    "\n",
    "## Access sub-part of the array\n",
    "To access sub-part of a numpy array you can use index  surrounded by [], you can index on the different dimenssion using the comma between each dimensions:\n",
    "\n",
    "* `x[1,5,3]` will select the element in the first line, the fith column and the third element on the array\n",
    "\n",
    "In addition you can select slice, e.g. select the lines between a and b, using the `:` symbol:\n",
    "\n",
    "* `x[2:5]` will select the lines 2, 3 and 4\n",
    "* `x[2:]` will select elements from the second line to the last\n",
    "*  `x[:-2]` will select the lines from the first to the last third.\n",
    "\n",
    "We can combine the two last notations to select slices on different dimension:\n",
    "* `x[2:5, :, :3]` will select the elements between the lines 2 to 5 and the first three elements on the last dimension\n",
    "\n",
    "## Transpose dimension\n",
    "To transpose dimension you can use the `transpose` numpy method taking in argument the index of the transposed dimensions\n",
    "\n",
    "## Reshape a tensor\n",
    "You can use `np.reshape` on a tensor to reshape the dimension of a tensor.\n",
    "\n",
    "\n",
    "\n",
    "### Know the signature of functions (and code) using notebook\n",
    "\n",
    "If you have a doubt on parameters for a specific function, in a notebook you can use `? my_function` to get the signature of the function or `?? my_function` to get the whole code of  my_function !!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e048b0a5-3281-4f59-b961-bd5a6f19a5f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import math\n",
    "import matplotlib"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "86c98e10-2a7f-4410-b4d7-79940fff85e7",
   "metadata": {},
   "source": [
    "## Exercise 1 : Manipulate np.array\n",
    "\n",
    "**Question 1:** In the following create a variable `np_minus_three` containing a `np.array`of shape (2,5,10,2) (4 dimensions), where each componenent has the value `-3`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6fc695f2-1cb0-494b-a858-54c941b18532",
   "metadata": {},
   "outputs": [],
   "source": [
    "np_minus_three = "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f55305d6-fc4e-44a3-9c8e-2a436f5db9b1",
   "metadata": {},
   "outputs": [],
   "source": [
    "assert(np.all(np_minus_three == -3))\n",
    "assert(np_minus_three.shape == (2,5,10,2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ad96fce-616d-4eaa-a72b-6e3bd7dff999",
   "metadata": {},
   "source": [
    "**Question 2:** In the following create a variable `np_range` containing a `np.array`of shape (2,5) (2 dimensions), where the array contains the list of integer from 0 to 9 (included). The first line containing integer for 0 to 5, the second from 5 to 9. You should use the functions `np.arange` and the method `reshape` (**and no list!!!**).\n",
    "\n",
    "The matrix you should obtain is the following :\n",
    " $$np\\_range= \\begin{pmatrix}\n",
    "    0 & 1 & 2 & 3& 4  \\\\\n",
    "    5 & 6 & 7 & 8 & 9    \\\\\n",
    "\\end{pmatrix}$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c11e77a3-7107-4de9-a5c3-0671c6b7208f",
   "metadata": {},
   "outputs": [],
   "source": [
    "np_range = "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa2c66e0-7221-44d5-8a77-c466a163d371",
   "metadata": {},
   "outputs": [],
   "source": [
    "assert(np.all(np_range == np.array([[0, 1, 2, 3, 4],\n",
    "       [5, 6, 7, 8, 9]])))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d58051fe-0309-41de-b86d-9b534f98aa7d",
   "metadata": {},
   "source": [
    "**Question 3:** Similarly create a variable `np_range_large` containing a `np.array`of shape (2,5,10,2) (4 dimensions), where the array contains the list of integer from 0 to 199 (included)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "161d677e-9013-4bc2-a2e5-e424fd206361",
   "metadata": {},
   "outputs": [],
   "source": [
    "np_range_large = "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d3cb52d-0255-4580-aaba-14bc2fca633b",
   "metadata": {},
   "outputs": [],
   "source": [
    "hash = '5681936f8523b54ad2d46bf26e0bc6e6bc7379ce84eb0d78846db079596eabeb'\n",
    "import hashlib\n",
    "h = hashlib.new('sha256')\n",
    "h.update(str(np_range_large).encode('utf-8'))\n",
    "assert(hash == h.hexdigest())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d850207e-2e65-4be6-9b60-1c33911c40d3",
   "metadata": {},
   "source": [
    "**Question 4:** From the previous array select the columns 3, 4 and 5 into the variable `np_sub`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9797ebf9-c88e-4e5b-b918-51b70c579d06",
   "metadata": {},
   "outputs": [],
   "source": [
    "np_sub = "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5187f18a-a228-404f-ac5c-735717444270",
   "metadata": {},
   "outputs": [],
   "source": [
    "hash = 'a1d90b458e5c3576b11b280526e27ecf0afa506f0238cf19f44d773f1d98330d'\n",
    "h = hashlib.new('sha256')\n",
    "h.update(str(np_sub).encode('utf-8'))\n",
    "assert(hash == h.hexdigest())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51030b96-545b-49af-b04e-4ff3c3b3e6da",
   "metadata": {},
   "source": [
    "**Question 5:** Let us consider variable $x \\in \\mathbb{R}^2$ and $W \\in \\mathbb{R}^{2 \\times 4}$, compute the dot product in $y$ using a loop and the function `np.sum()`. Why the objective of the line 2 (`np.random.seed(42)`)?  What is $W$ in the case of a neural network layer?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68d1bae1-ea1f-448f-999b-737c1c1e0050",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "\n",
    "x = np.array([2.5, -6.2])\n",
    "W = np.random.rand(2,4)\n",
    "y = np.zeros(4)\n",
    "\n",
    "raise NotImplementedError(\"Answer the question\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "66a92a4f-480f-4581-acf0-08863d3c25bb",
   "metadata": {},
   "outputs": [],
   "source": [
    "hash = '70145392ec18208eb6aaccc15a752d819f10fa9c3dfcd2515d90acaa4a90b952'\n",
    "h = hashlib.new('sha256')\n",
    "h.update(str(y).encode('utf-8'))\n",
    "assert(hash == h.hexdigest())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ed27e8f-d9d8-4644-bee9-d888bbaf7b38",
   "metadata": {},
   "source": [
    "**Question 6:** Let us consider variable $X \\in \\mathbb{R}^{10 \\times 2}$, compute the dot product in $Y$ using a loop and the function `np.sum()` and the *Hadamard product* `*` for each line of $X$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ebe5354d-b8a8-45f6-988f-872886674b93",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "\n",
    "X = np.random.rand(10,2)\n",
    "W = np.random.rand(2,4)\n",
    "Y = np.zeros((10,4))\n",
    "\n",
    "raise NotImplementedError(\"Answer the question\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ea8b0e1-3602-470f-9b82-8da910af3c55",
   "metadata": {},
   "outputs": [],
   "source": [
    "hash = '2f86a4d2697294307ab0099164c2aeeb4315876b97aba275f73cfac42d345d50'\n",
    "h = hashlib.new('sha256')\n",
    "h.update(str(Y).encode('utf-8'))\n",
    "assert(hash == h.hexdigest())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eee7b375-5a1c-4297-9808-e935250eb419",
   "metadata": {},
   "source": [
    "## Exercise 2 : Matrix operation\n",
    "\n",
    "The scalar product as implemented canbe costly, particularly in python where loops are poorly optimized. To this end a majority of algebra like libraries implement matrix operation using a C backend. \n",
    "Numpy also propose matrix operation, particularly it implement the matrix multiplication, the hadamard product and some other optimized matrix operations.\n",
    "\n",
    "### Hadamard product\n",
    "The Hadamard produc given two matrix/tensors $A,B \\in \\mathbb{R}^{n  \\times m}$ is denoted by $\\odot$ where:\n",
    "$$(A\\odot B)_{i,j} = A_{i, j} B_{i,j}$$\n",
    "Thus (with all matrix):\n",
    "$$A \\odot B = \n",
    "\\begin{pmatrix}\n",
    "A_{1,1}B_{1,1} & A_{1,2}B_{1,2} & \\ldots & A_{1,m}B_{1,m} \\\\\n",
    "A_{2,1}B_{2,1} & A_{2,2}B_{2,2} & \\ldots & A_{2,m}B_{2,m} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "A_{n,1}B_{n,1} & A_{n,2}B_{n,2} & \\ldots & A_{n,m}B_{n,m} \\\\\n",
    "\\end{pmatrix}\n",
    "$$\n",
    "\n",
    "In numpy this operation is using the classical python multiplication symbol `*`, `A * B` being the Hadamard product of `A` and `B`.\n",
    "\n",
    "### Dot Product \n",
    "The Dot product is two matrix/tensors $A\\in \\mathbb{R}^{n  \\times m}$ and $B \\in \\mathbb{R}^{m  \\times l}$ is denoted by \"$.$\" where:\n",
    "$$(A . B)_{i, j} = \\sum\\limits_{k=1}^{m} A_{i, k} B_{k,j}$$\n",
    "Thus (picturing the whole matrix):\n",
    "$$A . B = \n",
    "\\begin{pmatrix}\n",
    "\\sum\\limits_{k=1}^{m} A_{1, k} B_{k,1} & \\sum\\limits_{k=1}^{m} A_{1, k} B_{k,2} & \\sum\\limits_{k=1}^{m} A_{1, k} B_{k,l}\\\\\n",
    "\\sum\\limits_{k=1}^{m} A_{2, k} B_{k,1} & \\sum\\limits_{k=1}^{m} A_{2, k} B_{k,2} & \\sum\\limits_{k=1}^{m} A_{2, k} B_{k,l} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\sum\\limits_{k=1}^{m} A_{n, k} B_{k,1} & \\sum\\limits_{k=1}^{m} A_{m, k} B_{k,2} & \\sum\\limits_{k=1}^{m} A_{m, k} B_{k,l} \\\\\n",
    "\\end{pmatrix}\n",
    "$$\n",
    "\n",
    "In numpy this operation is use `@` operator, `A@B` being the dot product of `A` and `B`.\n",
    "\n",
    "### Outer Product (on vectors)\n",
    "The outer product between two vectors $u\\in \\mathbb{R}^{n}$ and $v \\in \\mathbb{R}^{m}$ is denoted by $\\otimes_{outer}$ where:\n",
    "$$(u \\otimes_{outer} v)_{i, j} = u_{i} v_{j}$$\n",
    "Thus (picturing the whole matrix):\n",
    "$$u\\otimes_{outer}v =  \\begin{pmatrix}\n",
    "u_{1} \\\\\n",
    "u_{2}\\\\\n",
    "\\vdots \\\\\n",
    "u_{n}\\\\\n",
    "\\end{pmatrix}\n",
    "\\otimes_{outer}\n",
    "\\begin{pmatrix}\n",
    "v_{1} &v_{2} & \\ldots &v_{m} \\\\\n",
    "\\end{pmatrix} =\n",
    "\\begin{pmatrix}\n",
    "u_{1}v_{1} & u_{1}v_{2} & \\ldots & u_{1}v_{m} \\\\\n",
    "u_{2}v_{1} & u_{2}v_{2} & \\ldots & u_{2}v_{m} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "u_{n}v_{1} & u_{n}v_{2} & \\ldots & u_{n}v_{m} \\\\\n",
    "\\end{pmatrix}\n",
    "$$\n",
    "\n",
    "In numpy this operation is implemented with the `np.outer` function.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "160c8713-1e16-4350-b0ba-fbefdf954063",
   "metadata": {},
   "source": [
    "**Question 1:** Let us consider variable $x \\in \\mathbb{R}^2$ and $W \\in \\mathbb{R}^{2 \\times 4}$, compute $y$ resulting of the linear application (W) on $x$.\n",
    "\n",
    "NB: You can look at the transpose method of numpy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7869e13e-3871-471c-b8c2-da189f0099e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "\n",
    "x = np.array([2.5, -6.2])\n",
    "W = np.random.rand(2,4)\n",
    "raise NotImplementedError(\"Answer the question\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8e0f473-771a-4081-9df6-383f66f40831",
   "metadata": {},
   "outputs": [],
   "source": [
    "hash = '70145392ec18208eb6aaccc15a752d819f10fa9c3dfcd2515d90acaa4a90b952'\n",
    "h = hashlib.new('sha256')\n",
    "h.update(str(y).encode('utf-8'))\n",
    "assert(hash == h.hexdigest())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ea5f132-db14-4020-981b-3a499bbd8a47",
   "metadata": {},
   "source": [
    "**Question 2:** Let us consider variable $X \\in \\mathbb{R}^{10 \\times 2}$, compute the dot product  for each line of $X$ with $W$ into $Y$ with only one loop!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28430f3b-badd-439f-a404-cd9b39aa5548",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "\n",
    "X = np.random.rand(10,2)\n",
    "W = np.random.rand(2,4)\n",
    "Y = np.zeros((10,4))\n",
    "\n",
    "# TODO\n",
    "raise NotImplementedError(\"Answer the question\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f79a918e-bf1c-44d6-b1b9-b9cdbfb20433",
   "metadata": {},
   "source": [
    "**Question 3:** Let us consider variable $X \\in \\mathbb{R}^{10 \\times 2}$, compute the dot of $X$ with $W$ into $Y$ with no loop!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "918de122-41e7-403b-93a5-a28113967e64",
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError(\"Answer the question\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6b12aefb-a7bf-4ea4-a01d-ecaf9ca059ce",
   "metadata": {},
   "source": [
    "## Exercise 3 : Apply a function\n",
    "\n",
    "Numpy propose a large amount of [built-in functions](https://numpy.org/doc/stable/reference/routines.math.html) e.g. cosinus, tanh... By default those functions are applied piecewise. \n",
    "\n",
    "**Question 1:** In this exercise the objective is to build a function $\\mathcal{f} : \\mathbb{R}^n \\to ]0,1[$ such that \n",
    "$$\n",
    "\\mathcal{f}(x) = \\frac{1}{1 + e^{-(w^{\\intercal}x + b)}}\n",
    "$$\n",
    "With $w \\in \\mathbb{R}^{n}$, $x\\in \\mathbb{R}^{n}$ and $b \\in \\mathbb{R}$.\n",
    "Create an object having as attribute $w$ and $b$ implementing the function in a method called forward"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "874c4308-1823-4079-bb00-c086bf0b9e5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "class LinearClassifier(object):\n",
    "    def __init__(self, dim=5):\n",
    "        self.w = np.random.rand((dim))\n",
    "        self.b = random.random()\n",
    "    \n",
    "    def forward(self, x):\n",
    "        raise NotImplementedError('You should implement this function (return f(x))')    \n",
    "\n",
    "dim = 5\n",
    "x = np.random.rand((5))\n",
    "f_x = LinearClassifier().forward(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b783941b-7568-4aac-b2a9-bd8c5677b2aa",
   "metadata": {},
   "source": [
    "**Question 2:** In this exercise the objective is to build a function $\\mathcal{g} : \\mathbb{R} \\times \\mathbb{R} \\to \\mathbb{R}$ such that \n",
    "$$\n",
    "\\mathcal{g}(x, y) = -(ylog(x) + (1-y) log(1-x))\n",
    "$$\n",
    "Create an object implementing the function in a method called forward"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9fb940f0-54d5-427b-955d-116a42084215",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "class CrossEntropy(object):\n",
    "    def __init__(self):\n",
    "        pass\n",
    "    \n",
    "    def forward(self, x, y):\n",
    "        raise NotImplementedError('You should implement this function (return f(x))')    "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}