{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "282817dd",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T20:08:23.900532Z",
     "start_time": "2023-06-27T20:08:22.963157Z"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def print_info(a):\n",
    "    \"\"\" Print the content of an array, and its metadata. \"\"\"\n",
    "    \n",
    "    txt = f\"\"\"\n",
    "dtype\\t{a.dtype}\n",
    "ndim\\t{a.ndim}\n",
    "shape\\t{a.shape}\n",
    "strides\\t{a.strides}\n",
    "    \"\"\"\n",
    "\n",
    "    print(a)\n",
    "    print(txt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6cd0f8cf",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "<font size=9> Mind-on exercises </font>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acba732f",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Exercise 1: warm up\n",
    "\n",
    "```What is the expected output shape for each operation?```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a41d0f74",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.881059Z",
     "start_time": "2023-06-27T19:58:57.830Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "a = np.arange(5)\n",
    "b = 5\n",
    "\n",
    "np.shape(a-b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6f82a2fb",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.884966Z",
     "start_time": "2023-06-27T19:58:57.833Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "a = np.ones((7, 1))\n",
    "b = np.arange(7)\n",
    "np.shape(a*b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "808095ad",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.888119Z",
     "start_time": "2023-06-27T19:58:57.836Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "a = np.random.randint(0, 50, (2, 3, 3))\n",
    "b = np.random.randint(0, 10, (3, 1))\n",
    "\n",
    "np.shape(a-b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9a12a90",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.891462Z",
     "start_time": "2023-06-27T19:58:57.839Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "a = np.arange(100).reshape(10, 10)\n",
    "b = np.arange(1, 10)\n",
    "\n",
    "np.shape(a+b)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69632f95",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Exercise 2\n",
    "\n",
    "```\n",
    "1. Create a random 2D array of dimension (5, 3)\n",
    "2. Calculate the maximum value of each row\n",
    "3. Divide each row by its maximum\n",
    "```\n",
    "\n",
    "Remember to use broadcasting : NO FOR LOOPS!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "54e2a53e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.894433Z",
     "start_time": "2023-06-27T19:58:57.843Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "## Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9facc0f",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Exercise 3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e8156d0",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Task: Find the closest **cluster** to the **observation**. \n",
    "\n",
    "Again, use broadcasting: DO NOT iterate cluster by cluster"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2969994e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.899204Z",
     "start_time": "2023-06-27T19:58:57.847Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "observation = np.array([30.0, 99.0]) #Observation\n",
    "\n",
    "#Clusters\n",
    "clusters = np.array([[102.0, 203.0],\n",
    "             [132.0, 193.0],\n",
    "            [45.0, 155.0], \n",
    "            [57.0, 173.0]])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f13352ff",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Lets plot this data\n",
    "\n",
    "In the plot below, **+** is the observation and dots are the cluster coordinates"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9f6b5cf",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.906715Z",
     "start_time": "2023-06-27T19:58:57.850Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt \n",
    "\n",
    "plt.scatter(clusters[:, 0], clusters[:, 1]) #Scatter plot of clusters\n",
    "for n, x in enumerate(clusters):\n",
    "    print('cluster %d' %n)\n",
    "    plt.annotate('cluster%d' %n, (x[0], x[1])) #Label each cluster\n",
    "plt.plot(observation[0], observation[1], '+'); #Plot observation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f9b84e2",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Closest cluster as seen by the plot is **2**. Your task is to write a function to calculate this"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aea6781",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-26T19:25:08.202848Z",
     "start_time": "2023-06-26T19:25:08.194923Z"
    }
   },
   "source": [
    "\n",
    "**hint:** Find the distance between the observation and each row in the cluster. The cluster to which the observation belongs to is the row with the minimum distance.\n",
    "\n",
    "distance = $\\sqrt {\\left( {x_1 - x_2 } \\right)^2 + \\left( {y_1 - y_2 } \\right)^2 }$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea8a7240",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-27T19:58:58.916610Z",
     "start_time": "2023-06-27T19:58:57.854Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "## Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "beaee243",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "## Sources + Resources\n",
    "\n",
    "ASPP 2016 - Stéfan van der Walt - https://github.com/ASPP/2016_numpy\n",
    "\n",
    "Basic Numpy: http://scipy-lectures.org/intro/numpy/index.html\n",
    "\n",
    "Advanced Numpy: http://scipy-lectures.org/advanced/advanced_numpy/index.html\n",
    "\n",
    "Numpy chapter in \"Python Data Science Handbook\" https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  },
  "rise": {
   "scroll": true
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}