ASPP 2024 material

2024-08-27 15:27:53 +03:00 · commit 1f6bc07c51
90 changed files with 91689 additions and 0 deletions
--- a/exercises/numpy_broadcasting_extra/broadcasting.ipynb
+++ b/exercises/numpy_broadcasting_extra/broadcasting.ipynb
@@ -0,0 +1,374 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "282817dd",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T20:08:23.900532Z",
+     "start_time": "2023-06-27T20:08:22.963157Z"
+    },
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def print_info(a):\n",
+    "    \"\"\"Print an array's contents, followed by its metadata.\"\"\"\n",
+    "\n",
+    "    txt = f\"\"\"\n",
+    "dtype\\t{a.dtype}\n",
+    "ndim\\t{a.ndim}\n",
+    "shape\\t{a.shape}\n",
+    "strides\\t{a.strides}\n",
+    "    \"\"\"\n",
+    "\n",
+    "    print(a)\n",
+    "    print(txt)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6cd0f8cf",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "<font size=9> Mind-on exercises </font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "acba732f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Exercise 1: warm up\n",
+    "\n",
+    "```What is the expected output shape for each operation?```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a41d0f74",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.881059Z",
+     "start_time": "2023-06-27T19:58:57.830Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.arange(5)\n",
+    "b = 5\n",
+    "\n",
+    "np.shape(a-b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6f82a2fb",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.884966Z",
+     "start_time": "2023-06-27T19:58:57.833Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.ones((7, 1))\n",
+    "b = np.arange(7)\n",
+    "np.shape(a*b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "808095ad",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.888119Z",
+     "start_time": "2023-06-27T19:58:57.836Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.random.randint(0, 50, (2, 3, 3))\n",
+    "b = np.random.randint(0, 10, (3, 1))\n",
+    "\n",
+    "np.shape(a-b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d9a12a90",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.891462Z",
+     "start_time": "2023-06-27T19:58:57.839Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.arange(100).reshape(10, 10)\n",
+    "b = np.arange(1, 10)\n",
+    "\n",
+    "np.shape(a+b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "69632f95",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Exercise 2\n",
+    "\n",
+    "```\n",
+    "1. Create a random 2D array of shape (5, 3)\n",
+    "2. Calculate the maximum value of each row\n",
+    "3. Divide each row by its maximum\n",
+    "```\n",
+    "\n",
+    "Remember to use broadcasting: NO FOR LOOPS!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "54e2a53e",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.894433Z",
+     "start_time": "2023-06-27T19:58:57.843Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "## Your code here"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b9facc0f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Exercise 3"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7e8156d0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Task: Find the **cluster** closest to the **observation**.\n",
+    "\n",
+    "Again, use broadcasting: DO NOT iterate cluster by cluster."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2969994e",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.899204Z",
+     "start_time": "2023-06-27T19:58:57.847Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "observation = np.array([30.0, 99.0])  # observation: a single (x, y) point\n",
+    "\n",
+    "# Clusters: one (x, y) coordinate pair per row\n",
+    "clusters = np.array([[102.0, 203.0],\n",
+    "                     [132.0, 193.0],\n",
+    "                     [45.0, 155.0],\n",
+    "                     [57.0, 173.0]])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f13352ff",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Let's plot this data.\n",
+    "\n",
+    "In the plot below, **+** is the observation and the dots are the cluster coordinates."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b9f6b5cf",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.906715Z",
+     "start_time": "2023-06-27T19:58:57.850Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "plt.scatter(clusters[:, 0], clusters[:, 1])  # scatter plot of the clusters\n",
+    "for n, x in enumerate(clusters):\n",
+    "    print(f'cluster {n}')\n",
+    "    plt.annotate(f'cluster{n}', (x[0], x[1]))  # label each cluster\n",
+    "plt.plot(observation[0], observation[1], '+');  # plot the observation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4f9b84e2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "As the plot shows, the closest cluster is **2**. Your task is to write a function that computes this."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8aea6781",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-26T19:25:08.202848Z",
+     "start_time": "2023-06-26T19:25:08.194923Z"
+    }
+   },
+   "source": [
+    "\n",
+    "**Hint:** Compute the distance between the observation and each row of `clusters`. The observation belongs to the cluster whose row has the smallest distance.\n",
+    "\n",
+    "distance = $\\sqrt {\\left( {x_1 - x_2 } \\right)^2 + \\left( {y_1 - y_2 } \\right)^2 }$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ea8a7240",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-06-27T19:58:58.916610Z",
+     "start_time": "2023-06-27T19:58:57.854Z"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "## Your code here"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "beaee243",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Sources + Resources\n",
+    "\n",
+    "ASPP 2016 - Stéfan van der Walt - https://github.com/ASPP/2016_numpy\n",
+    "\n",
+    "Basic NumPy: http://scipy-lectures.org/intro/numpy/index.html\n",
+    "\n",
+    "Advanced NumPy: http://scipy-lectures.org/advanced/advanced_numpy/index.html\n",
+    "\n",
+    "NumPy chapter in \"Python Data Science Handbook\": https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html"
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.3"
+  },
+  "rise": {
+   "scroll": true
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}