|
{
|
||
|
"cells": [
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "282817dd",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T20:08:23.900532Z",
|
||
|
"start_time": "2023-06-27T20:08:22.963157Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "skip"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import numpy as np\n",
|
||
|
"\n",
|
||
|
"def print_info(a):\n",
|
||
|
" \"\"\" Print the content of an array, and its metadata. \"\"\"\n",
|
||
|
" \n",
|
||
|
" txt = f\"\"\"\n",
|
||
|
"dtype\\t{a.dtype}\n",
|
||
|
"ndim\\t{a.ndim}\n",
|
||
|
"shape\\t{a.shape}\n",
|
||
|
"strides\\t{a.strides}\n",
|
||
|
" \"\"\"\n",
|
||
|
"\n",
|
||
|
" print(a)\n",
|
||
|
" print(txt)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "6cd0f8cf",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "slide"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"<font size=9> Mind-on exercises </font>"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "acba732f",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "slide"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"### Exercise 1: warm up\n",
|
||
|
"\n",
|
||
|
"```What is the expected output shape for each operation?```"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "a41d0f74",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.881059Z",
|
||
|
"start_time": "2023-06-27T19:58:57.830Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"a = np.arange(5)\n",
|
||
|
"b = 5\n",
|
||
|
"\n",
|
||
|
"np.shape(a-b)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "6f82a2fb",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.884966Z",
|
||
|
"start_time": "2023-06-27T19:58:57.833Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"a = np.ones((7, 1))\n",
|
||
|
"b = np.arange(7)\n",
|
||
|
"np.shape(a*b)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "808095ad",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.888119Z",
|
||
|
"start_time": "2023-06-27T19:58:57.836Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"a = np.random.randint(0, 50, (2, 3, 3))\n",
|
||
|
"b = np.random.randint(0, 10, (3, 1))\n",
|
||
|
"\n",
|
||
|
"np.shape(a-b)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "d9a12a90",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.891462Z",
|
||
|
"start_time": "2023-06-27T19:58:57.839Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"a = np.arange(100).reshape(10, 10)\n",
|
||
|
"b = np.arange(1, 10)\n",
|
||
|
"\n",
|
||
|
"np.shape(a+b)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "69632f95",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "slide"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"### Exercise 2\n",
|
||
|
"\n",
|
||
|
"```\n",
|
||
|
"1. Create a random 2D array of shape (5, 3)\n",
|
||
|
"2. Calculate the maximum value of each row\n",
|
||
|
"3. Divide each row by its maximum\n",
|
||
|
"```\n",
|
||
|
"\n",
|
||
|
"Remember to use broadcasting: NO FOR LOOPS!"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "54e2a53e",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.894433Z",
|
||
|
"start_time": "2023-06-27T19:58:57.843Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"## Your code here"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "b9facc0f",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "slide"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"### Exercise 3"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "7e8156d0",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"Task: Find the closest **cluster** to the **observation**. \n",
|
||
|
"\n",
|
||
|
"Again, use broadcasting: DO NOT iterate cluster by cluster"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "2969994e",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.899204Z",
|
||
|
"start_time": "2023-06-27T19:58:57.847Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"observation = np.array([30.0, 99.0]) #Observation\n",
|
||
|
"\n",
|
||
|
"#Clusters\n",
|
||
|
"clusters = np.array([[102.0, 203.0],\n",
|
||
|
" [132.0, 193.0],\n",
|
||
|
" [45.0, 155.0], \n",
|
||
|
" [57.0, 173.0]])"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "f13352ff",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"Let's plot this data\n",
|
||
|
"\n",
|
||
|
"In the plot below, **+** is the observation and dots are the cluster coordinates"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "b9f6b5cf",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.906715Z",
|
||
|
"start_time": "2023-06-27T19:58:57.850Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import matplotlib.pyplot as plt \n",
|
||
|
"\n",
|
||
|
"plt.scatter(clusters[:, 0], clusters[:, 1]) #Scatter plot of clusters\n",
|
||
|
"for n, x in enumerate(clusters):\n",
|
||
|
" print('cluster %d' %n)\n",
|
||
|
" plt.annotate('cluster%d' %n, (x[0], x[1])) #Label each cluster\n",
|
||
|
"plt.plot(observation[0], observation[1], '+'); #Plot observation"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "4f9b84e2",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"As the plot shows, the closest cluster is **2**. Your task is to write a function that computes this."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "8aea6781",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-26T19:25:08.202848Z",
|
||
|
"start_time": "2023-06-26T19:25:08.194923Z"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"\n",
|
||
|
"**Hint:** Find the distance between the observation and each row of `clusters`. The observation belongs to the cluster whose row has the minimum distance.\n",
|
||
|
"\n",
|
||
|
"distance = $\\sqrt {\\left( {x_1 - x_2 } \\right)^2 + \\left( {y_1 - y_2 } \\right)^2 }$"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "ea8a7240",
|
||
|
"metadata": {
|
||
|
"ExecuteTime": {
|
||
|
"end_time": "2023-06-27T19:58:58.916610Z",
|
||
|
"start_time": "2023-06-27T19:58:57.854Z"
|
||
|
},
|
||
|
"slideshow": {
|
||
|
"slide_type": "fragment"
|
||
|
}
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"## Your code here"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "beaee243",
|
||
|
"metadata": {
|
||
|
"slideshow": {
|
||
|
"slide_type": "skip"
|
||
|
}
|
||
|
},
|
||
|
"source": [
|
||
|
"## Sources + Resources\n",
|
||
|
"\n",
|
||
|
"ASPP 2016 - Stéfan van der Walt - https://github.com/ASPP/2016_numpy\n",
|
||
|
"\n",
|
||
|
"Basic Numpy: http://scipy-lectures.org/intro/numpy/index.html\n",
|
||
|
"\n",
|
||
|
"Advanced Numpy: http://scipy-lectures.org/advanced/advanced_numpy/index.html\n",
|
||
|
"\n",
|
||
|
"Numpy chapter in \"Python Data Science Handbook\" https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"metadata": {
|
||
|
"celltoolbar": "Slideshow",
|
||
|
"kernelspec": {
|
||
|
"display_name": "Python 3 (ipykernel)",
|
||
|
"language": "python",
|
||
|
"name": "python3"
|
||
|
},
|
||
|
"language_info": {
|
||
|
"codemirror_mode": {
|
||
|
"name": "ipython",
|
||
|
"version": 3
|
||
|
},
|
||
|
"file_extension": ".py",
|
||
|
"mimetype": "text/x-python",
|
||
|
"name": "python",
|
||
|
"nbconvert_exporter": "python",
|
||
|
"pygments_lexer": "ipython3",
|
||
|
"version": "3.11.3"
|
||
|
},
|
||
|
"rise": {
|
||
|
"scroll": true
|
||
|
},
|
||
|
"toc": {
|
||
|
"base_numbering": 1,
|
||
|
"nav_menu": {},
|
||
|
"number_sections": true,
|
||
|
"sideBar": true,
|
||
|
"skip_h1_title": false,
|
||
|
"title_cell": "Table of Contents",
|
||
|
"title_sidebar": "Contents",
|
||
|
"toc_cell": false,
|
||
|
"toc_position": {},
|
||
|
"toc_section_display": true,
|
||
|
"toc_window_display": false
|
||
|
}
|
||
|
},
|
||
|
"nbformat": 4,
|
||
|
"nbformat_minor": 5
|
||
|
}
|