{ "cells": [ { "cell_type": "markdown", "id": "247bbf84", "metadata": {}, "source": [ "# Window functions for tabular data" ] }, { "cell_type": "code", "execution_count": 1, "id": "44584190", "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "id": "83bbd275", "metadata": {}, "source": [ "# Load experimental data" ] }, { "cell_type": "code", "execution_count": 2, "id": "88b9e189", "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('timed_responses.csv', index_col=0)" ] }, { "cell_type": "code", "execution_count": 3, "id": "987a3518", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
subject_idtime (ms)responseaccuracy
5743540RIGHT0.04
11902552LEFT0.43
189521036LEFT0.36
533257RIGHT0.11
1582743RIGHT0.32
5513619LEFT0.25
1602143RIGHT0.65
4131471LEFT0.80
7851121LEFT0.10
13932903RIGHT0.33
6292353LEFT0.17
18293768RIGHT0.26
90211093LEFT0.34
148623RIGHT0.29
\n", "
" ], "text/plain": [ " subject_id time (ms) response accuracy\n", "574 3 540 RIGHT 0.04\n", "1190 2 552 LEFT 0.43\n", "1895 2 1036 LEFT 0.36\n", "53 3 257 RIGHT 0.11\n", "158 2 743 RIGHT 0.32\n", "551 3 619 LEFT 0.25\n", "1602 1 43 RIGHT 0.65\n", "413 1 471 LEFT 0.80\n", "785 1 121 LEFT 0.10\n", "1393 2 903 RIGHT 0.33\n", "629 2 353 LEFT 0.17\n", "1829 3 768 RIGHT 0.26\n", "902 1 1093 LEFT 0.34\n", "1486 2 3 RIGHT 0.29" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "markdown", "id": "5c41cd93", "metadata": {}, "source": [ "# Split-apply-combine operations return one aggregated value per group" ] }, { "cell_type": "code", "execution_count": null, "id": "0234ccf2", "metadata": {}, "outputs": [], "source": [ "df.groupby('subject_id')['accuracy'].max()" ] }, { "cell_type": "code", "execution_count": null, "id": "2b2a1796", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "2bb99152", "metadata": {}, "source": [ "# However, for some calculations we need to have a value per row\n", "\n", "For example: for each subject, rank the responses by decreasing accuracy" ] }, { "cell_type": "code", "execution_count": null, "id": "3aed0755", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "17f3d40f", "metadata": {}, "source": [ "# In many cases, a window functions is combined with a sorting operation\n", "\n", "For example: for each subject, count the number of \"LEFT\" responses up until any moment in the experiment" ] }, { "cell_type": "code", "execution_count": null, "id": "67efdd56", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "a00b4f39", "metadata": {}, "source": [ "# Window functions are also useful to compute changes in the data for each group\n", "\n", "In this case, the window function often uses the `shift(n)` method that lags the data by `n` rows" ] }, { "cell_type": "code", "execution_count": null, "id": "e553c17f", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "f2973e3d", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "c9ca46b0", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 5 }