2024-heraklion-data/exercises/tabular_join/tabular_join.ipynb

{
"cells": [
{
"cell_type": "markdown",
"id": "f11a76bf",
"metadata": {},
"source": [
"# Exercise: Add experiment information to electrophysiology data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b6f2742b",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Set some Pandas options: the maximum number of rows and columns to display\n",
"pd.set_option('display.max_rows', 1000)\n",
"pd.set_option('display.max_columns', 100)"
]
},
{
"cell_type": "markdown",
"id": "2967c84e",
"metadata": {},
"source": [
"# Load electrophysiology data and experiment information"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ed626ee3",
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('../../data/QC_passed_2024-07-04_collected.csv')\n",
"info = pd.read_csv('../../data/op_info.csv')"
]
},
{
"cell_type": "markdown",
"id": "2fef4d37",
"metadata": {},
"source": [
"# 1. Add experiment information to the electrophysiology results\n",
"\n",
"* Is there information for every experiment?\n",
"* How many experiments did each patcher perform? (i.e., individual OPs, or rows in `info`)\n",
"* How many samples did each patcher analyze? (i.e., individual rows in `df`)"
]
},
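{
"cell_type": "markdown",
"id": "9d3e5a10",
"metadata": {},
"source": [
"Hint: the questions above can be explored with a left join and `indicator=True`. The cell below is a minimal sketch on toy tables, not the solution. The column names `OP`, `patcher`, and `value` and all values are made up for illustration, so the real key column shared by `df` and `info` may differ. Rows marked `left_only` in the `_merge` column have no matching experiment information."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4b7c2e88",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy stand-ins for `df` and `info` (hypothetical columns and values)\n",
"toy_df = pd.DataFrame({'OP': ['OP1', 'OP1', 'OP2', 'OP3'], 'value': [1.0, 2.0, 3.0, 4.0]})\n",
"toy_info = pd.DataFrame({'OP': ['OP1', 'OP2'], 'patcher': ['A', 'B']})\n",
"\n",
"# A left join keeps every row of toy_df; indicator=True adds a `_merge` column\n",
"toy_merged = toy_df.merge(toy_info, on='OP', how='left', indicator=True)\n",
"\n",
"# Rows of toy_df without matching experiment information\n",
"print(toy_merged[toy_merged['_merge'] == 'left_only'])\n",
"\n",
"# Experiments per patcher (rows in toy_info) and samples per patcher (rows in toy_merged)\n",
"print(toy_info['patcher'].value_counts())\n",
"print(toy_merged['patcher'].value_counts())"
]
},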
{
"cell_type": "code",
"execution_count": null,
"id": "1f3f57eb",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "44031178",
"metadata": {},
"source": [
"# 2. Remove outliers from the table\n",
"\n",
"1. Load the list of outliers from `outliers.csv`\n",
"2. Use an anti-join to remove the outliers from the table\n",
"3. How many samples (rows) are left in the data?"
]
},
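{
"cell_type": "markdown",
"id": "c5a81f36",
"metadata": {},
"source": [
"Hint: Pandas has no dedicated anti-join, so one common pattern is a left join with `indicator=True` followed by a filter on `left_only`. The cell below is a minimal sketch on toy tables, not the solution. The column names `sample_id` and `value` are made up for illustration, and the actual key columns in `outliers.csv` may differ."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e27d90ab",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy stand-ins for the data table and the outlier list (hypothetical columns and values)\n",
"toy_df = pd.DataFrame({'sample_id': [1, 2, 3, 4], 'value': [0.1, 0.2, 9.9, 0.3]})\n",
"toy_outliers = pd.DataFrame({'sample_id': [3]})\n",
"\n",
"# Anti-join: left join with an indicator, then keep rows found only in the left table\n",
"marked = toy_df.merge(toy_outliers, on='sample_id', how='left', indicator=True)\n",
"toy_clean = marked[marked['_merge'] == 'left_only'].drop(columns='_merge')\n",
"\n",
"# Number of rows left after removing the outliers\n",
"print(len(toy_clean))"
]
},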
{
"cell_type": "code",
"execution_count": null,
"id": "7fa953af",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "84270332",
"metadata": {},
"source": [
"# 3. Save final result in `processed_QC_passed_2024-07-04_collected_v1.csv`\n",
"\n",
"1. Use the `.to_csv` method of Pandas DataFrames"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7bcff45",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}