2024-heraklion-data/exercises/tabular_join/tabular_join.ipynb
2024-08-27 15:27:53 +03:00

Exercise: Add experiment information to electrophysiology data

In [1]:
import pandas as pd

# Set some Pandas options: maximum number of rows/columns it's going to display
pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_columns', 100)

Load electrophysiology data and experiment information

In [2]:
df = pd.read_csv('../../data/QC_passed_2024-07-04_collected.csv')
info = pd.read_csv('../../data/op_info.csv')

1. Add experiment information to the electrophysiology results

  • Is there information for every experiment?
  • How many experiments did each patcher perform? (i.e., individual OPs, or rows in info)
  • How many samples did each patcher analyze? (i.e., individual rows in df)
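One possible way to approach these questions is sketched below on toy data. The column names `OP`, `patcher` and `Rin` are assumptions made for illustration; the real columns of `df` and `info` may differ. A left join with `indicator=True` marks rows of the results table that found no match in the information table, and `groupby` gives the per-patcher counts.

```python
import pandas as pd

# Toy stand-ins for `df` and `info`; all column names here are hypothetical
toy_df = pd.DataFrame({
    'OP': ['OP1', 'OP1', 'OP2', 'OP3'],
    'Rin': [100.0, 120.0, 90.0, 110.0],
})
toy_info = pd.DataFrame({
    'OP': ['OP1', 'OP2'],
    'patcher': ['A', 'B'],
})

# Left join keeps every sample; `_merge` records whether a match was found
merged = toy_df.merge(toy_info, on='OP', how='left', indicator=True)

# Samples whose experiment has no entry in the information table
n_missing = (merged['_merge'] == 'left_only').sum()

# Experiments per patcher: distinct OPs in the information table
experiments_per_patcher = toy_info.groupby('patcher')['OP'].nunique()

# Samples per patcher: rows in the merged table (unmatched rows are not counted)
samples_per_patcher = merged.groupby('patcher').size()
```

In this toy example `OP3` has no entry in `toy_info`, so its row ends up with a missing `patcher` and is excluded from the per-patcher sample count.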
In [ ]:

2. Remove outliers from the table

  1. Load the list of outliers in outliers.csv
  2. Use an anti-join to remove the outliers from the table
  3. How many samples (rows) are left in the data?
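Pandas has no dedicated anti-join option, so a common pattern is a left join with `indicator=True` followed by a filter on the rows marked `left_only`. The sketch below uses toy data and a hypothetical key column `sample_id`; the actual key shared by the results table and `outliers.csv` is not stated here and may consist of several columns.

```python
import pandas as pd

# Toy stand-ins; `sample_id` is a hypothetical key column
toy_df = pd.DataFrame({
    'sample_id': [1, 2, 3, 4],
    'Rin': [100.0, 500.0, 90.0, 700.0],
})
toy_outliers = pd.DataFrame({'sample_id': [2, 4]})

# Left join, recording in `_merge` whether each row matched an outlier
flagged = toy_df.merge(toy_outliers, on='sample_id', how='left', indicator=True)

# Anti-join: keep only the rows that did NOT match, then drop the helper column
cleaned = flagged[flagged['_merge'] == 'left_only'].drop(columns='_merge')

n_left = len(cleaned)
```

With these toy tables, samples 2 and 4 are removed and two rows remain.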
In [ ]:

3. Save the final result to processed_QC_passed_2024-07-04_collected_v1.csv

  1. Use the .to_csv method of Pandas DataFrames
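A minimal sketch of saving a table with `.to_csv`, using toy data and a temporary directory instead of the real output path (the file name `toy_result.csv` and the columns are made up for illustration). Passing `index=False` is a choice made here so that the row index is not written as an extra column; whether that is wanted for the real data is an assumption.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy table standing in for the cleaned results; column names are hypothetical
toy_result = pd.DataFrame({'sample_id': [1, 3], 'Rin': [100.0, 90.0]})

# Write to a temporary location so the sketch does not touch real data
out_path = Path(tempfile.mkdtemp()) / 'toy_result.csv'
toy_result.to_csv(out_path, index=False)  # index=False: do not save the row index

# Read the file back to check that it round-trips
reloaded = pd.read_csv(out_path)
```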
In [ ]: