add all materials

2025-08-13 13:58:07 +02:00 · 2ea0c0b60c
commit 2ea0c0b60c
parent 01e7a23ae2
43 changed files with 20448 additions and 0 deletions
--- a/parallel/README.md
+++ b/parallel/README.md
@@ -0,0 +1,24 @@
+# The dangers and joys of automatic parallelization (like in numpy linear algebra routines) and the use of clusters/schedulers (but also on your laptop)
+- Go through the [notebook](../parallel.ipynb) to play around with numpy auto-parallelization, CPU affinity and OpenMP thread pool control
+
+- Now we want to submit our code to a cluster, or even just run it in parallel on our own laptop:
+  - run [`overcommit.py`](overcommit.py) while monitoring with `htop`
+  - try the [`submit.sh`](submit.sh) script
+  - observe the problems caused by overcommitting (more runnable threads than available CPU cores)
+  - explain the PSI (Pressure Stall Information) fields in `htop`. Useful reading:
+    - https://docs.kernel.org/accounting/psi.html
+    - https://facebookmicrosites.github.io/psi/docs/overview
+- Discuss implications for local and cluster workflows
+
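A note on the PSI item above: outside `htop`, the same numbers can be read straight from the files under `/proc/pressure/`. Below is a minimal sketch, assuming a Linux kernel with PSI enabled. The helper names `parse_psi` and `read_psi` are made up for illustration and are not part of this repo.

```python
from pathlib import Path


def parse_psi(text):
    """Parse the contents of a PSI file into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        # Each line looks like: some avg10=0.00 avg60=0.00 avg300=0.00 total=0
        kind, *fields = line.split()
        result[kind] = {key: float(value) for key, value in (f.split("=") for f in fields)}
    return result


def read_psi(resource="cpu"):
    """Return parsed PSI data for "cpu", "memory" or "io", or None if unavailable."""
    path = Path("/proc/pressure") / resource
    try:
        return parse_psi(path.read_text())
    except OSError:
        # Non-Linux system, PSI disabled, or the file is not readable here
        return None


sample = "some avg10=1.50 avg60=0.75 avg300=0.20 total=123456\n"
print(parse_psi(sample)["some"]["avg10"])  # 1.5
```

Per the kernel documentation, `some` is the share of time in which at least one task was stalled on the resource, and `full` the share in which all non-idle tasks were stalled. The `avg*` values are percentages over the last 10, 60 and 300 seconds, and `total` is the cumulative stall time in microseconds.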
+# Hands-on
+- Let's try to make it more quantitative:
+  - Write a benchmark in the style of [benchmark_python](../benchmark_python/bench.py)
+  - We want to assess the performance of matrix multiplication as a function of:
+    - the size of the matrix `N`
+    - the number of OpenMP threads `T`, controlled with `threadpoolctl` or by the environment variable `OMP_NUM_THREADS`
+    - the number of processes `P`, controlled by the [`submit.sh`](submit.sh) script or something similar
+- The results will depend on the architecture of the machine you run on (for example core count, cache sizes and the BLAS backend NumPy is linked against)
+- Submit your benchmark, together with some plotting routines, as a PR to this repo!
+
+
+
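One possible starting point for the `N`/`T`/`P` benchmark described in the Hands-on list is sketched here. It assumes NumPy is installed in the interpreter that runs the workers and that its BLAS backend honours `OMP_NUM_THREADS`, which is not guaranteed for every build. The names `CHILD`, `worker_env`, `run_config` and `grid` are hypothetical and are not taken from `bench.py` or `submit.sh`. Each worker is a fresh subprocess because the thread count is typically read when the BLAS library loads, so the variable has to be set before NumPy is imported.

```python
import itertools
import os
import subprocess
import sys
import time

# Worker snippet: times one N x N matrix product and prints the elapsed seconds.
# Assumption: numpy is importable in the interpreter that runs it.
CHILD = """
import sys, time
import numpy as np
n = int(sys.argv[1])
a = np.random.rand(n, n)
b = np.random.rand(n, n)
t0 = time.perf_counter()
a @ b
print(time.perf_counter() - t0)
"""


def worker_env(threads):
    """Copy of the current environment with the OpenMP thread count set."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def run_config(n, threads, procs, snippet=CHILD):
    """Start `procs` workers at once; return (per-worker seconds, wall-clock seconds)."""
    start = time.perf_counter()
    children = [
        subprocess.Popen(
            [sys.executable, "-c", snippet, str(n)],
            env=worker_env(threads),
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(procs)
    ]
    # All workers are already running; collect the number each one printed.
    timings = [float(child.communicate()[0]) for child in children]
    wall = time.perf_counter() - start
    return timings, wall


def grid(sizes, threads, procs):
    """All (N, T, P) combinations to benchmark."""
    return list(itertools.product(sizes, threads, procs))
```

For example, `run_config(2000, 4, 2)` starts two workers with four threads each, every worker multiplying 2000 x 2000 matrices. Comparing `wall` across the combinations returned by `grid(...)` should show where `P * T` starts to exceed the available cores.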