diff --git a/README.md b/README.md
index 2e248f1..56967dd 100644
--- a/README.md
+++ b/README.md
@@ -59,6 +59,17 @@ Setup:
 ## Back to the Python benchmark (third try)
 - can we explain what is happening now? Yes, more or less ;-)
+  - the default memory layout is also called row-major == `C_CONTIGUOUS`
+  - rule of thumb for multi-dimensional numpy arrays:
+    - the right-most index should be the inner-most loop in a series of nested loops over the dimensions of a multi-dimensional array
+    - the previous rule can be remembered as *the right-most index changes the fastest* in a series of nested loops
+    - the logically contiguous data, for example the data points of a single time series, should be stored along the right-most dimension:
+      ```python
+      x = np.zeros((n_series, length_of_one_series)) # ➔ good!
+      y = np.zeros((length_of_one_series, n_series)) # ➔ bad!
+      ```
+      - … unless of course you plan to mostly loop *across* time series :)
+  - watch out when migrating code from MATLAB® or to `pandas.DataFrame` ➔ they store data in memory using the opposite convention, the column-major order!!!
 - quick fix for the [puzzle](puzzle.ipynb): try and add `order='F'` in the "bad" snippet and see that is "fixes" the bug ➔ why?
 Notes on the [Python benchmark](benchmark_python/):
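
A minimal sketch of the layout rule added in this hunk, assuming NumPy is available. It is not part of the patch; the array sizes are hypothetical and chosen only for illustration. It inspects the contiguity flags and strides of a row-major and a column-major array holding the same number of time series.

```python
import numpy as np

# Hypothetical sizes, for illustration only
n_series, length_of_one_series = 4, 1000

# Default layout: row-major (C order), the right-most index is contiguous
x = np.zeros((n_series, length_of_one_series))
# Same shape, but column-major (Fortran order), the left-most index is contiguous
y = np.zeros((n_series, length_of_one_series), order='F')

print(x.flags['C_CONTIGUOUS'], x.flags['F_CONTIGUOUS'])
print(y.flags['C_CONTIGUOUS'], y.flags['F_CONTIGUOUS'])

# strides: number of bytes to skip to reach the next element along each axis
print(x.strides)  # small stride on the right-most axis
print(y.strides)  # small stride on the left-most axis

# A single time series is one contiguous block in x, but a strided view in y
row_of_x = x[0]
row_of_y = y[0]
print(row_of_x.flags['C_CONTIGUOUS'], row_of_y.flags['C_CONTIGUOUS'])
```

Looping over the axis with the smallest stride in the inner-most loop walks memory sequentially, which is what the rule of thumb above is trying to capture.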