commit 4007965045f2add354e56de95eebc7e234f527ab
Author: Tiziano Zito
Date:   Mon Aug 19 11:42:14 2024 +0200

    first commit

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ffea50f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,35 @@
+# git
+
+## Setup
+- follow the instructions to create your own SSH-key pair to be used during the school: https://aspp.school/wiki/github-setup
+
+## Warm-Up
+- how to start a repo from scratch?
+  - `git init` local method
+  - on an online forge (GitHub, GitLab, …): `git clone`
+- how to revert mistakes?
+  - before commit:
+    - `git restore ` [discard changes in the working directory] __changes files__
+    - `git restore --staged ` [unstage changes ➔ opposite of `git add `, does not modify the working directory]
+  - after commit:
+    - `git revert ` [creates a new commit, modifies the working directory]
+    - `git reset ` [only reset the HEAD pointer, does not modify the working directory] __rewrites history__ ➔ can not be used if you have already pushed to some remote
+    - `git reset --hard ` [reset HEAD and modify working directory] __rewrites history__ and __changes files__ ➔ can not be used if you have already pushed to some remote
+- how to *move* the whole working directory to a specific point in history?
+  - `git checkout ` ➔ `DETACHED HEAD` problem, __changes__ __files__
+  - interaction with branches: `git branch ` + `git switch `
+- `git gui`: building commits along the way interactively (for the *mess around* type of workflows)
+
+## The Open Source model
+- remotes: `git pull `, `git push `, `git fetch `, `git merge `
+- GitHub: forks, branches and PRs: important ➔ explain fork vs. clone!!!
+- strategies for keeping your fork up-to-date: your `main`, origin's and upstream's `main`, short-lived and long-lived topic branches
+- a more thorough and detailed explanation can be found on the [SciPy Contributor's Guide](https://docs.scipy.org/doc/scipy/dev/gitwash/index.html). This guide can be adapted to your own needs, see [gitwash](https://github.com/matthew-brett/gitwash).
+- make it clear that GitHub and GitLab are just options (git≠GitHub)
+
+## Scenarios
+1. [lone scientist](scenarios/scenario1.png) working alone in the cellar without Internet (local git)
+2. [lone scientist](scenarios/scenario2.png) uploading their software to the Internet in the hope it can be useful for other people (local git + one personal GitHub repo)
+3. [lone scientist](scenarios/scenario3.png) sharing one software project with some other befriended lone scientist working in a different place (local git + one personal GitHub repo + permissions)
+4. [research group](scenarios/scenario4.png) sharing software among members (local git + several GitHub repos + permissions + branches + [optional] PRs)
+5. [fully distributed software development](scenarios/scenario5.png) using the most typical open source software workflows as used by numpy, scipy, sklearn, etc. (like above + we don't trust our contributors, i.e. work strictly with forks)
diff --git a/exercise.md b/exercise.md
new file mode 100644
index 0000000..72cfbff
--- /dev/null
+++ b/exercise.md
@@ -0,0 +1,47 @@
+# Create a simple authentication system
+*an alternative to the hopelessly boring `hello world` examples for an introduction to git*
+
+Start creating a script called `auth.py`
+
+### Expected usage:
+ - run the script
+ - the script asks for username and password
+ - if the user is known and password is correct ➔ print "Successfully authenticated!"
+ - if the user is known and password is wrong ➔ print "Wrong password!"
+ - if the user is not known ➔ print "Wrong username!"
+ - if the script is called with one argument, add a new user using the argument as a username
+ - if a user has been added ➔ store the updated database to disk
+
+### Basic API:
+ - a function `get_credentials` that asks for username and password
+ - a function `authenticate` that checks if user is in the password database and that the password is correct
+ - a function `add_user` to add a new user with its password to the database
+ - a function `read_pwdb` to read the password database from disk
+ - a function `write_pwdb` to write the password database to disk
+
+Suggestions:
+ - the database can be a simple dictionary `{username: password}`
+ - the database can be serialized to disk with [`json`](https://docs.python.org/3/library/json.html)
+
+### Later, think about the following problems:
+ - we are leaking valid usernames ➔ return a generic error if username does not exist or password is wrong
+ - [password *hashing*](https://en.wikipedia.org/wiki/Cryptographic_hash_function) ➔ do not store passwords in clear text (database could be stolen, admins are nosy). Solution: Do not store passwords at all but only their hashes (database could be stolen)
+ - [password *salting*](https://en.wikipedia.org/wiki/Salt_%28cryptography%29) ➔ different users with same passwords should not have same hash ⟶ cracking one does not crack all: mitigates dictionary attacks, see below
+
+Addition to the basic API:
+ - a function `pwhash` that given a password and a salt returns a hash
+ - a function `get_salt` that returns a unique salt
+
+### Try to crack it! (Advanced)
+ - can you guess the [*hash collision*](https://en.wikipedia.org/wiki/Collision_attack) risk for the proposed solution?
+ - try first a [*brute force*](https://en.wikipedia.org/wiki/Brute-force_attack) attack: is it feasible?
+ - try a [*dictionary*](https://en.wikipedia.org/wiki/Dictionary_attack) attack (you can use this list of [probable passwords](https://github.com/danielmiessler/SecLists/tree/master/Passwords)): is it feasible?
+ - think about [*lookup tables*](https://en.wikipedia.org/wiki/Lookup_table) and [*rainbow tables*](https://en.wikipedia.org/wiki/Rainbow_table) attacks
+ - what are the trade-offs of the different attacks?
+
+### Notes
+To make it for real:
+ - insecure temporary file ([symlink race](https://en.wikipedia.org/wiki/Symlink_race) attack) ⟶ [`tempfile`](https://docs.python.org/3/library/tempfile.html) and its context managers
+ - better way of generating passwords or random tokens: the [`secrets`](https://docs.python.org/3/library/secrets.html) module
+ - cracking a password database is a form of art, see for example the [John the Ripper](http://www.openwall.com/john/) password cracker, or [Hashcat](https://hashcat.net/hashcat/) or [Brutus](https://www.darknet.org.uk/2006/09/brutus-password-cracker-download-brutus-aet2zip-aet2/)
+
diff --git a/git-commands-visualizations.pdf b/git-commands-visualizations.pdf
new file mode 100644
index 0000000..8381c5e
Binary files /dev/null and b/git-commands-visualizations.pdf differ
diff --git a/scenarios/scenario1.png b/scenarios/scenario1.png
new file mode 100644
index 0000000..46f5204
Binary files /dev/null and b/scenarios/scenario1.png differ
diff --git a/scenarios/scenario2.png b/scenarios/scenario2.png
new file mode 100644
index 0000000..d182bd0
Binary files /dev/null and b/scenarios/scenario2.png differ
diff --git a/scenarios/scenario3.png b/scenarios/scenario3.png
new file mode 100644
index 0000000..263e9e4
Binary files /dev/null and b/scenarios/scenario3.png differ
diff --git a/scenarios/scenario4.png b/scenarios/scenario4.png
new file mode 100644
index 0000000..051be0b
Binary files /dev/null and b/scenarios/scenario4.png differ
diff --git a/scenarios/scenario5.png b/scenarios/scenario5.png
new file mode 100644
index 0000000..581b0a0
Binary files /dev/null and b/scenarios/scenario5.png differ
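
The API listed in `exercise.md` (`get_salt`, `pwhash`, `add_user`, `authenticate`, `read_pwdb`, `write_pwdb`) could be sketched roughly as follows. This is only one possible reading of the exercise and not the course's reference solution. Several choices here are assumptions not stated in the commit: PBKDF2 with SHA-256 as the hash, the `ITERATIONS` value, the function signatures, and the database layout `{username: {"salt": ..., "hash": ...}}` in place of the suggested `{username: password}`. The interactive `get_credentials` function is left out.

```python
import hashlib
import json
import secrets
from pathlib import Path

# Assumption: an illustrative work factor, not a recommendation.
ITERATIONS = 100_000


def get_salt():
    """Return a fresh random salt as a hex string."""
    return secrets.token_hex(16)


def pwhash(password, salt):
    """Return a hex digest derived from the password and the salt."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt), ITERATIONS
    )
    return digest.hex()


def add_user(pwdb, username, password):
    """Store a salt and a hash for the user, never the password itself."""
    salt = get_salt()
    pwdb[username] = {"salt": salt, "hash": pwhash(password, salt)}


def authenticate(pwdb, username, password):
    """Return True only if the user exists and the password matches."""
    entry = pwdb.get(username)
    if entry is None:
        return False
    candidate = pwhash(password, entry["salt"])
    # Constant-time comparison of the two hex strings.
    return secrets.compare_digest(candidate, entry["hash"])


def read_pwdb(path):
    """Read the database from disk; a missing file yields an empty one."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def write_pwdb(pwdb, path):
    """Serialize the database to disk as JSON."""
    Path(path).write_text(json.dumps(pwdb, indent=2), encoding="utf-8")
```

Because `authenticate` returns a single boolean, the caller can print one generic failure message for both an unknown user and a wrong password, which avoids leaking valid usernames. The per-user salt means two users with the same password end up with different stored hashes.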