Skip to content

Commit

Permalink
motivation for version control
Browse files Browse the repository at this point in the history
  • Loading branch information
bast committed Jan 5, 2025
1 parent ec3b30b commit 2e300a0
Show file tree
Hide file tree
Showing 3 changed files with 130 additions and 1 deletion.
Binary file added content/img/gophers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
3 changes: 2 additions & 1 deletion content/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ running Python scripts from the command line.

**Day 2 morning**:
- 09:00 - 09:45
- Version control
- {doc}`version-control`
- {doc}`dependencies`
- {doc}`documentation`
- 09:45 - 10:20
Expand All @@ -68,6 +68,7 @@ gallery.md
notebooks_to_scripts.md
CLI.md
testing.md
version-control.md
dependencies.md
documentation.md
profiling.md
Expand Down
128 changes: 128 additions & 0 deletions content/version-control.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
# Version control (motivation)

:::{objectives}
- Browse **commits** and **branches** of a Git repository.
- Remember that commits are like **snapshots** of the repository at a certain
point in time.
- Know the difference between **Git** (something that tracks changes) and
**GitHub/GitLab** (a web platform to host Git repositories).
:::


## Why do we need to keep track of versions?

Version control is an answer to the following questions (do you recognize some
of them?):

- "It broke ... hopefully I have a working version somewhere?"

- "Can you please send me the latest version?"

- "Where is the latest version?"

- "Which version are you using?"

- "Which version have the authors used in the paper I am trying to reproduce?"

- "Found a bug! Since when was it there?"

- "I am sure it used to work. When did it change?"

- "My laptop is gone. Is my thesis now gone?"


## Demonstration

- Example repository: <https://github.com/workshop-material/planets>
- Commits are like **snapshots** and if we break something we can go back to a
previous snapshot.
- Commits carry **metadata** about changes: author, date, commit message, and
a checksum.
- **Branches** are like parallel universes where you can experiment with
changes without affecting the default branch:
<https://github.com/workshop-material/planets/network>
("Insights" -> "Network")
- With version control we can **annotate code**
([example](https://github.com/workshop-material/planets/blame/main/simulate.py)).
- **Collaboration**: We can fork (make a copy on GitHub), clone (make a copy
to our computer), review, compare, share, and discuss.
- **Code review**: Others can suggest changes using pull requests or merge
requests. These can be reviewed and discussed before they are merged.
Conceptually, they are similar to "suggesting changes" in Google Docs.


## Features: roll-back, branching, merging, collaboration

- **Roll-back**: you can always go back to a previous version and compare

- **Branching and merging**:
- Work on different ideas at the same time
- You can experiment with an idea and discard it if it turns out to be a bad idea
- Different people can work on the same code/project without interfering

:::{figure} img/gophers.png
:alt: Branching explained with a gopher
:width: 100%

Image created using <https://gopherize.me/>
([inspiration](https://twitter.com/jay_gee/status/703360688618536960)).
:::

- **Collaboration**: review, compare, share, discuss

- [Example network graph](https://github.com/workshop-material/planets/network)


## Talking about code

Which of these two is more practical?
1. "Clone the code, go to the file 'simulate.py', and search for 'force_between_planets'.
Oh! But make sure you use the version from September 2024."
1. Or I can send you a permalink: <https://github.com/workshop-material/planets/blob/1343ac0/simulate.py#L31C5-L39>


## What we typically like to snapshot

- Software (this is how it started but Git/GitHub can track a lot more)
- Scripts
- Documents (plain text files much better suitable than Word documents)
- Manuscripts (Git is great for collaborating/sharing LaTeX or [Quarto](https://quarto.org/) manuscripts)
- Configuration files
- Website sources
- Data

::::{discussion}
In this example somebody tried to keep track of versions without a version
control system tool like Git. Discuss the following directory listing. What
possible problems do you anticipate with this kind of "version control":
```text
myproject-2019.zip
myproject-2020-february.zip
myproject-2021-august.zip
myproject-2023-09-19-working.zip
myproject-2023-09-21.zip
myproject-2023-09-21-test.zip
myproject-2023-09-21-myversion.zip
myproject-2023-09-21-newfeature.zip
...
(100 more files like these)
```

:::{solution}
- Giving a version to a collaborator and merging changes later with own
changes sounds like lots of work.
- What if you discover a bug and want to know since when the bug existed?
:::
::::


## Where to learn more

[CodeRefinery lessons](https://coderefinery.org/lessons/) with **focus on collaboration** and **not only for the command line**:
- [Introduction to version control](https://coderefinery.github.io/git-intro/) (day 1-2):
**Why we want to track versions and how to go back in time to a working version.**
This lesson brings you from zero to using Git and GitHub for own projects.

- [Collaborative distributed version control](https://coderefinery.github.io/git-collaborative/) (day 3):
This lesson builds on "Introduction to version control" and we apply branching and learn about pull requests
(merge requests), forks, and collaboration using Git and GitHub.

0 comments on commit 2e300a0

Please sign in to comment.