Taught by Jackson Luckey at the Hertie School in Berlin in summer 2025.
Before the first day, please install R and RStudio.
- Overview: Overview of what we will cover during the bootcamp.
- Git: Introduces Git so that everyone can use this repo throughout the bootcamp.
- Intro to R: Introduces the basics of base R (objects, vectors, functions, and dataframes).
- Intro to Data Visualization: Introduces the basics of data visualization with R using ggplot2.
- Working with Dataframes with `dplyr`: Covers the main `dplyr` verbs (functions) such as `filter()`, `select()`, `mutate()`, `summarize()`, and `group_by()`.
- Vectors and Matrices: Overview of vectors and matrices in R. Unused in the exercises but will be helpful for Math for Data Science.
- Working with Dataframes in Base R: A quick look at working with dataframes in base R. Useful for getting a sense of what `dplyr` replaces.
On Saturday and Sunday I used some slides from an older version of the Intro to Data Science course.
On Saturday I used part of the Tidyverse slides.
On Sunday I used the version control / Git slides.
You might also look through the command line slides.
Unless stated otherwise, the prerequisites for the exercises are the Intro to Data Visualization and Working with Dataframes with `dplyr` slides.
- `dplyr` and `ggplot` with NYC Flights: Practice using `dplyr` and `ggplot2` with a dataset of all flights departing NYC in 2013.
- Leader Assassination as a Natural Experiment: Practice using `dplyr` and `ggplot2` to investigate the effects of leader assassinations on democracy and war. Adapted from Quantitative Social Science: An Introduction with Tidyverse.
This course borrows liberally from Simon Munzert's Introduction to Data Science (IDS) course, as the bootcamp primarily serves as preparation for it.
I also took inspiration, examples, and exercises from: