Merged
Binary file modified 01_materials/slides/00_introduction_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/01_probability_intro_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/02_populations_and_samples_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/03_simple_probability_samples_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/04_stratified_sampling_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/05_cluster_sampling_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/06_errors_slides.pdf
Binary file not shown.
Binary file not shown.
Binary file modified 01_materials/slides/08_ethics_slides.pdf
Binary file not shown.
Binary file modified 01_materials/slides/09_privacy_slides.pdf
Binary file not shown.
@@ -38,7 +38,7 @@ $ echo "Data Sciences Institute"
- Consider a population of size *N*
- Simple random sampling **with replacement**:
1. Select one unit for measurement, with probability 1/ *N*
2. Sampled unit is return to the population
2. Sampled unit is returned to the population
3. Select second unit for measurement, with probability 1/ *N*
4. Repeat until desired sample size is obtained
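
The steps listed above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the slides: the function name `srs_with_replacement` and the `seed` parameter are hypothetical, and the population is assumed to be an in-memory list.

```python
import random


def srs_with_replacement(population, n, seed=None):
    """Draw n units, each with probability 1/N, returning each unit before the next draw."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    # rng.choice picks one unit uniformly; the population list is never modified,
    # so the same unit can be selected more than once.
    return [rng.choice(population) for _ in range(n)]


population = list(range(1, 11))  # hypothetical population of N = 10 units
sample = srs_with_replacement(population, n=5, seed=0)
```

Because units are returned to the population, the requested sample size may exceed *N*, and repeated units in `sample` are expected.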

@@ -79,7 +79,7 @@ $ echo "Data Sciences Institute"

- The variance of the sample mean can be computed,

> $$ \hat{V}(\bar{y}) = \sum_{h=1}^{H}\frac{s_h^2}{n_h}(1-\frac{n_h}{N_h})(\frac{N^h}{N})^2 $$
> $$ \hat{V}(\bar{y}) = \sum_{h=1}^{H}\frac{s_h^2}{n_h}(1-\frac{n_h}{N_h})(\frac{N_h}{N})^2 $$

(how much our mean will vary across samples)
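
As a hedged illustration of the corrected formula, the sketch below evaluates it stratum by stratum. The function name `stratified_mean_variance`, the `(N_h, sample_values)` input layout, and the numbers are assumptions made for this example; they do not come from the slides.

```python
def stratified_mean_variance(strata):
    """Estimate the variance of the stratified sample mean.

    strata: list of (N_h, sample_values) pairs, one per stratum,
            where N_h is the stratum population size (assumed layout).
    """
    N = sum(N_h for N_h, _ in strata)  # total population size
    variance = 0.0
    for N_h, values in strata:
        n_h = len(values)
        mean_h = sum(values) / n_h
        # sample variance within the stratum, s_h^2
        s2_h = sum((v - mean_h) ** 2 for v in values) / (n_h - 1)
        # (s_h^2 / n_h) * finite population correction * squared stratum share
        variance += (s2_h / n_h) * (1 - n_h / N_h) * (N_h / N) ** 2
    return variance


# hypothetical data: two strata with N_1 = 100 and N_2 = 300
example_strata = [(100, [2.0, 4.0, 6.0]), (300, [10.0, 12.0])]
v_hat = stratified_mean_variance(example_strata)
```

Each stratum needs at least two sampled units, since $s_h^2$ divides by $n_h - 1$.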

@@ -114,7 +114,7 @@ $ echo "Data Sciences Institute"

- The population mean can be estimated directly using a weighted mean of recorded observations:

> $$ \bar{y} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{h} w_{hi}y_{hi}}{\sum_{h=1}^{H} \sum_{i=1}^{h} w_{hi}} $$
> $$ \bar{y} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{hi}y_{hi}}{\sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{hi}} $$

- In stratified sampling, we need to sum over the weights and units in each stratum, and then sum over all strata.
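
The double sum can be sketched as two nested loops. This example assumes the common sampling weight $w_{hi} = N_h / n_h$ for every unit in stratum $h$; the hunk shown here does not define the weights, so that choice, the function name `stratified_weighted_mean`, and the data are all illustrative assumptions.

```python
def stratified_weighted_mean(strata):
    """Weighted mean over all sampled units in all strata.

    strata: list of (N_h, sample_values) pairs (assumed layout).
    Assumes w_hi = N_h / n_h, which is not stated in the excerpt.
    """
    numerator = 0.0
    denominator = 0.0
    for N_h, values in strata:      # outer sum over strata h = 1..H
        w_h = N_h / len(values)     # assumed weight for each unit in stratum h
        for y in values:            # inner sum over units i = 1..n_h
            numerator += w_h * y
            denominator += w_h
    return numerator / denominator


# hypothetical data: two strata with N_1 = 100 and N_2 = 300
y_bar = stratified_weighted_mean([(100, [2.0, 4.0, 6.0]), (300, [10.0, 12.0])])
```

Under this weight choice the denominator equals the population size *N*, so the result matches the sum of stratum means weighted by $N_h / N$.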

@@ -187,7 +187,7 @@ $ echo "Data Sciences Institute"

- The sample variance of the PSU totals is,

> $$ s_t^2=\frac{1}{n-1}\sum_{i=1}^{N}(t_i-\frac{\hat{t}}{N})^2 $$
> $$ s_t^2=\frac{1}{n-1}\sum_{i=1}^{n}(t_i-\frac{\hat{t}}{n})^2 $$
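
A minimal sketch of this computation follows. It assumes the centring term is the mean of the $n$ sampled PSU totals; the hunk does not define $\hat{t}$, so this reading is an assumption rather than the slides' definition. The function name `psu_total_variance` and the data are hypothetical.

```python
def psu_total_variance(psu_totals):
    """Sample variance of the totals t_i of the n sampled PSUs.

    Assumption: deviations are taken from the mean of the sampled totals.
    """
    n = len(psu_totals)
    mean_total = sum(psu_totals) / n
    # sum of squared deviations over the n sampled PSUs, divided by n - 1
    return sum((t - mean_total) ** 2 for t in psu_totals) / (n - 1)


# hypothetical totals for n = 3 sampled PSUs
s2_t = psu_total_variance([10.0, 14.0, 18.0])
```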

- $s_t^2$ can then be used to compute the standard error of the estimated sample mean:

2 changes: 1 addition & 1 deletion 03_instructional_team/markdown_slides/09_privacy_slides.md
@@ -14,7 +14,7 @@ $ echo "Data Sciences Institute"
# Key Texts

- Salganik, M. (2019). Understanding and managing informational risk. In *Bit by bit: Social research in the Digital age* (pp. 307–314). Chapter, Princeton University Press.
- Wood, A., Altman, M., Bembenek, A., Bun, M., Gaboardi, M., Honaker, J., Nissim, K., OBrien, D.R., Steinke, T., & Vadhan, S. (2018). *[Differential privacy: A primer for a non-technical audience](https://salil.seas.harvard.edu/publications/differential-privacy-primer-non-technical-audience)* . *Vanderbilt Journal of Entertainment & Technology Law, * 21(1) 209-275.
- Wood, A., Altman, M., Bembenek, A., Bun, M., Gaboardi, M., Honaker, J., Nissim, K., O'Brien, D.R., Steinke, T., & Vadhan, S. (2018). *[Differential privacy: A primer for a non-technical audience](https://salil.seas.harvard.edu/publications/differential-privacy-primer-non-technical-audience)* . *Vanderbilt Journal of Entertainment & Technology Law, * 21(1) 209-275.

---

6 changes: 2 additions & 4 deletions README.md
@@ -69,13 +69,11 @@ module, with access to live help. Content is not facilitated, but rather
this time should be driven by participants. We encourage participants to
come to these work periods with questions and problems to work through.
  Participants are encouraged to engage actively during the learning
module. They key to developing the core skills in each learning module
module. The key to developing the core skills in each learning module
is through practice. The more participants engage in coding along with
the instructional team, and applying the skills in each module, the more
likely it is that these skills will solidify.  

## Schedule

## Schedule

| Live Learning Session | Topic | Assignments | Resources | |
@@ -127,7 +125,7 @@ Feel free to use the following as resources:
- [LLN Demo](./04_this_cohort/resources/5.1_probability_lln_demo.py)
- [Amazon Exercises](./04_this_cohort/resources/amazon_exercises.pdf)
- [Multiple Imputation
Exercises](./04_this_cohort/resources/sampling_multiple_imputation_exerises.py)
Exercises](./04_this_cohort/resources/sampling_multiple_imputation_exercises.py)

### Videos
