---
title: "Cyano Interlab study"
output: rmarkdown::github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
This is a repository to analyze the data from the Cyano Interlab study.
## Contents
- [Project structure](#project-structure)
- [Input data](#input-data)
- [Analysis](#analysis)
## Project structure {#project-structure}
This project contains the following directories:
- `data/`: contains raw data files
- `figures/`: contains the figures for the manuscript
- `R/`: contains all functions for data loading and processing, and figure generation
- `reports/`: contains reports that load, process, and export the raw data, and that create the figures for the manuscript
- `_targets`: directory created by the [targets](https://docs.ropensci.org/targets/) package, required to run the workflow
- `renv`: directory created by the [renv](https://rstudio.github.io/renv/articles/renv.html) package, required to manage project dependencies
## Input data {#input-data}
Participants submitted data from experimental runs and reference measurements in standard spreadsheet templates. These are stored in `data/experiments/` and `data/reference/` respectively. In addition, two extra datasets are needed to run the analysis. `data/dilutions` contains two spreadsheets with information about the dilution factors applied by each participant in the micro-titer plates. `data/location_labels` contains a table that matches the location names in the raw files to the names displayed in the figures.
## Analysis {#analysis}
The analysis of this project is done using a [targets](https://docs.ropensci.org/targets/) pipeline. This pipeline includes reading raw files, wrangling data, normalization of plate-reader measurements, and creating the figures and statistical analyses presented in the manuscript.
The code to run the pipeline is found in the `_targets.R` file and can be executed by running `targets::tar_make()`. This pipeline makes use of custom functions stored in the `R/` directory. Below, we describe every step of the process:
### Reading data
Raw data is loaded with the functions `read_dilutions()`, `read_raw_experiment_data()`, and `read_raw_reference_data()`. The first function creates a regular data frame. The last two create a list-column tibble with one row per lab, experimental run, and dataset; the data itself is stored in the `raw_data` column. These tibbles can be obtained by providing the corresponding path and file name pattern to each function, or by using `targets::tar_load()` after running the pipeline:
```{r, echo = T}
targets::tar_load(data.experiments.raw)
data.experiments.raw
```
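To illustrate the list-column layout described above, here is a minimal sketch built from made-up values. It assumes the `tibble` and `tidyr` packages are available. Apart from `raw_data` and `data_id`, the column names (`location`, `run`, `well`, `value`) are hypothetical and may not match what the reading functions actually produce.

```r
library(tibble)
library(tidyr)

# Hypothetical long-format measurements from two labs (values are made up)
toy <- tibble(
  location = c("lab_a", "lab_a", "lab_b", "lab_b"),
  run      = c(1, 1, 1, 1),
  data_id  = c("pr.od", "pr.od", "pr.od", "pr.od"),
  well     = c("A1", "A2", "A1", "A2"),
  value    = c(0.21, 0.25, 0.19, 0.23)
)

# Nest the measurement columns so that each lab/run/dataset
# combination becomes a single row holding its own data frame
toy_nested <- nest(toy, raw_data = c(well, value))

toy_nested
```

Each element of `toy_nested$raw_data` is itself a tibble, which is what allows one row to carry a whole dataset.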
### Processing raw data
Two functions (`process_experiments_data()` and `process_reference_data()`) are used to process the tibbles generated in the previous step. These functions wrangle the raw data into the right format and process the plate-reader data (background-signal correction and reference-strain normalization of relative fluorescence units). The output is a list-column tibble containing one dataset per row. These tibbles can be obtained by providing the corresponding datasets to each function, or by using `targets::tar_load()` after running the pipeline:
```{r, echo = T}
targets::tar_load(data.experiments.processed)
data.experiments.processed
```
The following table describes the content of each dataset:
| Dataset | Description |
|----------------------|-------------------------------------------------------------------------------------------------------------|
| `pr.fl.raw` | Long-format plate-reader fluorescence |
| `pr.od.raw` | Long-format plate-reader OD |
| `pr.fl` | Long-format, dilution corrected, plate-reader fluorescence |
| `pr.od` | Long-format, dilution corrected, plate-reader OD |
| `pr.bc` | Long-format, dilution corrected, background-signal corrected, plate-reader fluorescence |
| `pr.norm` | Long-format, dilution corrected, background-signal corrected, reference-strain normalized plate-reader RFU |
| `sp.od` | Long-format spectrophotometer OD |
| `sp.full.spectrum` | Long-format spectrophotometer full spectrum |
| `chl` | Long-format spectrophotometer chlorophyll content |
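The correction steps named in the table can be sketched on toy data as follows. This is only an illustration under stated assumptions, not the code used in `R/`: it assumes dilution correction multiplies by the dilution factor, background correction subtracts the mean blank signal, and normalization divides by the mean corrected signal of the reference strain. All column names and values are hypothetical.

```r
# Toy plate-reader readings (made-up values; column names are hypothetical)
pr <- data.frame(
  strain   = c("blank", "blank", "reference", "reference", "test", "test"),
  fl       = c(10, 12, 111, 131, 221, 241),
  dilution = c(1, 1, 2, 2, 2, 2)
)

# Dilution correction (assumed multiplicative)
pr$fl_dil <- pr$fl * pr$dilution

# Background-signal correction: subtract the mean signal of the blank wells
blank_mean <- mean(pr$fl_dil[pr$strain == "blank"])
pr$fl_bc <- pr$fl_dil - blank_mean

# Reference-strain normalization: divide by the mean corrected
# signal of the reference strain
ref_mean <- mean(pr$fl_bc[pr$strain == "reference"])
pr$fl_norm <- pr$fl_bc / ref_mean

pr
```

With this convention the reference strain averages to 1 after normalization, so values for other strains read as multiples of the reference signal.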
Processed datasets can be accessed with:
```{r, echo = T}
targets::tar_load(data.experiments.processed)
# example to extract spectrophotometer OD
data.experiments.processed |>
  dplyr::filter(data_id == "sp.od") |>
  tidyr::unnest(data) |>
  dplyr::select(-data_id)
```
### Figures
The code to produce each figure in the manuscript can be found in the `R/` directory. Each figure is created with a custom function stored in a separate file. The `targets` pipeline calls these functions and stores the output in the `figures/` directory (.tiff files are not included in this repository due to file size).
### Dependencies
We provide [`renv`](https://rstudio.github.io/renv/articles/renv.html) files to facilitate dependency management. We include the `.Rprofile` file so that, when the repository is cloned and the project is opened, `renv` should be automatically downloaded and installed. Project dependencies can then be restored with `renv::restore()`. See [this resource](https://rstudio.github.io/renv/articles/collaborating.html) for further details.