diff --git a/docs/404.html b/docs/404.html
index 8fdcb1e0..660024e4 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -71,7 +71,7 @@
@@ -94,7 +94,10 @@
diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
index d08dc1ee..987bface 100644
--- a/docs/LICENSE-text.html
+++ b/docs/LICENSE-text.html
@@ -71,7 +71,7 @@
@@ -94,7 +94,10 @@
diff --git a/docs/articles/index.html b/docs/articles/index.html
index e83e97a6..f9f77fe4 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -71,7 +71,7 @@
@@ -94,7 +94,10 @@
@@ -127,7 +130,9 @@
vignettes/qc_metabolomics.Rmd
+ qc_metabolomics.Rmd
First, download and install R and RStudio:
+Then, open RStudio and install the devtools package
install.packages("devtools")
Finally, install the MotrpacBicQC package
library(devtools)
+devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE)
Load the library
+library(MotrpacBicQC)
Then run any of the following tests to check that the package is correctly installed and works. For example:
+# Just copy and paste in the RStudio terminal
+
+check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
+check_metadata_samples(df = metadata_sample_named, cas = "umichigan")
+check_results(r_m = results_named, m_s = metadata_sample_named, m_m = metadata_metabolites_named)
which should generate the following output:
+check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
+#> + (+) All required columns present
+#> + (+) {metabolite_name} OK
+#> + (+) {refmet_name} unique values: OK
+#> + (+) {refmet_name} ids found in refmet: OK
+#> + (+) {rt} all numeric: OK
+#> + (+) {mz} all numeric: OK
+#> + (+) {neutral_mass} all numeric values OK
+#> + (+) {formula} available: OK
+check_metadata_samples(df = metadata_sample_named, cas = "umichigan")
+#> + (+) {sample_id} seems OK
+#> + (+) {sample_type} seems OK
+#> + (+) {sample_order} is numeric
+#> + (+) {sample_order} unique values OK
+#> + (+) {raw_file} unique values OK
+check_results(r_m = results_named, m_s = metadata_sample_named, m_m = metadata_metabolites_named)
+#> + (+) All samples from [results_metabolite] are available in [metadata_sample]
+#> + (+) {metabolite_name} is identical in both [results] and [metadata_metabolites] files: OK
+#> + (+) {sample_id} columns are numeric: OK
Two approaches available:
+PROCESSED_YYYYMMDD folder (recommended)
Run the tests on the full submission with the following command:
+validate_metabolomics(input_results_folder = "/full/path/to/PROCESSED_YYYYMMDD",
+                      cas = "your_site_code")
cas is one of the following:
+# Open the metadata_metabolites file(s)
+
+metadata_metabolites_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+metadata_metabolites_unnamed <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
+check_metadata_metabolites(df = metadata_metabolites_unnamed, name_id = "unnamed")
# Open your files
+metadata_sample_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+metadata_sample_unnamed <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_metadata_samples(df = metadata_sample_named, cas = "your_site_code")
+check_metadata_samples(df = metadata_sample_unnamed, cas = "your_site_code")
# Open your files
+metadata_metabolites_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+metadata_sample_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+results_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_results(r_m = results_named,
+              m_s = metadata_sample_named,
+              m_m = metadata_metabolites_named)
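To illustrate what the three `check_results` messages in the example output refer to, here is a minimal base-R sketch on toy data. It is not the package's implementation: the helper name `sketch_check_results` and the toy layout (a `metabolite_name` column followed by one column per sample) are assumptions made for this example only.

```r
# Toy inputs; the column layout is assumed for illustration only.
results <- data.frame(
  metabolite_name = c("glucose", "lactate"),
  S1 = c(10.5, 3.2),
  S2 = c(11.0, 2.9),
  stringsAsFactors = FALSE
)
metadata_sample <- data.frame(
  sample_id = c("S1", "S2"),
  sample_type = c("Sample", "Sample"),
  stringsAsFactors = FALSE
)
metadata_metabolites <- data.frame(
  metabolite_name = c("glucose", "lactate"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the number of issues found.
sketch_check_results <- function(r_m, m_s, m_m) {
  issues <- 0L
  sample_cols <- setdiff(colnames(r_m), "metabolite_name")

  # 1. Every sample column in the results must be listed in metadata_sample.
  if (!all(sample_cols %in% m_s$sample_id)) issues <- issues + 1L

  # 2. Results and metadata_metabolites must contain the same set of names.
  if (!setequal(r_m$metabolite_name, m_m$metabolite_name)) issues <- issues + 1L

  # 3. Every sample column must be numeric.
  if (!all(vapply(r_m[sample_cols], is.numeric, logical(1)))) issues <- issues + 1L

  issues
}

n_issues <- sketch_check_results(results, metadata_sample, metadata_metabolites)
print(n_issues)
```

A results table with an extra, non-numeric sample column would trip checks 1 and 3 in this sketch.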
The following functions enable merging all results and metadata files into a single data frame.
+The folder/file structure of a required untargeted metabolomics submission is as follows:
+PASS1A-06/
+  T55/
+    HILICPOS/
+      BATCH1_20190725/
+        RAW/
+          Manifest.txt
+          file1.raw
+          file2.raw
+          etc
+        PROCESSED_20190725/
+          metadata_failedsamples_[cas_specific_labeling].txt
+          NAMED/
+            results_metabolites_named_[cas_specific_labeling].txt
+            metadata_metabolites_named_[cas_specific_labeling].txt
+            metadata_sample_named_[cas_specific_labeling].txt
+            metadata_experimentalDetails_named_[cas_specific_labeling].txt
+          UNNAMED/
+            results_metabolites_unnamed_[cas_specific_labeling].txt
+            metadata_metabolites_unnamed_[cas_specific_labeling].txt
+            metadata_sample_unnamed_[cas_specific_labeling].txt
+            metadata_experimentalDetails_unnamed_[cas_specific_labeling].txt
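Before running the package functions, it can help to confirm that a folder matches the layout listed above. The sketch below uses only base R on a throwaway folder; the helper `missing_prefixes` and the `_demo` file names are hypothetical, and the package's own validation may apply different or stricter rules.

```r
# Build a throwaway folder that mimics part of the layout (illustration only).
processed <- file.path(tempdir(), "PROCESSED_20190725")
dir.create(file.path(processed, "NAMED"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(processed, "UNNAMED"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(processed, "NAMED", "results_metabolites_named_demo.txt"))
file.create(file.path(processed, "NAMED", "metadata_metabolites_named_demo.txt"))
file.create(file.path(processed, "NAMED", "metadata_sample_named_demo.txt"))

# File-name prefixes expected in NAMED/, taken from the listing above.
expected_named <- c(
  "results_metabolites_named",
  "metadata_metabolites_named",
  "metadata_sample_named",
  "metadata_experimentalDetails_named"
)

# Hypothetical helper: which expected prefixes have no matching file?
missing_prefixes <- function(folder, prefixes) {
  files <- list.files(folder)
  found <- vapply(prefixes, function(p) any(startsWith(files, p)), logical(1))
  prefixes[!found]
}

missing_named <- missing_prefixes(file.path(processed, "NAMED"), expected_named)
print(missing_named)
```

Here the experimental details file was deliberately left out, so it is the only prefix reported as missing.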
+With the following file relations…
+To merge all data available in a PROCESSED_YYYYMMDD folder, run the following command:
t31_ionpneg <- combine_metabolomics_batch(input_results_folder = "/full/path/to/PROCESSED_YYYYMMDD/",
+                                          cas = "umichigan")
Alternatively, each individual dataset can also be provided. For example:
+plasma.untargeted.merged <-
+  merge_all_metabolomics(m_m_n = metadata_metabolites_named,
+                         m_m_u = metadata_metabolites_unnamed,
+                         m_s_n = metadata_sample_named,
+                         r_n = results_named,
+                         r_u = results_unnamed,
+                         phase = "PASS1A-06")
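To give a feel for what "merging results and metadata into a single data frame" can look like, here is a minimal base-R sketch on toy data. It does not reproduce `merge_all_metabolomics`; the toy values, the long output format, and the `value` column name are assumptions made for illustration.

```r
# Toy inputs (values are made up).
results_named <- data.frame(
  metabolite_name = c("glucose", "lactate"),
  S1 = c(10.5, 3.2),
  S2 = c(11.0, 2.9),
  stringsAsFactors = FALSE
)
metadata_metabolites_named <- data.frame(
  metabolite_name = c("glucose", "lactate"),
  refmet_name = c("Glucose", "Lactic acid"),
  stringsAsFactors = FALSE
)
metadata_sample_named <- data.frame(
  sample_id = c("S1", "S2"),
  sample_order = c(1, 2),
  stringsAsFactors = FALSE
)

# Wide to long: one row per metabolite/sample pair.
sample_cols <- setdiff(colnames(results_named), "metabolite_name")
long <- data.frame(
  metabolite_name = rep(results_named$metabolite_name, times = length(sample_cols)),
  sample_id = rep(sample_cols, each = nrow(results_named)),
  value = unlist(results_named[sample_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Attach metabolite and sample metadata, then tag the phase.
merged <- merge(long, metadata_metabolites_named, by = "metabolite_name")
merged <- merge(merged, metadata_sample_named, by = "sample_id")
merged$phase <- "PASS1A-06"
print(merged)
```

The long format keeps one measurement per row, which makes it straightforward to stack named and unnamed datasets afterwards.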
Check the function help for details
+Additional details for each function can be found by typing, for example:
+?merge_all_metabolomics
Need extra help? Please submit an issue here, providing as many details as possible.
+vignettes/qc_proteomics.Rmd
+ qc_proteomics.Rmd
First, download and install R and RStudio:
+Then, open RStudio and install the devtools package
install.packages("devtools")
Finally, install the MotrpacBicQC package
library(devtools)
+devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE)
Load the library
+library(MotrpacBicQC)
Then run any of the following tests to check that the package is correctly installed and works. For example:
+# Just copy and paste in the RStudio terminal
+test <- check_ratio_proteomics(df_ratio = metadata_metabolites_named,
+                               isPTM = TRUE)
+test <- check_rii_proteomics(df_rri = metadata_metabolites_named,
+                             isPTM = TRUE)
+test <- check_vial_metadata_proteomics(df_vm = metadata_metabolites_named)
which should generate the following outputs:
+- (-) The following required columns are missed: ptm_id, protein_id, gene_symbol, entrez_id
+- (-) The following required columns are missed: protein_id, sequence, ptm_id, ptm_peptide, gene_symbol, entrez_id, confident_score, confident_site
+- (-) The following required columns are missed: vial_label, tmt_plex, tmt11_channel
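These messages are expected here because the test calls pass a metabolomics example dataset to the proteomics checks, so none of the proteomics columns are present. A required-column check of this kind can be sketched in base R as follows; the helper `count_missing_columns` is hypothetical, and only the three vial metadata column names are taken from the output above.

```r
# Required columns as reported in the last message above.
required_vial_metadata <- c("vial_label", "tmt_plex", "tmt11_channel")

# Hypothetical helper: report and count missing required columns.
count_missing_columns <- function(df, required) {
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    message("- (-) The following required columns are missed: ",
            paste(missing_cols, collapse = ", "))
  } else {
    message("+ (+) All required columns present")
  }
  length(missing_cols)
}

# A metabolomics-style table has none of the proteomics columns...
metabolomics_like <- data.frame(metabolite_name = "glucose", mz = 179.056)
n_bad <- count_missing_columns(metabolomics_like, required_vial_metadata)

# ...while a table with the three columns passes (toy values).
vial_metadata <- data.frame(vial_label = 1, tmt_plex = "S1", tmt11_channel = "126C")
n_ok <- count_missing_columns(vial_metadata, required_vial_metadata)
```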
+Two approaches available:
+RESULTS_YYYYMMDD folder (recommended)
|-- PASS1B-06
+|   |-- T55
+|   |   |-- PROT_PH
+|   |   |   `-- BATCH1_20200312
+|   |   |       |-- RAW_20200312
+|   |   |       |   |-- 01MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |   |   |-- 01MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |   |   `-- 01MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |   |-- 02MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |   |   |-- 02MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |   |   `-- 02MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |   |-- 03MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |   |   |-- 03MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |   |   `-- 03MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |   |-- 04MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |   |   |-- 04MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |   |   `-- 04MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |   |-- 05MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |   |   |-- 05MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |   |   `-- 05MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |   `-- 06MOTRPAC_PASS1B-06_T55_PH_PN_20191231
+|   |   |       |       |-- 06MOTRPAC_PASS1B-06_T55_PH_PN_201912319_MANIFEST.txt
+|   |   |       |       `-- 06MOTRPAC_PASS1B-06_T55_PH_PN_201912319_TMTdetails.txt
+|   |   |       |-- RESULTS_20200909
+|   |   |       |   |-- MOTRPAC_PASS1B-06_T55_PH_PN_20200909_results_RII-peptide.txt
+|   |   |       |   |-- MOTRPAC_PASS1B-06_T55_PH_PN_20200909_results_ratio.txt
+|   |   |       |   `-- MOTRPAC_PASS1B-06_T55_PH_PN_20200909_vial_metadata.txt
+|   |   |       `-- file_manifest_20200910.csv
+Run the tests on the full submission with the following command:
+validate_proteomics(input_results_folder = "/full/path/to/RESULTS_YYYYMMDD",
+                    cas = "your_site_code",
+                    isPTM = TRUE) # isPTM has no default, so it must be set to TRUE or FALSE
+
+# in the example above...
+validate_proteomics(input_results_folder = "/full/path/to/PASS1B-06/T55/PROT_PH/BATCH1_20200312/RESULTS_20200909",
+                    cas = "pnnl",
+                    isPTM = TRUE,
+                    return_n_issues = FALSE)
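As a quick sanity check before calling `validate_proteomics`, the folder name and the three result files from the example tree can be inspected with base R. This is only a sketch: the naming rules encoded here are inferred from the example tree, and the variable names are illustrative rather than part of the package.

```r
results_folder <- "/full/path/to/PASS1B-06/T55/PROT_PH/BATCH1_20200312/RESULTS_20200909"

# The folder name is expected to look like RESULTS_YYYYMMDD.
folder_name <- basename(results_folder)
is_valid_name <- grepl("^RESULTS_[0-9]{8}$", folder_name)
folder_date <- as.Date(sub("^RESULTS_", "", folder_name), format = "%Y%m%d")

# File names copied from the example tree; in practice use list.files(results_folder).
files <- c(
  "MOTRPAC_PASS1B-06_T55_PH_PN_20200909_results_RII-peptide.txt",
  "MOTRPAC_PASS1B-06_T55_PH_PN_20200909_results_ratio.txt",
  "MOTRPAC_PASS1B-06_T55_PH_PN_20200909_vial_metadata.txt"
)

# Suffixes that identify each file type (inferred from the tree).
suffixes <- c(
  rii = "results_RII-peptide.txt",
  ratio = "results_ratio.txt",
  vial_metadata = "vial_metadata.txt"
)
found <- vapply(suffixes, function(s) any(endsWith(files, s)), logical(1))
print(found)
```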
cas is one of the following:
+# Open the ratio results file
+
+proteomics_ratio_results <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_ratio_proteomics(df_ratio = proteomics_ratio_results,
+                       isPTM = TRUE,
+                       printPDF = FALSE)
# Open your files
+proteomics_rii_results <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_rii_proteomics(df_rri = proteomics_rii_results, isPTM = TRUE)
# Open your files
+vial_metadata <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
+
+check_vial_metadata_proteomics(df_vm = vial_metadata)
Additional details for each function can be found by typing, for example:
+?check_vial_metadata_proteomics
Need extra help? Please submit an issue here, providing as many details as possible.
+Follow this link to learn how to test and use this package.
+Alternatively, once the package is installed, run the following command to access the same documentation:
browseVignettes("MotrpacBicQC")
Follow this link for details of all the available functions
diff --git a/docs/notes_developers.html b/docs/notes_developers.html
index e9e69271..24f4d1fb 100644
--- a/docs/notes_developers.html
+++ b/docs/notes_developers.html
@@ -71,7 +71,7 @@
check whether the proteomics ratio results file follows the guidelines
+check_ratio_proteomics(
+  df_ratio,
+  isPTM,
+  f_proof = TRUE,
+  output_prefix = "ratio-file",
+  out_qc_folder = NULL,
+  printPDF = TRUE,
+  return_n_issues = FALSE,
+  verbose = TRUE
+)

Argument | Description
---|---
df_ratio | (data.frame) proteomics ratio results data frame (required)
isPTM | (logical)
f_proof | (logical)
output_prefix | (char) if
out_qc_folder | (char) if
printPDF | (logical) if
return_n_issues | (logical) if
verbose | (logical)
(int) number of issues identified
+{
+test <- check_ratio_proteomics(df_ratio = metadata_metabolites_named,
+isPTM = TRUE, return_n_issues = TRUE, verbose = FALSE)
+# "test" should be NULL
+}
R/proteomics_qc.R
+ check_rii_proteomics.Rd
check whether the proteomics RII results file follows the guidelines
+check_rii_proteomics(
+  df_rri,
+  isPTM,
+  f_proof = TRUE,
+  output_prefix = "rii-file",
+  out_qc_folder = NULL,
+  return_n_issues = FALSE,
+  printPDF = TRUE,
+  verbose = TRUE
+)

Argument | Description
---|---
df_rri | (data.frame) proteomics RII data frame (required)
isPTM | (logical)
f_proof | (logical)
output_prefix | (char) if
out_qc_folder | (char) if
return_n_issues | (logical) if
printPDF | (logical) if
verbose | (logical)
(int) number of issues identified
+{
+test <- check_rii_proteomics(df_rri = metadata_metabolites_named,
+isPTM = TRUE, return_n_issues = TRUE, verbose = FALSE)
+# "test" should be NULL
+}
R/proteomics_qc.R
+ check_vial_metadata_proteomics.Rd
check whether the proteomics vial metadata file follows the guidelines
+check_vial_metadata_proteomics(df_vm, return_n_issues = FALSE, verbose = TRUE)

Argument | Description
---|---
df_vm | (data.frame) proteomics vial_label data frame (required)
return_n_issues | (logical) if
verbose | (logical)
(int) number of issues identified
+{
+test <- check_vial_metadata_proteomics(df_vm = metadata_metabolites_named,
+return_n_issues = TRUE, verbose = FALSE)
+# "test" should be NULL
+}
R/metabolomics_qc.R
+ R/misc.R
filter_required_columns.Rd
it returns a data frame with only the required columns for metadata_metabolites
+it returns a data frame with only the required columns for metabolomics and proteomics
 filter_required_columns(
   df,
-  type = c("m_m", "m_s"),
+  type = c("m_m", "m_s", "v_m"),
   name_id = NULL,
   verbose = TRUE
 )
@@ -148,6 +151,7 @@
(char) Type of file to filter columns:
m_m
: metadata metabolites
m_s
: metadata samples
v_m
: proteomics vial_metadata
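The effect of the new `v_m` type can be pictured with a small base-R sketch. The column sets below are assembled from the check messages shown in the vignettes earlier in this diff; they are assumptions and may not match the package's actual required lists, and `sketch_filter_required` is a hypothetical stand-in rather than the real `filter_required_columns`.

```r
# Assumed column sets, collected from the check messages in the vignettes.
required_columns <- list(
  m_m = c("metabolite_name", "refmet_name", "rt", "mz", "neutral_mass", "formula"),
  m_s = c("sample_id", "sample_type", "sample_order", "raw_file"),
  v_m = c("vial_label", "tmt_plex", "tmt11_channel")
)

# Hypothetical stand-in: keep only the required columns that are present.
sketch_filter_required <- function(df, type = c("m_m", "m_s", "v_m")) {
  type <- match.arg(type)
  keep <- intersect(required_columns[[type]], colnames(df))
  df[, keep, drop = FALSE]
}

# Toy vial metadata with one extra column that should be dropped.
vial <- data.frame(
  vial_label = 1:2,
  tmt_plex = c("S1", "S1"),
  tmt11_channel = c("126C", "127N"),
  notes = c("a", "b"),
  stringsAsFactors = FALSE
)
filtered <- sketch_filter_required(vial, type = "v_m")
print(colnames(filtered))
```

Using `match.arg` mirrors the `type = c("m_m", "m_s", "v_m")` signature in the diff, so an unsupported type fails early with a clear error.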
check proteomics ratio file
check results
check proteomics reported ion intensity file
check proteomics vial metadata file
filter required metadata_metabolites columns only
filter required columns only
Validate a Proteomics submissions
Validate a Proteomics submission
+validate_proteomics(
+  input_results_folder,
+  isPTM,
+  cas,
+  dmaqc_shipping_info = NULL,
+  f_proof = FALSE,
+  out_qc_folder = NULL,
+  return_n_issues = TRUE,
+  full_report = FALSE,
+  printPDF = TRUE,
+  verbose = TRUE
+)

Argument | Description
---|---
input_results_folder | (char) path to the RESULTS folder to check
isPTM | (logical)
cas | (char) CAS code
dmaqc_shipping_info | (char) phase code
f_proof | (logical) print out pdf with charts including:
out_qc_folder | (char) if f_proof is TRUE, then a folder must be provided
return_n_issues | (logical) if
full_report | (logical) if
printPDF | (logical) if
verbose | (logical)
(data.frame) Summary of issues
+ +