Skip to content

Commit 5183660

Browse files
committed
add new vignettes
1 parent e6bc204 commit 5183660

3 files changed

+643
-0
lines changed
+198
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,198 @@
1+
---
2+
title: "basics analysis using MicrobiotaProcess"
3+
author: |
4+
| Shuangbin Xu
5+
| School of Basic Medical Sciences, Southern Medical University
6+
date: "`r Sys.Date()`"
7+
bibliography: MicrobiotaProcess.bib
8+
biblio-style: apalike
9+
output:
10+
prettydoc::html_pretty:
11+
toc: true
12+
theme: cayman
13+
highlight: vignette
14+
pdf_document:
15+
toc: true
16+
vignette: >
17+
%\VignetteIndexEntry{ MicrobiotaProcess: basics analysis using MicrobiotaProcess.}
18+
%\VignetteEngine{knitr::rmarkdown}
19+
%\usepackage[utf8]{inputenc}
20+
%\VignetteEncoding{UTF-8}
21+
---
22+
23+
```{r, echo=FALSE, results="asis", message=FALSE, KnitrSetUp}
24+
knitr::opts_chunk$set(tidy=FALSE,warning=FALSE,message=FALSE)
25+
Biocpkg <- function (pkg){
26+
sprintf("[%s](http://bioconductor.org/packages/%s)", pkg, pkg)
27+
}
28+
29+
CRANpkg <- function(pkg){
30+
cran <- "https://CRAN.R-project.org/package"
31+
fmt <- "[%s](%s=%s)"
32+
sprintf(fmt, pkg, cran, pkg)
33+
}
34+
```
35+
36+
```{r, echo=FALSE, results="hide", message=FALSE, Loadpackages}
37+
library(ggplot2)
38+
library(DT)
39+
library(tidyverse)
40+
library(phyloseq)
41+
library(ggtree)
42+
library(treeio)
43+
library(tidytree)
44+
library(MicrobiotaProcess)
45+
```
46+
47+
# 1. Introduction
48+
49+
`MicrobiotaProcess` is an R package for analysis, visualization and biomarker discovery of microbial datasets. It supports the import of microbiome census data, calculating alpha index and provides functions to visualize rarefaction curves. Moreover, it also supports visualizing the abundance of taxonomy of samples. And It also provides functions to perform the `PCA`, `PCoA` and hierarchical cluster analysis. In addition, `MicrobiotaProcess` also provides a method for the biomarker discovery of metagenome or other datasets.
50+
51+
# 2. `MicrobiotaProcess` profiling
52+
53+
## 2.1 import function
54+
55+
`MicrobiotaProcess` has an feature to import phylogenetic sequencing data from `r Biocpkg("dada2")`[@callahan2016dada2] and [qiime2](https://qiime2.org/)[@bolyen2019qiime2] taxonomic clustering pipelines. The resulting object after import is `r Biocpkg("phyloseq")` object[@Paul2013phyloseq]
56+
57+
```{r, error=TRUE, importFunction}
58+
# import data from dada2 pipeline.
59+
seqtabfile <- system.file("extdata", "seqtab.nochim.rds", package="MicrobiotaProcess")
60+
seqtab <- readRDS(seqtabfile)
61+
taxafile <- system.file("extdata", "taxa_tab.rds",package="MicrobiotaProcess")
62+
seqtab <- readRDS(seqtabfile)
63+
taxa <- readRDS(taxafile)
64+
# the seqtab and taxa are output of dada2
65+
sampleda <- system.file("extdata", "mouse.time.dada2.txt", package="MicrobiotaProcess")
66+
ps_dada2 <- import_dada2(seqtab=seqtab, taxatab=taxa, sampleda=sampleda)
67+
ps_dada2
68+
69+
# import data from qiime2 pipeline
70+
otuqzafile <- system.file("extdata", "table.qza", package="MicrobiotaProcess")
71+
taxaqzafile <- system.file("extdata", "taxa.qza", package="MicrobiotaProcess")
72+
mapfile <- system.file("extdata", "metadata_qza.txt", package="MicrobiotaProcess")
73+
ps_qiime2 <- import_qiime2(otuqza=otuqzafile, taxaqza=taxaqzafile, mapfilename=mapfile)
74+
ps_qiime2
75+
```
76+
77+
## 2.2 Rarefaction visualization
78+
79+
Rarefaction, based on sampling technique, was used to compensate for the effect of sample size on the number of units observed in a sample[@siegel2004rarefaction]. `MicrobiotaProcess` provided `ggrarecurve` to plot the curves, based on `rrarefy` of `r CRANpkg("vegan")`[@Jari2019vegan].
80+
81+
```{r, error=TRUE, fig.align="center", fig.height=3.2, fig.width=7.5, rarefaction}
82+
# for reproducibly random number
83+
set.seed(1024)
84+
p_rare <- ggrarecurve(obj=ps_dada2,
85+
indexNames=c("Observe","Chao1","ACE"),
86+
chunks=300) +
87+
theme(legend.spacing.y=unit(0.02,"cm"),
88+
legend.text=element_text(size=6))
89+
p_rare
90+
91+
```
92+
93+
## 2.3 calculate alpha index and visualization
94+
95+
`MicrobiotaProcess` provides `get_alphaindex` to calculate alpha index and the `ggbox` to visualize the result
96+
97+
```{r, error=TRUE, fig.width=7.5, fig.height=3.2, fig.align="center", alphaindex}
98+
alphaobj <- get_alphaindex(ps_dada2)
99+
head(as.data.frame(alphaobj))
100+
p_alpha <- ggbox(alphaobj, geom="violin", factorNames="time") +
101+
scale_fill_manual(values=c("#2874C5", "#EABF00"))+
102+
theme(strip.background = element_rect(colour=NA, fill="grey"))
103+
p_alpha
104+
```
105+
106+
## 2.4 The visualization of taxonomy abundance
107+
108+
`MicrobiotaProcess` presents the `ggbartax` for the visualization of composition of microbial communities.
109+
110+
```{r, error=TRUE, fig.align="center", fig.height=4.5, fig.width=7, taxabar}
111+
# relative abundance
112+
otubar <- ggbartax(obj=ps_dada2) +
113+
xlab(NULL) +
114+
ylab("relative abundance (%)")
115+
otubar
116+
```
117+
118+
If you want to get the abundance of specific levels of class, You can use `get_taxadf` then use `ggbartax` to visualize.
119+
120+
```{r, error=TRUE, fig.align="center", fig.height=4.5, fig.width=7, phylumAbundance}
121+
phytax <- get_taxadf(obj=ps_dada2, taxlevel=2)
122+
phybar <- ggbartax(obj=phytax) +
123+
xlab(NULL) + ylab("relative abundance (%)")
124+
phybar
125+
```
126+
127+
Moreover, the abundance (count) of taxonomy also can be visualized by setting count to TRUE, and the facet of plot can be showed by setting the facetNames.
128+
129+
```{r, error=TRUE, fig.align="center", fig.height=4.5, fig.width=7, classAbundance}
130+
phybar2 <- ggbartax(obj=phytax, facetNames="time", count=TRUE) + xlab(NULL) + ylab("abundance")
131+
phybar2
132+
classtax <- get_taxadf(obj=ps_dada2, taxlevel=3)
133+
classbar <- ggbartax(obj=classtax, facetNames="time", count=FALSE) +
134+
xlab(NULL) + ylab("relative abundance (%)")
135+
classbar
136+
```
137+
138+
## 2.5 PCA and PCoA analysis
139+
140+
`PCA` (Principal component analysis) and `PCoA` (Principal Coordinate Analysis) are general statistical procedures to compare groups of samples. And `PCoA` can based on the phylogenetic or count-based distance metrics, such as `Bray-Curtis`, `Jaccard`, `Unweighted-UniFrac` and `weighted-UniFrac`. `MicrobiotaProcess` presents the `get_pca`, `get_pcoa` and `ggordpoint` for the analysis.
141+
142+
```{r, error=TRUE, fig.align="center", fig.height=4.5, fig.width=6, ordanalysis}
143+
pcares <- get_pca(obj=ps_dada2, method="hellinger")
144+
# Visulizing the result
145+
pcaplot <- ggordpoint(obj=pcares, biplot=TRUE, speciesannot=TRUE,
146+
factorNames=c("time"), ellipse=TRUE) +
147+
scale_color_manual(values=c("#2874C5", "#EABF00"))
148+
pcaplot
149+
150+
pcoares <- get_pcoa(obj=ps_dada2, distmethod="euclidean", method="hellinger")
151+
# Visualizing the result
152+
pcoaplot <- ggordpoint(obj=pcoares, biplot=TRUE, speciesannot=TRUE,
153+
factorNames=c("time"), ellipse=TRUE) +
154+
scale_color_manual(values=c("#2874C5", "#EABF00"))
155+
pcoaplot
156+
```
157+
158+
## 2.6 Hierarchical cluster analysis
159+
160+
`beta diversity` metrics can assess the differences between microbial communities. It can be visualized with `PCA` or `PCoA`, this can also be visualized with hierarchical clustering. `MicrobiotaProcess` also implements the analysis based on ggtree[@yu2017ggtree].
161+
162+
```{r, fig.align="center", fig.height=5, fig.width=6, error=TRUE, hclustAnalysis}
163+
hcsample <- get_clust(obj=ps_dada2, distmethod="euclidean",
164+
method="hellinger", hclustmethod="average")
165+
# rectangular layout
166+
clustplot1 <- ggclust(obj=hcsample,
167+
layout = "rectangular",
168+
pointsize=1,
169+
fontsize=0,
170+
factorNames=c("time")) +
171+
scale_color_manual(values=c("#2874C5", "#EABF00")) +
172+
theme_tree2(legend.position="right",
173+
plot.title = element_text(face="bold", lineheight=25,hjust=0.5))
174+
clustplot1
175+
# circular layout
176+
clustplot2 <- ggclust(obj=hcsample,
177+
layout = "circular",
178+
pointsize=1,
179+
fontsize=2,
180+
factorNames=c("time")) +
181+
scale_color_manual(values=c("#2874C5", "#EABF00")) +
182+
theme(legend.position="right")
183+
clustplot2
184+
```
185+
186+
# 3. Need helps?
187+
188+
If you have questions/issues, please visit [github issue tracker](https://github.com/YuLab-SMU/MicrobiotaProcess/issues).
189+
190+
# 4. Session information
191+
192+
Here is the output of sessionInfo() on the system on which this document was compiled:
193+
194+
```{r, echo=FALSE}
195+
sessionInfo()
196+
```
197+
198+
# 5. References

0 commit comments

Comments
 (0)