An implementation of the KatharoSeq protocol, originally defined in Minich et al 2018 mSystems and in Minich et al 2022
Installation assumes a working QIIME 2 environment with a minimum version of 2021.8. Details on installing QIIME 2 can be found here.
git clone https://github.com/biocore/q2-katharoseq.git
cd q2-katharoseq
pip install -e .
Computation assumes that the user has classified their 16S features against SILVA, and that the FeatureTable[Frequency]
has been collapsed to the genus level. Please see the q2-feature-classifier
for detail on how to perform taxonomy classification, and the q2-taxa
plugin for information on collapsing to a taxonomic level. If you need more information on how to process your data, please refer to one of the relevant tutorials that can be found here. For these examples, data from the Fish Microbiome Project (FMP): Fish microbiomes 101: disentangling the rules governing marine fish mucosal microbiomes across 101 species paper will be used, and can be found in the example
folder.
For a less stringent filter, an appropriate value for --p-threshold
would be around 50. For a more strict filter, use a value like 90, as shown below in the example.
In order to obtain a read count threshold, computation of a minimum read count threshold can be performed with the
read-count-threshold
plugin action. Test data can be found under the example
folder.
qiime katharoseq read-count-threshold \
--i-table example/fmp_collapsed_table.qza \
--m-positive-control-column-file example/fmp_metadata.tsv \
--m-positive-control-column-column control_rct \
--m-cell-count-column-file example/fmp_metadata.tsv \
--m-cell-count-column-column control_cell_into_extraction \
--p-positive-control-value control \
--p-control classic \
--p-threshold 90 \
--o-visualization result_fmp_example.qzv
Estimate the biomass of samples using KatharoSeq controls. After obtaining a read count threshold using the action above, use the same metadata and collapsed table as input. The --p-pcr-template-vol
and --p-dna-template-vol
values are numeric values that should come from your experimental procedures.
qiime katharoseq estimating-biomass \
--i-table example/fmp_collapsed_table.qza \
--m-control-cell-extraction-file example/fmp_metadata.tsv \
--m-control-cell-extraction-column control_cell_into_extraction \
--p-min-total-reads 1315 \
--p-positive-control-value control \
--m-positive-control-column-file example/fmp_metadata.tsv \
--m-positive-control-column-column control_rct \
--p-pcr-template-vol 5 \
--p-dna-extract-vol 60 \
--m-extraction-mass-g-column extraction_mass_g \
--m-extraction-mass-g-file example/fmp_metadata.tsv \
--o-estimated-biomass estimated_biomass_fmp_rct
Finally in order to visualize the results from estimating-biomass
, run biomass-plot
.
qiime katharoseq biomass-plot \
--i-table example/fmp_collapsed_table.qza \
--m-control-cell-extraction-file example/fmp_metadata.tsv \
--m-control-cell-extraction-column control_cell_into_extraction \
--p-min-total-reads 1315 \
--p-positive-control-value control \
--m-positive-control-column-file example/fmp_metadata.tsv \
--m-positive-control-column-column control_rct \
--o-visualization biomass_plot_fmp