This is a QIIME 2 wrapper for mOTU-tool. The tool will help you to assign taxonomy to your metagenomic samples. For details on QIIME 2, see https://qiime2.org.
mOTU is an attempt to build a taxonomy from genomic information about organisms, formalized through sequence differences in 40 universal marker genes (40 MGs). The basic unit of a taxonomic profile is an mOTU, so this is the unit used in the output. An mOTU differs from a classical taxon and may encompass one, a few, or no species. A detailed map of the relationship between mOTUs and standard taxonomic units is located in data/motus_taxonomy_map.tsv
- QIIME 2 >= 2022.8 (https://qiime2.org/)
- Git
QIIME 2 copies the data to a temporary directory. By default this is the /tmp folder, which may not have enough space to store the data. Before importing data, set the TMPDIR variable to a folder with enough free space.
export TMPDIR=/path/to/tmpdir
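Before a large import it can help to check how much space the temporary directory actually has. The following is a minimal sketch, not part of q2-mOTUs; the helper name free_kib and the 50 GiB threshold are illustrative assumptions that should be adapted to your data.

```shell
# Report free space (in KiB) on the filesystem that holds a directory.
free_kib() {
    # -P forces the portable one-line-per-filesystem layout; -k reports 1024-byte blocks.
    # The fourth column of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical threshold (50 GiB); choose a value that matches your dataset.
required_kib=$((50 * 1024 * 1024))

# Fall back to /tmp when TMPDIR is not set, as QIIME 2 does by default.
tmp="${TMPDIR:-/tmp}"
available_kib=$(free_kib "$tmp")

if [ "$available_kib" -lt "$required_kib" ]; then
    echo "Warning: only ${available_kib} KiB free in ${tmp}" >&2
else
    echo "OK: ${available_kib} KiB free in ${tmp}"
fi
```

If the warning appears, point TMPDIR at a larger volume before running the import.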
Follow the instructions on https://docs.qiime2.org/2022.8/install/native/ to install QIIME 2. You will need QIIME 2 version 2022.8 or later.
Activate the QIIME 2 environment by running the following command:
conda activate qiime2-2022.8
Start by installing mamba in your QIIME 2 environment. It resolves dependency conflicts faster:
conda activate qiime2-2022.8
conda install mamba -c conda-forge
Next, install q2-mOTUs
git clone https://github.com/motu-tool/q2-mOTUs
cd q2-mOTUs
make install
Test the installation
qiime dev refresh-cache
qiime motus --help
The plugin performs one function: it assigns taxonomy to metagenomic reads. There is therefore a single workflow.
Import your metagenomic sequencing data in .fastq format (don't forget to preprocess your data first) into QIIME 2 as a SampleData semantic type using a manifest file. See examples in q2_motus/tests/data.
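As a sketch of the manifest step, the helper below writes a tab-separated paired-end manifest for a directory of reads. It is not part of q2-mOTUs: the function name make_manifest and the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming pattern are assumptions. The commented import command assumes paired-end data with Phred33 qualities; check the manifests in q2_motus/tests/data for the type and format your data requires.

```shell
# Write a paired-end manifest (tab-separated) for every *_R1.fastq.gz in a directory.
# Assumption: mates are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
make_manifest() {
    # The manifest needs absolute paths, so resolve the directory first.
    reads_dir=$(cd "$1" && pwd) || return 1
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$reads_dir"/*_R1.fastq.gz; do
        [ -e "$fwd" ] || continue   # skip the literal pattern when nothing matches
        sample=$(basename "$fwd" _R1.fastq.gz)
        rev="$reads_dir/${sample}_R2.fastq.gz"
        if [ ! -e "$rev" ]; then
            echo "Skipping $sample: missing $rev" >&2
            continue
        fi
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
    done
}

# Usage (hypothetical paths; type and format are assumptions for paired-end data):
#   make_manifest /path/to/reads > manifest.tsv
#   qiime tools import \
#     --type 'SampleData[PairedEndSequencesWithQuality]' \
#     --input-format PairedEndFastqManifestPhred33V2 \
#     --input-path manifest.tsv \
#     --output-path paired-end.qza
```

Samples without a matching reverse file are skipped with a message on stderr, so the manifest only lists complete pairs.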
Whether you have a single sample or multiple samples, you can run mOTU-tool using the following command:
qiime motus profile \
--i-samples q2_motus/tests/data/paired-end.qza \
--o-taxonomy paired-end-taxonomy.qza \
--o-table paired-end-classified.qza \
--p-threads 4 \
--p-jobs 2
q2-mOTUs runs multiple instances of the motus profile command from the original software, which aligns reads to the reference using bwa mem. The execution time of the alignment step scales effectively (linearly) for up to 8 threads per job. The number of jobs you can run is the number of available CPUs divided by the number of threads used per job.
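The division above can be sketched in shell. This is an illustration only; the helper name max_jobs is hypothetical, and the result is rounded down and never drops below one job.

```shell
# Largest number of parallel jobs that fits the CPU budget:
# floor(cpus / threads), but never fewer than 1.
max_jobs() {
    n_cpus=$1
    n_threads=$2
    n_jobs=$((n_cpus / n_threads))
    [ "$n_jobs" -ge 1 ] || n_jobs=1
    echo "$n_jobs"
}

# getconf reports the number of online CPUs; fall back to 1 if it is unavailable.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
threads=4   # threads per job, as in the example command

echo "--p-threads $threads --p-jobs $(max_jobs "$cpus" "$threads")"
```

For example, a machine with 16 CPUs and 4 threads per job gives `max_jobs 16 4`, which prints 4.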
Attention: a precomputed mOTU table must be generated from profiles produced with the full taxonomy (-q) flag and the counts (-c) flag.
qiime motus import-table \
--i-motus-table $TMPDIR/merged.motus \
--o-table artifacts/motu-table.qza \
--o-taxonomy artifacts/motu-taxonomy.qza
- table - FeatureTable[Frequency] - A table of the counts of gene markers in samples.
- taxonomy - FeatureData[Taxonomy] - The full taxonomy for each gene marker.
Because table is a FeatureTable[Frequency] artifact, QIIME 2 offers many ways to analyze it. For example, use feature-table summarize:
qiime feature-table summarize \
--i-table paired-end-classified.qza \
--o-visualization paired-end-classified.qzv
This gives you a summary of your feature table.
Or create the all-time favourite, a taxa barplot:
qiime taxa barplot \
--i-table paired-end-classified.qza \
--i-taxonomy paired-end-taxonomy.qza \
--o-visualization paired-end-taxa-barplot.qzv
Or analyze the samples using the Metadata you have on hand!
Because of QIIME 2 naming conventions, parameter names differ between the plugin and the standalone version. The table below summarizes the differences.
q2-mOTUs parameter | mOTU parameter | Description |
---|---|---|
--p-min-alen | -l | Minimum length of the alignment. |
--p-marker-gene-cutoff | -g | Minimum number of marker genes to be considered a species. Ranges from 1 to 10. A higher value increases precision (and lowers recall). |
--p-mode | -y | The mode to use for abundance estimation. base.coverage measures the average base coverage of the gene. insert.raw_counts measures the number of reads that map to the gene. insert.scaled_counts measures the number of reads that map to the gene, scaled by the length of the gene. |
--p-reference-genomes / --p-no-reference-genomes | -e | Only use species with reference genomes (ref-mOTUs). |
--p-threads | -t | Number of threads to use. |
--p-jobs | -j | Number of jobs to run in parallel. |
If you use this tool, please cite the following paper:
@article{Ruscheweyh2022,
doi = {10.1186/s40168-022-01410-z},
url = {https://doi.org/10.1186/s40168-022-01410-z},
year = {2022},
month = dec,
publisher = {Springer Science and Business Media {LLC}},
volume = {10},
number = {1},
author = {Hans-Joachim Ruscheweyh and Alessio Milanese and Lucas Paoli and
Nicolai Karcher and Quentin Clayssen and Marisa Isabell Keller and
Jakob Wirbel and Peer Bork and Daniel R. Mende and Georg Zeller and
Shinichi Sunagawa},
title = {Cultivation-independent genomes greatly expand taxonomic-profiling
capabilities of {mOTUs} across various environments},
journal = {Microbiome}
}