Option manual#
Here, we provide a full overview of mOTUs commands and their options, as well as recommendations on parameter selection.
In the command line, you can always type motus <command> to obtain a short description of its options and usage.
To execute the tool you need to call motus <command> [options], where <command> can be:
profile, perform taxonomic profiling on a sample. The command has the following subroutines:
    map_tax, map reads to the marker gene database and output a SAM/BAM file,
    calc_mgc, aggregate reads from the same marker gene cluster (MGC) and output the MGC abundance table. It uses the SAM/BAM file produced by map_tax,
    calc_motu, produce the mOTUs abundance table from an MGC abundance table (created by calc_mgc).
downloadMGDB, download the mOTUs marker gene database,
merge, merge multiple sample profiles into a single table,
classify, determine which mOTUs your genomes belong to,
prep_long, convert long read data into an input format suitable for motus profile,
genomes, search the mOTUs genome database for genomes matching specified taxonomic and functional annotations,
download, download sequence data for any genome within the mOTUs database.
motus profile#
Produces a taxonomic profile from short read metagenomic sequencing data by executing motus map_tax, motus calc_mgc, and motus calc_motu in succession. Can be used to profile long read metagenomic sequencing data after running motus prep_long first.
command line interface
Program: motus - a tool for marker gene-based OTU (mOTU) profiling
Version: 4.0.4
References:
Profiler: Ruscheweyh, Milanese et al. Cultivation-independent genomes greatly expand
taxonomic-profiling capabilities of mOTUs across various environments. Microbiome (2022).
doi: https://doi.org/10.1186/s40168-022-01410-z
Database: Dmitrijeva, Ruscheweyh et al. The mOTUs online database provides web-accessible
genomic context to taxonomic profiling of microbial communities. Nucleic Acids Research (2025).
doi: https://doi.org/10.1093/nar/gkae1004
Summary:
The profile command is the main function in mOTUs that executes map_tax, calc_mgc,
and calc_motu in sequence. It takes short read metagenomic sequencing data as input
and generates a taxonomic profile.
Usage:
motus profile -f FILE [FILE ...] -r FILE [FILE ...] -s FILE [FILE ...] -o FILE [options]
motus profile -f FILE [FILE ...] -r FILE [FILE ...] -o FILE [options]
motus profile -s FILE [FILE ...] -o FILE [options]
Input options:
-f, --forward FILE [FILE ...]
Input file(s) for reads in forward orientation, FastQ/A(.gz)-formatted
-r, --reverse FILE [FILE ...]
Input file(s) for reads in reverse orientation, FastQ/A(.gz)-formatted
-s, --single FILE [FILE ...]
Input file(s) for unpaired reads, FastQ/A(.gz)-formatted
-n, --sample-name STR
Sample name (default: 'unnamed sample')
Output options:
-o, --output-file FILE
Output file name [required]
Algorithm options:
-g, --marker-genes INT
Required number of marker genes for a mOTU to be called present:
1=higher recall, 6=higher precision, 10=maximum (default: 3)
-l, --alignment-length INT
Minimum length of the alignment (bp) (default: 75)
-t, --threads INT
Number of threads (default: 1)
-y, --counting-mode STR
Which scale the abundances are reported in (default: INSERT_SCALED)
Choices: [INSERT_RAW, INSERT_NORM, INSERT_SCALED, BASE_RAW, BASE_NORM]
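As a rough illustration of the usage patterns above, the following sketch assembles one motus profile call per paired-end sample. The sample IDs, the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming scheme, the `.motus` output suffix, and the thread count are assumptions made for this example; only the flags themselves come from the help text. The commands are printed rather than executed so they can be reviewed first.

```shell
# Hypothetical sample IDs; input files are assumed to be named
# <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
samples="sampleA sampleB"
threads=4

# Build one motus profile call per sample.
# -n sets the sample name, which motus merge needs later.
commands=$(for s in $samples; do
    echo "motus profile -f ${s}_R1.fastq.gz -r ${s}_R2.fastq.gz -n ${s} -o ${s}.motus -g 3 -l 75 -t ${threads}"
done)

# Print the commands for review; pipe them to `sh` once the paths are correct.
echo "$commands"
```

Lowering -g toward 1 favours recall and raising it toward 6 or above favours precision, as described in the help text.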
motus map_tax#
Maps short reads against the mOTUs marker gene database.
command line interface
Summary:
The map_tax command takes short read metagenomic sequencing data as input and
maps reads to the mOTUs marker gene database.
Usage:
motus map_tax -f FILE [FILE ...] -r FILE [FILE ...] -s FILE [FILE ...] -o FILE [options]
motus map_tax -f FILE [FILE ...] -r FILE [FILE ...] -o FILE [options]
motus map_tax -s FILE [FILE ...] -o FILE [options]
Input options:
-f, --forward FILE [FILE ...]
Input file(s) for reads in forward orientation, FastQ/A(.gz)-formatted
-r, --reverse FILE [FILE ...]
Input file(s) for reads in reverse orientation, FastQ/A(.gz)-formatted
-s, --single FILE [FILE ...]
Input file(s) for unpaired reads, FastQ/A(.gz)-formatted
Output options:
-o, --output-file FILE
Output file name [required]
Algorithm options:
-l, --alignment-length INT
Minimum length of the alignment (bp) (default: 75)
-t, --threads INT
Number of threads (default: 1)
Required arguments#
Input -f, -r, -s: one or multiple FastQ/A files, which can be gzipped. Ensure the order of input files is the same for -f and -r if using paired-end data.
Output -o: path to the output file.
Option overview#
| Option | Description |
|---|---|
| -f, --forward | Input path - Paired Forward: One or more FastQ/A files (optionally gzipped) containing forward reads. The input files must have the same order for both forward and reverse. |
| -r, --reverse | Input path - Paired Reverse: One or more FastQ/A files (optionally gzipped) containing reverse reads. The input files must have the same order for both forward and reverse. |
| -s, --single | Input path - Single: One or more FastQ/A files (optionally gzipped). The order of the input files does not matter for single-end files. |
| -o, --output-file | Output prefix: Path to the output files. This prefix is also used for intermediate files. |
| -l, --alignment-length | Length: Filter alignments if their length is below this value. Default value is 75. |
| -t, --threads | Threads: Number of threads to use for the alignment step. Default is 1. |
motus calc_mgc#
Calculates number of inserts mapping to each marker gene cluster within the mOTUs marker gene database.
command line interface
Summary:
The calc_mgc command takes a file storing the alignments of sequencing reads
to the mOTUs marker gene database and calculates marker gene cluster abundances.
Usage:
motus calc_mgc -i FILE -o FILE [options]
Input options:
-i, --input-file FILE
Path to BAM file generated after running the motus map_tax command [required]
Output options:
-o, --output-file FILE
Output file name [required]
Algorithm options:
-l, --alignment-length INT
Minimum length of the alignment (bp) (default: 75)
Required arguments#
Input -i: a SAM or BAM file generated after running motus map_tax.
Output -o: path to the output file.
Option overview#
| Option | Description |
|---|---|
| -i, --input-file | Input path: Path to BAM file generated after running motus map_tax. |
| -o, --output-file | Output path: Path to the output file. |
| -l, --alignment-length | Length: Filter alignments if their length is below this value. Default value is 75. |
motus calc_motu#
Calculates the taxonomic profile based on number of inserts mapped to the corresponding marker gene clusters.
command line interface
Summary:
The calc_motu command takes a file containing marker gene cluster
abundances and generates a taxonomic profile.
Usage:
motus calc_motu -i FILE -o FILE [options]
Input options:
-i, --input-file FILE
MGC abundance table generated by the calc_mgc command [required]
-n, --sample-name STR
Sample name (default: 'unnamed sample')
Output options:
-o, --output-file FILE
Output file name [required]
Algorithm options:
-g, --marker-genes INT
Required number of marker genes for a mOTU to be called present:
1=higher recall, 6=higher precision, 10=maximum (default: 3)
-y, --counting-mode STR
Which scale the abundances are reported in (default: INSERT_SCALED)
Choices: [INSERT_RAW, INSERT_NORM, INSERT_SCALED, BASE_RAW, BASE_NORM]
Required arguments#
Input -i: the MGC file generated after running motus calc_mgc.
Output -o: path to the taxonomic profiles containing abundances as counts and as relative abundances.
Option overview#
| Option | Description |
|---|---|
| -i, --input-file | Input path: Path to MGC abundance table generated after running motus calc_mgc. |
| -o, --output-file | Output path: Path to the output files. |
| -n, --sample-name | Sample name: Name of the sample. Required when merging samples. The default value is 'unnamed sample'. |
| -g, --marker-genes | Sensitivity: The number of marker genes with abundance required to call a mOTU present. Default value is 3. |
| -y, --counting-mode | Counting method: mOTUs can count in different modes, the default mode is INSERT_SCALED. |
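Since motus profile runs map_tax, calc_mgc, and calc_motu in succession, the three subroutines can also be chained by hand, for example to keep the intermediate files. The sketch below only prints the three commands. The sample ID, the input file names, and the `.bam`, `.mgc`, and `.motus` suffixes are assumptions made for illustration; the flags and default values are taken from the help texts above.

```shell
# Hypothetical sample ID and file names.
sample="sampleA"
bam="${sample}.bam"        # alignment file written by map_tax
mgc="${sample}.mgc"        # MGC abundance table written by calc_mgc
profile="${sample}.motus"  # final taxonomic profile written by calc_motu

# Step 1: map reads to the marker gene database.
step1="motus map_tax -f ${sample}_R1.fastq.gz -r ${sample}_R2.fastq.gz -o ${bam} -l 75 -t 4"
# Step 2: aggregate alignments per marker gene cluster.
step2="motus calc_mgc -i ${bam} -o ${mgc} -l 75"
# Step 3: turn MGC abundances into a named taxonomic profile.
step3="motus calc_motu -i ${mgc} -n ${sample} -o ${profile} -g 3 -y INSERT_SCALED"

# Print the commands in execution order for review.
printf '%s\n' "$step1" "$step2" "$step3"
```

Each step consumes the output file of the previous one, so the order of execution matters.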
motus downloadMGDB#
Downloads the marker gene reference database required for profiling.
command line interface
Summary:
The downloadMGDB command downloads the marker gene reference database used
by the profile and map_tax commands.
Usage:
motus downloadMGDB [options]
Options:
-f, --force
Force download even when database is already present
Option overview#
| Option | Description |
|---|---|
| -f, --force | Force: Download the database even if it’s already present. |
motus merge#
Merges taxonomic profiles from multiple samples into one (tab-separated) table.
Note: Requires that each profile is named (using -n in motus profile or motus calc_motu).
command line interface
Summary:
The merge command takes multiple profiles produced after running the
profile command and combines them into a single table.
Usage:
motus merge -i FILE [FILE ...] -o FILE
Input options:
-i, --input-files FILE [FILE ...]
A list of mOTUs profile files or a text file containing the list of profile
files to be merged, with one line per file [required]
Output options:
-o, --output-file FILE
Output file name [required]
Required arguments#
Input -i: mOTUs profile files to merge. The input can be provided as a text file with one line per profile or as a space-separated list containing multiple mOTUs profiles. At least two profiles are required. Note: These files must be generated using the same mOTUs version and the same parameters.
Output -o: path to the merged profile file.
Option overview#
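| Option | Description |
|---|---|
| -i, --input-files | Input files: A space-separated list of profiles produced after running motus profile, or a text file listing one profile per line. |
| -o, --output-file | Output path: Path to the output file containing merged profiles. |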
Option description#
Input files#
| Option | Description |
|---|---|
| -i, --input-files | Input files: A space-separated list of profiles produced after running motus profile, or a text file listing one profile per line. |
We will use samples from project PRJEB52368 as an example. After running motus profile on selected samples, we obtain the following output files:
SAMEA5998847.mOTUs4, SAMEA6009611.mOTUs4, SAMEA6009792.mOTUs4, SAMEA6009843.mOTUs4, and SAMEA6009888.mOTUs4.
IMPORTANT: When running motus profile, sample names should be indicated using the -n parameter.
Unnamed samples are unsuitable for merging.
Correct usage
Listing profiles directly on the command line:
motus merge -i SAMEA5998847.mOTUs4 SAMEA6009611.mOTUs4 SAMEA6009792.mOTUs4 SAMEA6009843.mOTUs4 SAMEA6009888.mOTUs4 -o PRJEB52368.tsv
Indicating a file with a list of profiles:
motus merge -i profiles_to_merge.txt -o PRJEB52368.tsv
where the content of profiles_to_merge.txt is:
/path/to/SAMEA5998847.mOTUs4
/path/to/SAMEA6009611.mOTUs4
/path/to/SAMEA6009792.mOTUs4
/path/to/SAMEA6009843.mOTUs4
/path/to/SAMEA6009888.mOTUs4
Incorrect usage (mOTUs will abort with an error)
Only one profile is provided, but merge requires at least two:
motus merge -i SAMEA5998847.mOTUs4 -o PRJEB52368.tsv
The same sample is indicated twice:
motus merge -i SAMEA5998847.mOTUs4 SAMEA6009611.mOTUs4 SAMEA6009611.mOTUs4 -o PRJEB52368.tsv
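The two failure cases above can be caught before calling motus merge. The following is a minimal sketch of such a pre-check on a list file, not part of mOTUs itself: check_merge_list is a hypothetical helper, the list holds placeholder paths, and the check only compares paths, so it would not notice two different files that carry the same sample name.

```shell
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
list="$workdir/profiles_to_merge.txt"

# Placeholder list; in practice this holds the real profile paths.
printf '%s\n' \
    /path/to/SAMEA5998847.mOTUs4 \
    /path/to/SAMEA6009611.mOTUs4 \
    /path/to/SAMEA6009792.mOTUs4 > "$list"

# Hypothetical helper: $1 is a text file with one profile path per line.
check_merge_list() {
    n=$(grep -c . "$1")             # number of non-empty lines
    dups=$(sort "$1" | uniq -d)     # paths listed more than once
    if [ "$n" -lt 2 ]; then
        echo "need at least two profiles, found $n"
        return 1
    fi
    if [ -n "$dups" ]; then
        echo "listed more than once: $dups"
        return 1
    fi
    echo "ok: $n profiles"
}

# Print the merge command only if the list passes both checks.
check_merge_list "$list" && echo "motus merge -i $list -o PRJEB52368.tsv"
```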
motus classify#
Assigns provided genomes to a mOTU if the corresponding taxon is present within the database.
command line interface
Summary:
The classify command takes a list of genome sequence files as input and
assigns these genomes to existing mOTUs in the database.
Usage:
motus classify -i FILE -o FILE [options]
Input options:
-i, --input-file FILE
Text file listing genome sequence files in FastA(.gz) format to classify.
One line per genome file [required]
Output options:
-o, --output-file FILE
Output file name. Each line contains a genome and its associated mOTU [required]
Algorithm options:
-t, --threads INT
Number of threads (default: 1)
Required arguments#
Input -i: a text file containing a list of paths to genome files in FastA(.gz) format.
Output -o: path to output file, which contains one line per genome with its associated mOTU.
Option overview#
| Option | Description |
|---|---|
| -i, --input-file | Input file - Genome list: A text file listing genome files that will be associated with existing mOTUs. |
| -o, --output-file | Output path - Classification: The output file, containing one line per genome with its associated mOTU. |
| -t, --threads | Threads: Number of threads to use for aligning against the mOTUs MG database. Default is 1. |
Option description#
Input file#
| Option | Description |
|---|---|
| -i, --input-file | Input file - Genome list: A text file listing genome files that will be associated with existing mOTUs. |
The input of motus classify is a text file listing genome sequence files in FastA format, one file per line.
The genome sequence files can be gzipped, and the filenames of all genomes must be unique.
Correct usage
The content of genomes.txt is:
$ cat genomes.txt
/a/b/c.fa
/a/c/d.fa.gz
Incorrect usage (mOTUs will abort with an error)
Two genome files are listed on the same line:
$ cat genomes.txt
/a/b/c.fa /a/c/d.fa
The file name c.fa is not unique:
$ cat genomes.txt
/a/b/c.fa
/a/c/c.fa
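The two input rules (one genome file per line, unique file names) can be checked before running motus classify. The sketch below is not part of mOTUs: check_genome_list is a hypothetical helper that treats any whitespace on a line as a sign of multiple paths, so paths containing spaces would be rejected, and it compares file names exactly, without stripping extensions.

```shell
# Hypothetical helper: $1 is a text file with one genome path per line.
check_genome_list() {
    # Any whitespace inside a line suggests more than one path on that line.
    if grep -q '[[:space:]]' "$1"; then
        echo "more than one path on a line"
        return 1
    fi
    # Strip directories and look for repeated file names.
    dups=$(sed 's|.*/||' "$1" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "file name not unique: $dups"
        return 1
    fi
    echo "ok"
}

# Demonstrate on the valid example list shown above, in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' /a/b/c.fa /a/c/d.fa.gz > "$demo_dir/genomes.txt"
check_genome_list "$demo_dir/genomes.txt"
```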
Output file#
| Option | Description |
|---|---|
| -o, --output-file | Output path - Classification: The output file, containing one line per genome with its associated mOTU. |
The tabular output file contains one line per submitted genome. Each line indicates either the assigned mOTU, <6MGs-no_mOTU if the genome lacked at least 5 out of 10 marker genes, or Novel-no_mOTU if the genome had at least 6 marker genes but couldn’t be assigned to any mOTU.
Threads#
| Option | Description |
|---|---|
| -t, --threads | Threads: Number of threads to use for aligning against the mOTUs MG database. Default is 1. |
motus classify is only partly multi-threaded. Increasing the number of threads, up to about 32, usually gives a considerable speed-up when classifying larger genome sets.
motus prep_long#
Prepares long reads for profiling by mOTUs. Has to be run before the motus profile command.
command line interface
Summary:
The prep_long command takes long-read sequencing data and converts it
into the appropriate input format to be used by the profile and map_tax commands.
Usage:
motus prep_long -i FILE -o FILE [options]
Input options:
-i, --input-file FILE
Long-read sequencing file to convert, can be in FastQ/A(.gz) format [required]
Output options:
-o, --output-file FILE
Output file name. This converted file is ready to be used by motus profile [required]
Algorithm options:
-sl, --splitting-length INT
Target fragment length (in bp) for splitting long reads (default: 300)
-ml, --minimum-length INT
Minimum read length after splitting. Shorter reads are discarded (default: 50)
Required arguments#
Input -i: input file containing long reads in FastQ/A(.gz) format.
Output -o: output FastA file containing converted reads. Appropriate to be used as input for motus profile.
Option overview#
| Option | Description |
|---|---|
| -i, --input-file | Input path: The input file containing long reads in FastQ/A(.gz) format. |
| -o, --output-file | Output path: The output file containing converted short reads in FastA format. |
| -sl, --splitting-length | Splitting length: Target length of the converted reads. Default value is 300. |
| -ml, --minimum-length | Minimum length: Converted reads shorter than indicated length will not be written to the output. The default value is 50. |
Option description#
Splitting length and minimum length#
The motus prep_long function splits every long read in the dataset into non-overlapping fragments of 300 base pairs (or the value of -sl) in length:
|--SL--|                                                |--- >ML?
|======|======|======|======|======|======|======|======|===
The final fragment is only written to the output file if its length is at least 50 base pairs (or the value of -ml). Fragments do not overlap, because overlapping fragments would distort base coverage quantification.
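The splitting rule can be made concrete with a little arithmetic. The sketch below is an illustration of the rule as described here, not the mOTUs implementation: count_fragments is a hypothetical helper that returns how many fragments one read of a given length would contribute for a given splitting length and minimum length.

```shell
# Hypothetical helper: number of fragments written for one long read.
# $1: read length in bp, $2: splitting length (-sl), $3: minimum length (-ml)
count_fragments() {
    read_len=$1
    sl=${2:-300}
    ml=${3:-50}
    full=$(( read_len / sl ))   # fragments of exactly sl bp
    rest=$(( read_len % sl ))   # length of the final, shorter fragment
    # Full fragments only count if sl itself meets the minimum length.
    if [ "$sl" -lt "$ml" ]; then
        full=0
    fi
    # The final fragment is kept only if it is non-empty and at least ml bp long.
    if [ "$rest" -gt 0 ] && [ "$rest" -ge "$ml" ]; then
        echo $(( full + 1 ))
    else
        echo "$full"
    fi
}

count_fragments 1000   # 3 x 300 bp plus a final 100 bp fragment: prints 4
count_fragments 620    # 2 x 300 bp, the final 20 bp fragment is dropped: prints 2
```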
motus genomes#
Queries the mOTUs genome database to find genomes matching indicated mOTUs identifiers, taxonomic clades, or functional annotations.
Note
First-time execution of this command downloads the mOTUs annotation database, which requires 17.7G of storage.
command line interface
Summary:
The genomes command queries the mOTUs-db based on identifiers, functional,
or taxonomic annotations and returns a list of genomes matching indicated query.
Usage:
motus genomes -i FILE -o FILE [options]
motus genomes -i STR [STR ...] -o FILE [options]
Input options:
-i, --input-queries FILE/STR
Can be either a list of search queries or a text file listing search queries
with one line per query. Queries can be genome or mOTUs identifiers, PFAM, KEGG, EGGNOG,
or GTDB taxonomy names. If the query does not exactly match any database entry,
alternative queries will be suggested [required]
Output options:
-o, --output-file FILE
Output file containing a list of genome identifiers matching search queries and their
annotations as indicated by the -d parameter. This output file can be used as input
for the motus download command [required]
-d, --details STR [STR ...]
List of annotations to report. Choose any combination of [KEGG, PFAM, EGGNOG, TAXONOMY],
for example, -d KEGG PFAM.
Required arguments#
Input -i: a list of search queries separated by space. Alternatively, a text file listing search queries, with one query per line.
Output -o: output table containing genome identifiers matching search queries and annotations requested by the user. This file is appropriate to be used as input for motus download.
Option overview#
| Option | Description |
|---|---|
| -i, --input-queries | Input - List of queries: A list of terms to query within the mOTUs annotation database. |
| -o, --output-file | Output file: The output table contains an overview of genomes matching indicated queries, accompanied by annotations specified with the -d option. |
| -d, --details | Details: Annotations to report. Options include KEGG, PFAM, EGGNOG, and TAXONOMY. |
Option description#
List of queries#
| Option | Description |
|---|---|
| -i, --input-queries | Input - List of queries: A list of terms to query within the mOTUs annotation database. |
A list of search queries separated by space. Alternatively, a text file listing search queries, with one query per line.
Output file#
| Option | Description |
|---|---|
| -o, --output-file | Output file: The output table contains an overview of genomes matching indicated queries, accompanied by annotations specified with the -d option. |
By default, only the names of the genomes and the search query are reported. Using the -d option adds columns with functional and taxonomic annotations.
Details#
| Option | Description |
|---|---|
| -d, --details | Details: Annotations to report. Options include KEGG, PFAM, EGGNOG, and TAXONOMY. |
The -d option allows users to specify which annotation to report when using the motus genomes command. Reporting multiple annotation types is possible, e.g. by -d KEGG PFAM.
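As a sketch of how a query and the subsequent download might fit together, the commands below search for one mOTU identifier, request two annotation types, and pass the resulting table to motus download. The output names genomes.tsv and genomes_out are assumptions for this example, and the commands are printed rather than executed.

```shell
# mOTU identifier to search for (the one used as an example in this manual).
query="mOTUv4.0_001734"

# Hypothetical output names.
table="genomes.tsv"
outdir="genomes_out"

# Step 1: list matching genomes and report KEGG and taxonomy annotations.
genomes_cmd="motus genomes -i ${query} -o ${table} -d KEGG TAXONOMY"
# Step 2: download sequences for the listed genomes, representatives only (-r).
download_cmd="motus download -i ${table} -o ${outdir} -r"

# Print both commands in execution order for review.
printf '%s\n' "$genomes_cmd" "$download_cmd"
```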
motus download#
Downloads sequences for indicated genomes from the mOTUs genome database.
command line interface
Summary:
The download command downloads listed genome files from mOTUs-db.
Usage:
motus download -i FILE -o PATH [options]
motus download -i STR [STR ...] -o PATH [options]
Input options:
-i, --input-genomes FILE/STR
Can be either a list of genome identifiers separated by spaces or a text file
listing the identifiers of genomes for download. One line per genome. The output of
the motus genomes command can be used as input for this command [required]
Output options:
-o, --output-folder PATH
Path to output folder where the downloaded sequences will be saved [required]
-r, --representatives
Download only sequences from representative genomes.
Required arguments#
Input -i: a list of genome identifiers separated by space. Alternatively, a text file listing genome identifiers,
with one genome per line. The file generated by motus genomes can be used as input.
Output -o: path to folder that will store downloaded sequences.
Option overview#
| Option | Description |
|---|---|
| -i, --input-genomes | Input - List of genomes: A list of genome identifiers specifying which sequences to download. |
| -o, --output-folder | Output path: Output directory in which the downloaded sequences will be saved. |
| -r, --representatives | Representatives only: Only download sequences from representative genomes. |
Option description#
List of genome identifiers#
| Option | Description |
|---|---|
| -i, --input-genomes | Input - List of genomes: A list of genome identifiers specifying which sequences to download. |
The download command supports two different types of input files. You can provide a simple list of genome names or a detailed table from a previous analysis.
Option 1: Simple Genome List
This is a basic text file where you provide one genome name per line. Use this format if you already have a specific list of genomes you want to download.
ELLE19-1_SAMN09288280_MAG_00000004
ELLE19-1_SAMN09288282_MAG_00000006
ELLE19-1_SAMN09288284_MAG_00000011
Option 2: mOTUs Genome Table
Alternatively, you can provide the output file generated by the motus genomes command. This format is a tab-separated table that includes a header line followed by the genome names and their associated query data.
GENOME QUERY
ELLE19-1_SAMN09288280_MAG_00000004 mOTUv4.0_001734
ELLE19-1_SAMN09288282_MAG_00000006 mOTUv4.0_001734
ELLE19-1_SAMN09288284_MAG_00000011 mOTUv4.0_001734
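The two input formats are closely related: the simple list is the first column of the genome table without its header. The download command accepts the table directly, so the conversion below is only needed when the list is to be filtered or edited by hand. The sketch recreates the example table in a scratch directory, and the file names are assumptions.

```shell
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
table="$workdir/genomes.tsv"
ids="$workdir/genome_ids.txt"

# Recreate the example table shown above (tab-separated, with a header line).
printf 'GENOME\tQUERY\n' > "$table"
printf '%s\tmOTUv4.0_001734\n' \
    ELLE19-1_SAMN09288280_MAG_00000004 \
    ELLE19-1_SAMN09288282_MAG_00000006 \
    ELLE19-1_SAMN09288284_MAG_00000011 >> "$table"

# Drop the header line and keep only the first column (genome identifiers).
tail -n +2 "$table" | cut -f 1 > "$ids"

# Show the resulting simple list and the matching download command.
cat "$ids"
echo "motus download -i $ids -o genomes_out"
```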