mOTUs Database
The mOTUs-db is an open-access collection of prokaryotic genomes designed to help researchers map the vast and often hidden diversity of microbial life. By combining millions of de novo reconstructed genomes with established reference databases, we aim to provide a clearer window into the “biological dark matter” found in underexplored environments.
A Community-Focused Genome Collection
This resource is built upon a collection of 3.90 million prokaryotic genomes, curated from two main sources:
2.96 Million Metagenome-Assembled Genomes (MAGs): These were recovered from over 120,000 metagenomes spanning more than 50 different environments, ranging from soil and deep-sea vents to various non-human hosts. We used a standardized workflow to ensure consistent data quality across these diverse samples.
919,090 Reference Genomes: These are integrated from existing public repositories to provide a reliable taxonomic foundation for well-characterized species.
Features and Data Accessibility
Expanding Known Diversity: The database represents 124,295 species-level taxonomic units (mOTUs). Our analysis suggests that over 50% of these groups are currently unique to mOTUs-db and are not yet represented in other major repositories like the Genome Taxonomy Database (GTDB).
Transparent Quality Metrics: Each genome in the database is accompanied by completeness and contamination scores, allowing users to filter and select the data that best fits their specific research needs.
Ecological Context: Where possible, genomes are linked to the environment they were sampled from, helping researchers explore how different microbes relate to their habitats.
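As a minimal sketch of the quality filtering described above, the following Python snippet keeps only genomes that pass both a completeness and a contamination cutoff. The column names (`genome_id`, `completeness`, `contamination`), the sample rows, and the 90% / 5% thresholds are assumptions made for illustration; they are not the actual mOTUs-db schema or recommended cutoffs.

```python
import csv
import io

# Hypothetical metadata excerpt; the real mOTUs-db column names may differ.
SAMPLE_TSV = (
    "genome_id\tcompleteness\tcontamination\n"
    "genome_A\t98.2\t0.5\n"
    "genome_B\t71.0\t3.1\n"
    "genome_C\t95.4\t7.8\n"
)


def filter_by_quality(tsv_text, min_completeness=90.0, max_contamination=5.0):
    """Return the rows whose scores pass both (assumed) thresholds."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row
        for row in reader
        if float(row["completeness"]) >= min_completeness
        and float(row["contamination"]) <= max_contamination
    ]


high_quality = filter_by_quality(SAMPLE_TSV)
print([row["genome_id"] for row in high_quality])
```

Loosening or tightening the two keyword arguments is how a user would adapt the selection to a specific analysis, for example accepting less complete genomes when studying poorly sampled environments.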
Exploring the Data
The mOTUs-db online interface is intended to be a user-friendly starting point for exploring this data, offering:
Simple Search & Filter: Navigate through the 3.9 million genomes by taxonomy, mOTU identifier, or basic quality metrics.
Access to Genomic Sequences: Link taxonomic identifiers directly to genomic sequences and their functional annotations.
Data Retrieval: A straightforward portal for downloading specific genome sets or environmental collections for further study.
Abundances: Access to the taxonomic profiles of 117k metagenomic samples.
Functional Annotation: Genes from all genomes were annotated with KEGG, Pfam, and eggNOG.
Gene Catalogs: Genes from all genomes were clustered at several thresholds, in both protein and nucleotide space.
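To illustrate the kind of lookup the search features above support, here is a small Python sketch that selects genome records by mOTU identifier or by a taxon name in the lineage. The record layout, the `mOTU_000001`-style identifiers, and the `search` helper are hypothetical and only stand in for whatever the online interface or a downloaded metadata table actually provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRecord:
    # All field names are assumptions, not the mOTUs-db schema.
    genome_id: str
    motu_id: str
    taxonomy: str  # semicolon-separated lineage


RECORDS = [
    GenomeRecord("genome_A", "mOTU_000001", "Bacteria;Bacteroidota;Bacteroidia"),
    GenomeRecord("genome_B", "mOTU_000001", "Bacteria;Bacteroidota;Bacteroidia"),
    GenomeRecord("genome_C", "mOTU_000002", "Archaea;Thermoproteota;Nitrososphaeria"),
]


def search(records, motu_id=None, taxon=None):
    """Return records matching the given mOTU identifier and/or taxon name."""
    hits = []
    for rec in records:
        if motu_id is not None and rec.motu_id != motu_id:
            continue
        # Match whole lineage ranks so a taxon name never matches a substring.
        if taxon is not None and taxon not in rec.taxonomy.split(";"):
            continue
        hits.append(rec)
    return hits


print([r.genome_id for r in search(RECORDS, motu_id="mOTU_000001")])
print([r.genome_id for r in search(RECORDS, taxon="Archaea")])
```

Combining both arguments narrows the result to records that satisfy each condition, which mirrors stacking filters in a search form.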
mOTUs is part of SIB's portfolio of open tools and databases.
mOTUs is part of the ELIXIR-CH Service Delivery Plan.