Global Microbial Gene Catalog (GMGC)
Version: | 1.0 |
---|---|
Citation: | Coelho, L.P., et al. Towards the biogeography of prokaryotic genes. Nature 601, 252–256 (2022). |
Contacts: | and |
For support with tools and resources: | https://groups.google.com/forum/#!forum/gmgc-users |
Note that non-redundant sub-catalogs are not built independently. Instead, all genes from all habitats are first clustered together, and a unigene (representative gene) is chosen for each cluster. Each unigene is assigned to every habitat present in the cluster it represents, so a single unigene can be assigned to multiple habitats. Because our clustering is non-exclusive, a gene can also belong to multiple clusters (this is particularly likely if the gene is a short fragment and we consider only 100% non-redundancy), in which case all the corresponding unigenes are considered its representatives. For some habitats, this results in more non-redundant genes being assigned to the habitat than were originally assembled from it.
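The counting effect described above can be illustrated with a minimal sketch. The gene names, habitats, and cluster memberships are made up for illustration and are not taken from the GMGC data, and the helper functions are hypothetical rather than part of any GMGC tool. The toy case is a short soil fragment that matches two longer gut genes at 100% identity and therefore falls into both of their clusters.

```python
from collections import Counter

# Hypothetical toy data: gene id -> habitat in which the gene was assembled.
gene_habitat = {
    "gut_long_A": "gut",
    "gut_long_B": "gut",
    "soil_fragment": "soil",
}

# Hypothetical non-exclusive clusters: unigene id -> member gene ids.
# The short soil fragment is a member of two clusters.
clusters = {
    "gut_long_A": {"gut_long_A", "soil_fragment"},
    "gut_long_B": {"gut_long_B", "soil_fragment"},
}


def assign_habitats(clusters, gene_habitat):
    """Map each unigene to the set of habitats found among its cluster members."""
    return {
        unigene: {gene_habitat[gene] for gene in members}
        for unigene, members in clusters.items()
    }


def assembled_counts(gene_habitat):
    """Count the genes originally assembled in each habitat."""
    return Counter(gene_habitat.values())


def nonredundant_counts(unigene_habitats):
    """Count the unigenes assigned to each habitat."""
    counts = Counter()
    for habitats in unigene_habitats.values():
        counts.update(habitats)  # each habitat in the set is counted once
    return counts


unigene_habitats = assign_habitats(clusters, gene_habitat)
assembled = assembled_counts(gene_habitat)
nonredundant = nonredundant_counts(unigene_habitats)

for habitat in sorted(assembled):
    print(habitat, "assembled:", assembled[habitat],
          "non-redundant:", nonredundant[habitat])
```

In this toy case, soil contributes one assembled gene but ends up with two unigenes, because both gut unigenes represent the soil fragment and so both are assigned to the soil habitat.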