Warning: v3 is currently under heavy testing and has not been officially released. For production use, install the stable v2 release (see the v2 branch):

    pip install eggnog-mapper==2.1.15

eggNOG-mapper is a tool for fast functional annotation of novel sequences using precomputed orthologous groups and phylogenies from the eggNOG database. Functional information is transferred exclusively from fine-grained orthologs, yielding higher precision than homology-based approaches (e.g. BLAST) by avoiding annotation transfer from close paralogs.
Common uses include annotation of novel genomes, transcriptomes, and metagenomic gene catalogs.
eggNOG-mapper is also available as a public web server: http://mapper.eggnogdb.org
v3 is a major release that targets the eggNOG v7 database and introduces a completely redesigned annotation engine.
- eggNOG v7 database with integer-encoded orthology, phylogeny-aware speciation events, and ~12M proteins across ~10k taxa. eggNOG v5 databases are no longer supported.
- Curated-only functional donors: only manually curated functional terms (from SwissProt and equivalent curated sources) are used as annotation donors. This stops the propagation of misannotations inherited from automated pipelines. Despite the stricter source requirements, v3 achieves better annotation coverage than v2.
- Per-seed taxonomic ceiling replaces the old `--tax_scope` predefined scope lists. Each query seed gets its own `ev_lca`-based ceiling, automatically narrowed to the most informative phylogenetic level (`--tax_scope auto`, default). Fixed clades (`Metazoa`, `33208`, etc.) are still accepted.
- Cascade annotation engine: for each functional source (GO, KEGG, Pfam, EC, ...), donors are walked from closest and best-typed first, with the seed's own curated annotation as the strongest tier-0 donor.
- No bundled binaries: DIAMOND, HMMER, MMseqs2, and Prodigal must be installed externally (see Requirements below). The wheel shrinks from ~150 MB to ~5 MB and cross-platform installs (macOS, Windows) now work.
- Compressed input: gzip and bzip2 FASTA inputs are autodetected by magic bytes.
- Parallel annotation: `--cpu N` parallelises both search and annotation.
- Cython-accelerated inner loops: `_codec` and `_collect_inner` extensions give ~2–3× speedup on the annotation phase.
- `--resume`: safely resumes an interrupted run, reusing the existing hits file.
- Apptainer/Singularity image: a self-contained HPC image is provided via `apptainer/build.sh`.
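
To illustrate what autodetection by magic bytes means for compressed input, here is a minimal Python sketch. It is not eggNOG-mapper's own code: the function names `sniff_compression` and `open_fasta` are hypothetical, and only the gzip (`1f 8b`) and bzip2 (`BZh`) signatures are standard facts about those formats.

```python
import bz2
import gzip

# Leading bytes that identify each compressed format.
GZIP_MAGIC = b"\x1f\x8b"
BZIP2_MAGIC = b"BZh"


def sniff_compression(path):
    """Return 'gzip', 'bzip2', or 'plain' based on the file's first bytes.

    Hypothetical helper for illustration; the file extension is ignored.
    """
    with open(path, "rb") as handle:
        head = handle.read(3)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(BZIP2_MAGIC):
        return "bzip2"
    return "plain"


def open_fasta(path):
    """Open a FASTA file in text mode, decompressing transparently if needed."""
    kind = sniff_compression(path)
    if kind == "gzip":
        return gzip.open(path, "rt")
    if kind == "bzip2":
        return bz2.open(path, "rt")
    return open(path, "rt")
```

Because the check reads the file content rather than its name, a gzip-compressed file called `proteins.fa` is still opened correctly.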
- Python ≥ 3.9
- At least one search backend:
| Tool | Install |
|---|---|
| DIAMOND | conda install -c bioconda diamond |
| HMMER | conda install -c bioconda hmmer |
| MMseqs2 | conda install -c bioconda mmseqs2 |
| Prodigal | conda install -c bioconda prodigal (gene prediction only) |
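
Since the binaries are no longer bundled, it can help to confirm that at least one search backend is on `PATH` before starting a run. The following is a rough sketch, not part of eggNOG-mapper. The executable names (`diamond`, `hmmsearch`, `mmseqs`) are assumptions based on how these tools are usually packaged, and `find_tools` and `available` are hypothetical helpers.

```python
import shutil

# Assumed executable names for each search backend (may differ on your system).
SEARCH_BACKENDS = {
    "DIAMOND": "diamond",
    "HMMER": "hmmsearch",
    "MMseqs2": "mmseqs",
}


def find_tools(executables, which=shutil.which):
    """Map each tool name to the resolved executable path, or None if missing."""
    return {name: which(exe) for name, exe in executables.items()}


def available(found):
    """Return the sorted names of the tools that were found on PATH."""
    return sorted(name for name, path in found.items() if path)
```

Calling `available(find_tools(SEARCH_BACKENDS))` returns the backends that can be used. An empty list means none of the assumed executables were found.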

Install from PyPI:

    pip install eggnog-mapper

Or from source:

    git clone https://github.com/eggnogdb/eggnog-mapper.git
    cd eggnog-mapper
    pip install .

Download the eggNOG data:

    download_eggnog_data.py --data_dir /path/to/eggnog-data

Run the annotation:

    # Protein sequences against eggNOG v7 using DIAMOND
    emapper.py -m diamond -i proteins.fa --itype proteins \
        --data_dir /path/to/eggnog-data \
        -o my_annotation --output_dir results/ --cpu 20

    # Two-step: search first, annotate later
    emapper.py -m diamond -i proteins.fa --itype proteins \
        --data_dir /path/to/eggnog-data \
        -o my_annotation --output_dir results/ --no_annot --cpu 20
    emapper.py -m no_search --annotate_hits_table results/my_annotation.emapper.seed_orthologs \
        --data_dir /path/to/eggnog-data \
        -o my_annotation --output_dir results/

See the wiki for further documentation: https://github.com/eggnogdb/eggnog-mapper/wiki
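
The two-step workflow (search first, annotate later) can also be driven from a script. The sketch below only assembles the same command lines shown in this README and runs them with `subprocess`. The helper names are hypothetical, and it assumes that `emapper.py` is on `PATH` and that the hits file is written as `<output_dir>/<prefix>.emapper.seed_orthologs`, as in the example commands.

```python
import os
import subprocess


def search_command(fasta, data_dir, prefix, out_dir, cpu=1):
    """Build the search-only command (step 1), using flags from this README."""
    return [
        "emapper.py", "-m", "diamond", "-i", fasta, "--itype", "proteins",
        "--data_dir", data_dir, "-o", prefix, "--output_dir", out_dir,
        "--no_annot", "--cpu", str(cpu),
    ]


def annotate_command(data_dir, prefix, out_dir):
    """Build the annotation-only command (step 2) from the step 1 hits table."""
    # Assumed hits file location, following the example in this README.
    hits = os.path.join(out_dir, prefix + ".emapper.seed_orthologs")
    return [
        "emapper.py", "-m", "no_search", "--annotate_hits_table", hits,
        "--data_dir", data_dir, "-o", prefix, "--output_dir", out_dir,
    ]


def run_two_step(fasta, data_dir, prefix, out_dir, cpu=1):
    """Run search, then annotation; raises CalledProcessError on failure."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(search_command(fasta, data_dir, prefix, out_dir, cpu), check=True)
    subprocess.run(annotate_command(data_dir, prefix, out_dir), check=True)
```

For example, `run_two_step("proteins.fa", "/path/to/eggnog-data", "my_annotation", "results/", cpu=20)` mirrors the two-step shell commands. With `check=True`, the annotation step is skipped if the search step fails.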
If you use eggNOG-mapper, please cite:
[1] eggNOG-mapper v2: functional annotation, orthology assignments, and domain
prediction at the metagenomic scale. Carlos P. Cantalapiedra,
Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021.
Molecular Biology and Evolution, msab293, https://doi.org/10.1093/molbev/msab293
[2] eggNOG v7: phylogeny-based orthology predictions and functional annotations.
Ana Hernández-Plaza, Ziqi Deng, Fabian Robledo-Yagüe, Damian Szklarczyk,
Christian von Mering, Peer Bork, Jaime Huerta-Cepas. Nucleic Acids Research,
Volume 54, Issue D1, 6 January 2026, Pages D402-D408.
https://doi.org/10.1093/nar/gkaf1249
Please also cite the search tool used:
[DIAMOND] Sensitive protein alignments at tree-of-life scale using DIAMOND.
Buchfink B, Reuter K, Drost HG. 2021.
Nature Methods 18, 366–368. https://doi.org/10.1038/s41592-021-01101-x
[HMMER] Accelerated Profile HMM Searches.
Eddy SR. 2011. PLoS Comput. Biol. 7:e1002195.
[MMSEQS2] MMseqs2 enables sensitive protein sequence searching for the analysis
of massive data sets. Steinegger M & Söding J. 2017.
Nat. Biotech. 35, 1026–1028. https://doi.org/10.1038/nbt.3988
[PRODIGAL] Prodigal: prokaryotic gene recognition and translation initiation
site identification. Hyatt et al. 2010.
BMC Bioinformatics 11, 119. https://doi.org/10.1186/1471-2105-11-119
If you are working with eggNOG v5 databases, use the v2 branch or install the last v2 release from PyPI:

    pip install eggnog-mapper==2.1.15

v2 and v3 databases are not interchangeable. v3 only works with eggNOG v7.