GenomeDepot

About

GenomeDepot is an open-source web-based platform for annotation, management, and comparative analysis of microbial genomic data. With GenomeDepot, you can create web-sites for your own genome collections. These web-sites have tools for interactive genome browsing, BLAST search, annotation search, comparative genomic neighborhood visualization, and sequence download.

GenomeDepot Copyright (c) 2025, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy). All rights reserved.

If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Intellectual Property Office at IPO@lbl.gov.

NOTICE. This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit others to do so.

License Agreement

GPL v3 License

GenomeDepot Copyright (c) 2025, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy). All rights reserved.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Prerequisites

Linux-based OS with installed Python 3.8+ and conda (miniconda, anaconda etc.).
MySQL server
Apache2 web-server with mod_wsgi
Muscle
NCBI-BLAST (blastp and megablast required)
Perl
zlib1g-dev package
curl
pkg-config

In Ubuntu-based distributions, you can install the dependencies from the APT package repository: sudo apt install apache2 mysql-server muscle ncbi-blast+-legacy build-essential zlib1g-dev libexpat1-dev curl pkg-config
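Before running the installer, it can help to confirm that the prerequisite tools are visible on PATH. The following is a minimal Python sketch, not part of GenomeDepot. The executable names in REQUIRED_TOOLS are assumptions derived from the prerequisites list and may differ on your system (for example, apache2 is often installed in /usr/sbin, which may not be on a regular user's PATH).

```python
import shutil

# Executable names are assumptions based on the prerequisites list;
# adjust them if your distribution installs the tools under other names.
REQUIRED_TOOLS = [
    "python3", "conda", "mysql", "apache2", "muscle",
    "blastp", "megablast", "perl", "curl", "pkg-config",
]


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests. The check says nothing about tool versions or about the zlib1g-dev development package.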

Installation

Create a directory where GenomeDepot and external tools will be installed (for example, ~/genomedepot). Create an "apps" subdirectory in it. Do this only for the first GenomeDepot installation.

cd ~

mkdir genomedepot

cd genomedepot

mkdir apps

cd apps

Create a directory for the new GenomeDepot installation in genomedepot/apps (for example, mygenomes) and clone the repository into it.

mkdir mygenomes

cd mygenomes

git clone https://github.com/aekazakov/genome-depot

Install external tools and create virtual environment.

cd genome-depot

bash install.sh

Create a MySQL user (for example, gduser) or use an existing MySQL account.
Create a database.

Log into MySQL as root: mysql -u root -p

CREATE DATABASE genomedepot CHARACTER SET utf8;

GRANT ALL PRIVILEGES ON genomedepot.* TO 'gduser'@'localhost';

quit

Open genomedepot/apps/mygenomes/genome-depot/genomebrowser/secrets.json in a text editor. Enter the secret key, hostname, database name (like genomedepot), database username (like gduser), database password and the URL of the static files directory.
Open genomedepot/apps/mygenomes/genome-depot/genomebrowser/static/jbrowse/jbrowse.conf in a text editor. Find the documentDomain parameter, uncomment it and enter the hostname.
Open genomedepot/apps/mygenomes/genome-depot/genomebrowser/configs.txt in a text editor. Check:

paths to JBrowse utilities: prepare-refseqs.pl, flatfile-to-json.pl, generate-names.pl,

path to the strain metadata file.
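After editing secrets.json, a quick check that the file is still valid JSON and that no setting was left blank can catch mistakes before the Django commands run. The sketch below is an illustration only: it assumes secrets.json is a flat JSON object, and it deliberately does not assume any particular key names, since those are defined by the file shipped with GenomeDepot.

```python
import json


def find_empty_settings(path):
    """Return the top-level keys in a JSON file whose values are empty."""
    with open(path) as handle:
        settings = json.load(handle)  # raises an error if the JSON is malformed
    if not isinstance(settings, dict):
        raise ValueError("expected a JSON object at the top level")
    # A value counts as empty if it is an empty string or null.
    return sorted(key for key, value in settings.items() if value in ("", None))
```

For example, find_empty_settings("secrets.json") returns an empty list when every top-level setting has a value.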

Activate the virtual environment genomedepot-venv, change directory to genomedepot/apps/mygenomes/genome-depot/genomebrowser, and run the Django configuration commands:

source genomedepot/genomedepot-venv/bin/activate

cd genomedepot/apps/mygenomes/genome-depot/genomebrowser

python manage.py collectstatic

python manage.py makemigrations

python manage.py migrate

python manage.py createsuperuser (Enter your desired username, email and password).

python manage.py import_config -i configs.txt

python manage.py createcachetable

python manage.py update_taxonomy

Run python manage.py runserver 127.0.0.1:8000 and open http://127.0.0.1:8000/admin in the browser. You should be able to log in with the superuser name and password you entered for the createsuperuser command.

If the test server cannot find static files, check that the www-data user can read the genomedepot/static/my_genomes directory (granting rw permissions to the www-data group would work).

Configure the web server for GenomeDepot. Open the apache2 site configuration file (it may be default-ssl.conf) in a text editor and add the following (with correct paths):

```
WSGIDaemonProcess genomedepotpy python-home=/path/to/genomedepot/genomedepot-venv python-path=/path/to/genomedepot/apps/mygenomes/genome-depot/genomebrowser

WSGIScriptAlias /mygenomes /path/to/genomedepot/apps/mygenomes/genome-depot/genomebrowser/genomebrowser/wsgi.py process-group=genomedepotpy application-group=%{GLOBAL}

<Directory /path/to/genomedepot/apps/mygenomes/genome-depot/genomebrowser/genomebrowser/>
    <Files wsgi.py>
        Require all granted
    </Files>
</Directory>

<Directory /path/to/genomedepot/static/>
    Options -Indexes +FollowSymLinks
    <IfModule mod_headers.c>
        Header set Access-Control-Allow-Origin http://127.0.0.1:8000
    </IfModule>
    Require all granted
</Directory>

Alias /genomedepotstatic /path/to/genomedepot/static/
```

If the embedded genome viewer is not displayed properly, you may have to add the following directive to the web server configuration: Header always set X-Frame-Options "SAMEORIGIN"

Restart apache2:

sudo systemctl restart apache2

You should now be able to open https://your.domain.name/mygenomes in a web browser.

Note: If POEM fails to predict operons and the run_poem.sh script throws the error "AttributeError: module 'tensorflow' has no attribute 'get_default_graph'", the installed versions of keras and tensorflow are not compatible. Activate the genomedepot-poem conda environment and run "pip install tensorflow==1.13.1".

Genome import from the command line

Download genomes in GenBank format (files may be gzipped). Make a tab-separated file (for example, genomes.txt) with six columns:

path to GenBank file
genome ID (no spaces)
strain name (no spaces)
sample ID (no spaces)
URL (link to NCBI genome assembly etc.)
External ID (something like "NCBI:GCF_000006945.2")
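The six-column file can be written by hand, but generating it with a script avoids stray spaces and a wrong column count. The following Python sketch is an illustration, not part of GenomeDepot. It assumes the file has no header row, and the function names and example values (path, IDs, URL) are hypothetical; only the External ID format is taken from the list above.

```python
import csv

# Zero-based positions of the fields that must not contain spaces:
# genome ID, strain name, sample ID.
NO_SPACE_FIELDS = (1, 2, 3)


def check_row(row):
    """Return a list of problems found in one six-column row."""
    if len(row) != 6:
        return [f"expected 6 columns, got {len(row)}"]
    problems = []
    for idx in NO_SPACE_FIELDS:
        if " " in row[idx]:
            problems.append(f"column {idx + 1} contains a space: {row[idx]!r}")
    return problems


def write_genome_list(rows, path):
    """Validate every row, then write them as a tab-separated file."""
    for row in rows:
        problems = check_row(row)
        if problems:
            raise ValueError("; ".join(problems))
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical example entry; replace with your own genomes.
    example = [
        "/data/genomes/example.gbff.gz",
        "example_genome",
        "example_strain",
        "example_sample",
        "https://www.ncbi.nlm.nih.gov/",
        "NCBI:GCF_000006945.2",
    ]
    write_genome_list([example], "genomes.txt")
```

Validation runs before the file is opened, so a bad row never leaves a partially written genomes.txt behind.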

Import genomes into database. If genome files have been uploaded to the server, they can be imported from the command line:

activate virtual environment (source /path/to/virtualenv/bin/activate)

cd genomedepot/apps/mygenomes/genome-depot/genomebrowser

python manage.py import_genomes -i genomes.txt

Depending on the number of genomes, this command may run from several hours to several days. After that, you should see the genomes on the web site.

During genome import, the GenomeDepot annotation pipeline runs eggnog-mapper to generate EggNOG, KEGG, GO, EC, TC, CAZy and COG mappings for all proteins annotated in the input file. The pipeline also predicts operons with POEM, maps Pfam domains with hmmsearch and generates functional annotations with several annotation tools.

Genome import from admin panel

Alternatively, genomes can be imported from the admin panel.

Start qcluster process:

activate virtual environment (source /path/to/virtualenv/bin/activate)

cd genomedepot/apps/mygenomes/genome-depot/genomebrowser

python manage.py qcluster

Log into siteURL/admin with the superuser login and password you created during installation.

Click "Import genomes" button and follow the instructions on the page.

A tab-separated text file is always required.

A zip archive with genomes may be provided if the genomes haven't been uploaded to the server.

For genome download from the NCBI FTP site, an e-mail address must be provided. For security reasons, this e-mail address is never stored in the database.

Other commands

Note: activate virtual environment (source /path/to/virtualenv/bin/activate) before running any command, then change directory to genomedepot/apps/mygenomes/genome-depot/genomebrowser

Command to re-create Pfam and TIGRFAM domain mappings:

python manage.py update_domain_mappings -i genomes.txt

Command to generate functional annotations (if annotation pipeline failed or you added a new tool):

python manage.py update_annotations -i genomes.txt

Command to export genomes with GenomeDepot annotations in GenBank format:

python manage.py export_genomes -g -o

Command to import regulon data:

python manage.py add_regulons -i

The input file for this command is a tab-separated text file with nine columns:

regulon name
genome name
locus tag of regulator
locus tag of target gene
contig id
site start
site end
site strand (1 or -1)
site sequence
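A malformed regulon file is easier to fix before import than after a failed run. The Python sketch below checks the nine-column layout described above. It is an illustration only and does not reproduce what add_regulons itself validates. It assumes the file has no header row and that site start and site end are non-negative integers; the function names are hypothetical.

```python
import csv


def check_regulon_row(row):
    """Return a list of problems found in one nine-column row."""
    if len(row) != 9:
        return [f"expected 9 columns, got {len(row)}"]
    problems = []
    start, end, strand = row[5], row[6], row[7]
    if not (start.isdigit() and end.isdigit()):
        problems.append("site start and site end must be integers")
    if strand not in ("1", "-1"):
        problems.append(f"site strand must be 1 or -1, got {strand!r}")
    return problems


def check_regulon_file(path):
    """Return {row_number: [problems]} for every row that fails the checks."""
    errors = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for number, row in enumerate(reader, start=1):
            if not row:  # skip blank lines
                continue
            problems = check_regulon_row(row)
            if problems:
                errors[number] = problems
    return errors
```

An empty dictionary from check_regulon_file means every non-blank row has nine columns, integer coordinates and a strand of 1 or -1.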

Import test dataset of RefSeq genomes

You can test the GenomeDepot installation by importing a dataset of 25 RefSeq genome assemblies.

In the GenomeDepot administration panel, go to the Genome Import page, choose option 3 (import from NCBI). Click "Browse" and choose the testdata/demo_25genomes_import.txt file. Enter your email and click "Start import".

Image Credits

All images courtesy of Unsplash (https://unsplash.com).

Jonathan Pie
https://unsplash.com/@r3dmax?photo=3l3RwQdHRHg
https://unsplash.com/@r3dmax?photo=1hpE3fROU0I
https://unsplash.com/@r3dmax?photo=3N5ccOE3wGg
https://unsplash.com/@r3dmax?photo=-3h8OXvt4-0
https://unsplash.com/@r3dmax?photo=7FfG8zcPcXU
https://unsplash.com/@r3dmax?photo=EvKBHBGgaUo
https://unsplash.com/@r3dmax?photo=8I49k45G-3A
https://unsplash.com/@r3dmax?photo=iokiwAq05UU
