
404 error while downloading GTDB r226 #145

@gracegyho

I ran the download command to get the latest GTDB files, but it ended with an HTTP 404 error.

Here is the output and error message:

$ CAT_pack download --db GTDB -o CAT/ --cleanup

[2025-07-02 13:42:01] CAT_pack will download files from GTDB v226 ().
[2025-07-02 13:42:01] Downloading VERSION.txt.
[2025-07-02 13:42:02] Downloading ar53_taxonomy.tsv.gz.
[2025-07-02 13:42:04] Downloading bac120_taxonomy.tsv.gz.
[2025-07-02 13:45:07] Downloading MD5SUM.txt.
[2025-07-02 13:45:08] Failed downloading file: https://data.gtdb.ecogenomic.org/releases/latest/MD5SUM.txt.
Traceback (most recent call last):
  File "/home/gho/miniconda3/envs/cat/bin/CAT_pack", line 101, in <module>
    main()
    ~~~~^^
  File "/home/gho/miniconda3/envs/cat/bin/CAT_pack", line 77, in main
    download.run()
    ~~~~~~~~~~~~^^
  File "/home/gho/miniconda3/envs/cat/share/cat-6.0.1-1/CAT_pack/download.py", line 764, in run
    process_gtdb(args.output_dir, args.log_file, args.quiet, args.cleanup)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/share/cat-6.0.1-1/CAT_pack/download.py", line 579, in process_gtdb
    multi_download(gtdb_urls, output_dir, log_file, quiet, prefix=None)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/share/cat-6.0.1-1/CAT_pack/download.py", line 100, in multi_download
    download_singleton(url, output_path, log_file, quiet)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/share/cat-6.0.1-1/CAT_pack/download.py", line 67, in download_singleton
    urllib.request.urlretrieve(target_url, local_path)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 214, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
                            ~~~~~~~^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 189, in urlopen
    return opener.open(url, data, timeout)
           ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 495, in open
    response = meth(req, response)
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 604, in http_response
    response = self.parent.error(
        'http', request, response, code, msg, hdrs)
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 533, in error
    return self._call_chain(*args)
           ~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 466, in _call_chain
    result = func(*args)
  File "/home/gho/miniconda3/envs/cat/lib/python3.13/urllib/request.py", line 613, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Alternatively, is there any way to run the file processing commands on a copy of the GTDB database that has already been downloaded to my hard drive?
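
In case it helps with reproducing the 404 outside of CAT_pack, here is a minimal sketch of how the failing URL from the log could be checked on its own. The helper name `url_status` and its `opener` parameter are my own, hypothetical names and are not part of CAT_pack; it only uses Python's standard `urllib`, and it assumes the server answers HEAD requests, which I have not verified.

```python
import urllib.error
import urllib.request


def url_status(url, timeout=30, opener=urllib.request.urlopen):
    """Return the HTTP status code the server gives for a HEAD request.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's own urlopen.
    """
    # HEAD asks only for the headers, so no file content is downloaded.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the status
        # code (e.g. 404) is available on the exception itself.
        return err.code
```

Calling `url_status("https://data.gtdb.ecogenomic.org/releases/latest/MD5SUM.txt")` should, judging by the traceback above, report 404 if the file is really missing at that location rather than the problem being on my side.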
