@AstrobioMike AstrobioMike commented Nov 3, 2022

GTDB recently changed from 122 archaeal genes to 53. These numbers don't change regularly, so this adjustment won't be needed with every new GTDB release.

Also made a change regarding MD5 checking:

  • GTDB uses base filenames in the "latest" section (https://data.gtdb.ecogenomic.org/releases/latest/), e.g. "bac120_taxonomy.tsv.gz", but the names in the MD5 file from that same location include the version, e.g. "bac120_taxonomy_r207.tsv.gz". This mismatch was causing a key error in the MD5 check of the download.py module, so I added a workaround to address it.
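
For readers following along, here is a minimal sketch of one way such a workaround could look. It is not the code from this PR or from download.py; the helper names `strip_release_tag` and `parse_md5_listing` are hypothetical, and it assumes the MD5 file uses the usual `md5sum` layout (digest, whitespace, filename) with a release tag of the form `_r<digits>` directly before the first extension.

```python
import re

# Assumption: the release tag looks like "_r207" and sits right before the
# first "." of the filename, as in "bac120_taxonomy_r207.tsv.gz".
_RELEASE_TAG = re.compile(r"_r\d+(?=\.)")


def strip_release_tag(filename):
    """Return the filename with a trailing GTDB release tag removed, if any."""
    return _RELEASE_TAG.sub("", filename, count=1)


def parse_md5_listing(text):
    """Map filenames to MD5 digests from md5sum-style text.

    Each file is stored under both its versioned name and its base name,
    so a lookup by either form succeeds instead of raising KeyError.
    """
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, path = parts
        # Drop md5sum's binary marker ("*") and any leading directories.
        name = path.lstrip("*").rsplit("/", 1)[-1]
        checksums[name] = digest
        checksums[strip_release_tag(name)] = digest
    return checksums
```

With a lookup table built this way, the downloaded file "bac120_taxonomy.tsv.gz" can be checked against the digest listed for "bac120_taxonomy_r207.tsv.gz" without hard-coding the release number.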

@AstrobioMike AstrobioMike marked this pull request as draft November 4, 2022 03:58
@AstrobioMike AstrobioMike marked this pull request as ready for review November 4, 2022 18:49
…ifies the message output of this to match those changes
@AstrobioMike (Author)

pinging @bastiaanvonmeijenfeldt as it would be helpful to have this integrated for workflows that use conda installs :)

@kdm9 kdm9 mentioned this pull request Feb 8, 2023