Reference sets configuration for microbes #1004
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##   feature/mvp-rapid   #1004   +/-   ##
==================================================
  Coverage      59.07%   59.07%
==================================================
  Files            213      213
  Lines          22671    22671
  Branches        3527     3527
==================================================
  Hits           13394    13394
  Misses          8150     8150
  Partials        1127     1127
Adding separation space between different collection blocks Co-authored-by: twalsh-ebi <twalsh@ebi.ac.uk>
Adds missing closing characters Co-authored-by: twalsh-ebi <twalsh@ebi.ac.uk>
Deletes repeated closing collection block Co-authored-by: twalsh-ebi <twalsh@ebi.ac.uk>
Thank you very much for flagging these typos, Thomas. If everything else is good and Travis passes all checks, I'm happy to go ahead with the merge.
removed diff comments
According to the RNG rules, a collection must contain either a <taxonomic_group .../> or a <genome .../> element. Since all genomes are already defined in the base_collection, I added dummy taxonomic_group elements.
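The rule described in this commit message can be sketched as a small check. This is only an illustration under stated assumptions: the `<collections>` and `<collection>` wrapper elements, the collection names, and the helper `collections_missing_members` are hypothetical, and the check encodes the rule exactly as worded here rather than the actual RNG schema, which remains the authority and may accept other content.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <collections>/<collection> wrappers and the
# collection names are assumptions for illustration, not copied from the PR.
SAMPLE = """
<collections>
  <collection name="sar">
    <base_collection name="shared_protists"/>
    <taxonomic_group taxon_name="SAR"/>
  </collection>
  <collection name="example_without_members">
    <base_collection name="shared_protists"/>
  </collection>
</collections>
"""

# Child elements that satisfy the rule as stated in the commit message.
REQUIRED_CHILDREN = {"taxonomic_group", "genome"}


def collections_missing_members(xml_text):
    """Return the names of collections that have neither a
    taxonomic_group nor a genome child element."""
    root = ET.fromstring(xml_text)
    missing = []
    for collection in root.iter("collection"):
        child_tags = {child.tag for child in collection}
        if not child_tags & REQUIRED_CHILDREN:
            missing.append(collection.get("name"))
    return missing


print(collections_missing_members(SAMPLE))
```

A check like this only mirrors the wording of the message; running the real RNG validation is still the way to confirm whether a given configuration is accepted.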
<base_collection name="shared_protists"/>
<taxonomic_group taxon_name="SAR"/>
Hi @manuelcarbajo. Would it be possible to remove the Protists taxonomic_group elements and replace each of the base_collection elements with a composable_collection, as I've suggested in this case?
- <base_collection name="shared_protists"/>
- <taxonomic_group taxon_name="SAR"/>
+ <composable_collection name="shared_protists"/>
Thanks @twalsh-ebi. If Travis is happy with this configuration, it works for me.
The taxonomic_group elements were originally added to satisfy the RNG validation requirements for the collections definition, so I’m glad to see a cleaner approach using composable_collection instead.
Updated microbial reference genomes for the upcoming release. This update includes Archaea, Bacteria, Fungi, and Protists.
Note: As “Protists” is no longer a formal or searchable taxonomic group in NCBI Taxonomy, a base_collection named "shared_protists" has been introduced. This base collection aggregates all genomes used for comparative analyses across protist lineages.
Nine protist collection names have been defined: sar, amoebozoa, excavata, discoba, opisthokonta, haptophyta, viridiplantae, rhodophyta, and cryptophyceae. These collections correspond to higher-level eukaryotic clades that are explicitly represented and queryable in NCBI Taxonomy, and together they provide comprehensive coverage of protist diversity with genome-scale representation. The groups were selected using two sources: the NCBI Taxonomy Browser (to determine which higher-level eukaryotic clades are explicitly represented and queryable) and Adl SM et al. (2019), "Revisions to the classification, nomenclature, and diversity of eukaryotes," Journal of Eukaryotic Microbiology 66:4–119.