Draft Xenopus laevis (African clawed frog) and Xenopus tropicalis (tropical clawed frog) #1019

Open
brianraymor opened this issue Sep 25, 2024 · 5 comments
Assignees
Labels
experimental (approved schema as experimental), multispecies discovery (Adding new species to CELLxGENE), schema (CELLxGENE Discover dataset schema)

Comments

@brianraymor
Contributor

brianraymor commented Sep 25, 2024

Context

To reduce Q1 scope, Xenopus laevis and Xenopus tropicalis will move to Q2.

Changelog

Appendix A. Changelog

  • Required Ontologies
    • Added Xenopus Anatomy Ontology (XAO) release 2024-09-03 XAO 11.1
  • Required Gene Annotations
    • Added Xenopus laevis Xenopus_laevis_v10.1 (GCA_017654675.1) Ensembl Beta
    • Added Xenopus tropicalis UCB_Xtro_10.0 (GCA_000004195.4) Ensembl 113
  • obs (Cell metadata)
    • Updated the ontology requirements for cell_type_ontology_term_id to include:
      • XAO for Xenopus laevis and Xenopus tropicalis
    • Updated the ontology requirements for development_stage_ontology_term_id to include:
      • XAO for Xenopus laevis and Xenopus tropicalis
    • Updated the ontology requirements for tissue_ontology_term_id to include:
      • XAO for Xenopus laevis and Xenopus tropicalis
  • var and raw.var (Gene Metadata)
    • Updated feature_reference to include:
      • Xenopus laevis
      • Xenopus tropicalis

Design

Required Ontologies

Ontology OBO Prefix Releases OLS
Xenopus Anatomy Ontology XAO 2024-09-03 XAO 11.1 xenopus_anatomy.owl

Editorial Notes

This ontology is under active development. CELLxGENE pins ontology releases in each version of the schema. A specific release of the ontology above must be selected in the future.


Required Gene Annotations

Organism | Source | Required version | Download
NCBITaxon:8355 for Xenopus laevis | ENSEMBL | Xenopus_laevis_v10.1 (GCA_017654675.1) Ensembl Beta | genes.gtf
NCBITaxon:8364 for Xenopus tropicalis | ENSEMBL | UCB_Xtro_10.0 (GCA_000004195.4) Ensembl 113 | Xenopus_tropicalis.UCB_Xtro_10.0.113.gtf

Editorial Notes

There is no available Ensembl reference for Xenopus laevis, except as a Beta release.

BGEE references a suppressed RefSeq version.

The alternatives for selecting a non-Ensembl genome reference are:

  • See the current NCBI reference
  • See Xenbase Genome Downloads


obs (Cell Metadata)

cell_type_ontology_term_id

Key cell_type_ontology_term_id
Annotator Curator MUST annotate.
Value categorical with str categories. This MUST be "unknown" when:
  • no appropriate term can be found (e.g. the cell type is unknown)
  • assay_ontology_term_id is "EFO:0010961" for Visium Spatial Gene Expression,
    uns['spatial']['is_single'] is True,
    and the corresponding value of in_tissue is 0

The following CL terms MUST NOT be used:
For organism_ontology_term_id | Value
"NCBITaxon:8355" for Xenopus laevis or "NCBITaxon:8364" for Xenopus tropicalis | MUST be either a CL term or the most accurate descendant of XAO:0003012 for cell excluding XAO:0004290 for cell part and its descendants
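
As a rough illustration of the rule above, the following sketch checks a cell_type_ontology_term_id for a Xenopus dataset against a child-to-parents map. The map TOY_PARENTS and the function names are hypothetical; only XAO:0003012 and XAO:0004290 come from the rule, and placing cell part under cell in the toy hierarchy is an assumption. A real check would load the pinned XAO release, and it cannot decide whether a term is the "most accurate" one.

```python
# Toy child -> parents map. Only XAO:0003012 (cell) and XAO:0004290 (cell part)
# are taken from the schema text; the XAO:999xxxx IDs are made-up placeholders.
TOY_PARENTS = {
    "XAO:0004290": {"XAO:0003012"},  # assumption: cell part sits under cell
    "XAO:9990001": {"XAO:0003012"},  # made-up cell type
    "XAO:9990002": {"XAO:0004290"},  # made-up kind of cell part
}

CELL = "XAO:0003012"
CELL_PART = "XAO:0004290"


def ancestors(term, parents):
    """Return every strict ancestor of `term` in a child -> parents map."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def is_valid_xenopus_cell_type(term_id, parents):
    """Sketch of the Xenopus rule; does not check the forbidden CL terms."""
    if term_id == "unknown":
        return True
    if term_id.startswith("CL:"):
        return True  # assumes the CL term exists and is permitted
    if not term_id.startswith("XAO:"):
        return False
    lineage = ancestors(term_id, parents)
    if CELL not in lineage:
        return False  # not a descendant of cell (cell itself is excluded too)
    # Reject cell part and anything below it.
    return term_id != CELL_PART and CELL_PART not in lineage


print(is_valid_xenopus_cell_type("XAO:9990001", TOY_PARENTS))  # True
print(is_valid_xenopus_cell_type("XAO:9990002", TOY_PARENTS))  # False
```

The ancestor walk is iterative so that it also works on terms with several parents.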

development_stage_ontology_term_id

Key development_stage_ontology_term_id
Annotator Curator MUST annotate.
Value categorical with str categories. If unavailable, this MUST be "unknown".

For organism_ontology_term_id | Value
"NCBITaxon:8355" for Xenopus laevis or "NCBITaxon:8364" for Xenopus tropicalis | MUST be the most accurate descendant of XAO:1000000 for Xenopus developmental stage excluding XAO:0000437 for death, XAO:10000817 for unspecified stage, and XAO:1000094 for NF stage and its descendants

organism_ontology_term_id

Key organism_ontology_term_id
Annotator Curator MUST annotate.
Value str. One of the following terms MUST be used:

For Organism MUST Use
Xenopus laevis "NCBITaxon:8355"
Xenopus tropicalis "NCBITaxon:8364"

tissue_ontology_term_id

Key tissue_ontology_term_id
Annotator Curator MUST annotate.
Value categorical with str categories. If tissue_type is "cell culture" this MUST follow the requirements for cell_type_ontology_term_id.

If tissue_type is "tissue" or "organoid" then:

For organism_ontology_term_id | Value
"NCBITaxon:8355" for Xenopus laevis or "NCBITaxon:8364" for Xenopus tropicalis | MUST be either an UBERON term or the most accurate descendant of XAO:0000000 for Xenopus anatomical entity excluding XAO:0003003 for unspecified, XAO:0003005 for female organism, XAO:0003006 for male organism, and XAO:0003012 for cell and its descendants
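
The dependence on tissue_type in the rule above could be sketched as below. All names are hypothetical, the toy hierarchy is invented apart from the XAO IDs quoted in the rule, and the sketch reads "and its descendants" as applying only to XAO:0003012. The cell type check is passed in as a callable because the "cell culture" case defers to the cell_type_ontology_term_id requirements.

```python
# IDs quoted in the tissue rule; the XAO:999xxxx entries are made-up placeholders.
ROOT = "XAO:0000000"             # Xenopus anatomical entity
EXCLUDED_TERMS = {"XAO:0003003", "XAO:0003005", "XAO:0003006"}
EXCLUDED_BRANCH = "XAO:0003012"  # cell, excluded together with its descendants

# Toy child -> parents map; direct parentage under ROOT is an assumption.
TOY_PARENTS = {
    "XAO:0003003": {ROOT},
    "XAO:0003005": {ROOT},
    "XAO:0003006": {ROOT},
    "XAO:0003012": {ROOT},
    "XAO:9990010": {ROOT},           # made-up organ term
    "XAO:9990011": {"XAO:0003012"},  # made-up cell term
}


def lineage(term, parents):
    """Return the term itself plus every ancestor reachable via child -> parents."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


def tissue_term_ok(term_id, tissue_type, parents, cell_type_ok):
    """Sketch of the Xenopus tissue rule; `cell_type_ok` is caller-supplied."""
    if tissue_type == "cell culture":
        return cell_type_ok(term_id)
    if tissue_type not in ("tissue", "organoid"):
        return False  # design choice of this sketch: reject other tissue types
    if term_id.startswith("UBERON:"):
        return True   # assumes the UBERON term exists
    if not term_id.startswith("XAO:") or term_id == ROOT or term_id in EXCLUDED_TERMS:
        return False
    line = lineage(term_id, parents)
    return ROOT in line and EXCLUDED_BRANCH not in line
```

For example, `tissue_term_ok("XAO:9990010", "tissue", TOY_PARENTS, lambda t: t.startswith("CL:"))` accepts the made-up organ term, while the same call with "XAO:9990011" is rejected because that term sits under cell.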

var

feature_reference

Key feature_reference
Annotator CELLxGENE Discover MUST annotate.
Value str. This MUST be the reference organism for a feature:

Reference Organism MUST Use
Xenopus laevis "NCBITaxon:8355"
Xenopus tropicalis "NCBITaxon:8364"

References

Xenbase - The Xenopus model organism knowledgebase

BGEE - Xenopus laevis
BGEE - Xenopus tropicalis

brianraymor added the schema and multispecies discovery labels Sep 25, 2024
brianraymor changed the title from "Draft Xenopus laevis and Xenopus tropicalis" to "Draft Xenopus laevis (African clawed frog) and Xenopus tropicalis (tropical clawed frog)" Sep 26, 2024
brianraymor self-assigned this Sep 30, 2024
brianraymor added the in review label Nov 5, 2024
@SESDNA

SESDNA commented Nov 6, 2024

@bnelson-czi is curating this dataset [XCL | Xenopus Cell Landscape] https://bis.zju.edu.cn/XCL/
It uses this reference genome: https://www.xenbase.org/xenbase/doNewsRead.do?id=198

@brianraymor
Contributor Author

I'd note that there is a more recent (2021) release - https://www.xenbase.org/xenbase/doNewsRead.do?id=827 - which is the version available in Ensembl rapid release.

@SESDNA

SESDNA commented Nov 8, 2024

@brianraymor I recommend using the Ensembl rapid release genome for X. laevis because as you pointed out it is the same as the one on Xenbase and it has Ensembl IDs. This satisfies our needs. (https://rapid.ensembl.org/Xenopus_laevis_GCA_017654675.1/Info/Index)

brianraymor added the experimental label and removed the in review label Nov 20, 2024
@brianraymor
Contributor Author

Added to schema 5.3.0.

@jahilton
Collaborator

jahilton commented Feb 7, 2025

Main Qs I have right now...

  • Feature standards - need to confirm the following:
    • The most appropriate gene annotation references will be GCF_000004195.4 (X. tropicalis) & GCF_017654675.1 (X. laevis).
    • The stable identifiers to require in the var.index will be the db_xref attributes in those gtf files (e.g. GeneID:101731805, GeneID:2642076)
    • The feature_name values should be the corresponding gene_id attributes from the gtf files (e.g. pi16 for GeneID:101731805, ND4L for GeneID:2642076)
  • cell/obs metadata
    • are the cell_type_ontology_term_id, development_stage_ontology_term_id, tissue_ontology_term_id XAO branches/restrictions appropriate - any additional terms to block or allow?
    • donor_id - how are single cell data typically sampled? is it 1 individual == 1 sample, are multiple individuals pooled into 1 sample?
    • is there additional information not captured that should be required to make the submissions more reusable?
  • species/subspecies
    • NCBITaxon lists X. l. laevis & X. l. sudanensis underneath X. laevis (NCBITaxon:8355). Do researchers know/track which they are using?
    • There are many taxonomic groups listed under Xenopus (NCBITaxon:8353). Is it common to study the species other than X. tropicalis/laevis?
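
For illustration only, the proposed mapping from db_xref to var.index and from gene_id to feature_name might be extracted from a gene line of the gtf roughly as follows. The function name, the sample line, and its coordinates are made up; the sketch assumes the attribute column uses `key "value";` pairs and that db_xref carries a `GeneID:` entry, as in the examples above.

```python
import re

# Matches attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_gene_line(line):
    """Return (stable_id, feature_name) for a GTF 'gene' line, else None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9 or fields[2] != "gene":
        return None
    attrs = {}
    for key, value in ATTR_RE.findall(fields[8]):
        attrs.setdefault(key, []).append(value)  # keys such as db_xref may repeat
    gene_ids = [v for v in attrs.get("db_xref", []) if v.startswith("GeneID:")]
    if not gene_ids or "gene_id" not in attrs:
        return None
    return gene_ids[0], attrs["gene_id"][0]


# Hypothetical line: sequence name, source, and coordinates are invented.
sample = "\t".join([
    "chr_hypothetical", "Gnomon", "gene", "100", "200", ".", "+", ".",
    'gene_id "pi16"; transcript_id ""; db_xref "GeneID:101731805"; gene "pi16";',
])
print(parse_gene_line(sample))  # ('GeneID:101731805', 'pi16')
```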

ejmolinelli pinned this issue Feb 10, 2025
ejmolinelli unpinned this issue Feb 10, 2025