-
Notifications
You must be signed in to change notification settings - Fork 1
Morphology model #282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Morphology model #282
Conversation
app/db/model.py
Outdated
pipeline_state: Mapped[PipelineType | None] = mapped_column( | ||
Enum(PipelineType, name="pipeline_type"), nullable=True | ||
) | ||
is_related_to: Mapped[list[uuid.UUID] | None] = mapped_column( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a relationship, not an array of ids because there will be no referential integrity if these ids are deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, perhaps this could be handled through the is_derived_from source and target table we discussed previously, in which case the is_related_field could be removed. But even if that is done, deletion of an id means that it would still need to be removed from the relationship table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand. I read the document but I don't know what the relationship is . Is it a derivation ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdcourcol this was preserved from the current implementation (we discussed in one of the first meetings). In my understanding it is the relationship of raw morphology - linked to a repaired - linked to a clone etc.
app/db/model.py
Outdated
ARRAY(UUID(as_uuid=True)), nullable=True | ||
) | ||
score_dict: Mapped[dict | None] = mapped_column(JSONB, nullable=True) | ||
provenance: Mapped[dict | None] = mapped_column(JSONB, nullable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
provenance should also be a relationship. We shouldn't be able to add aribtrary stuff in here. How is it planned to be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might actually be difficult to structure as a formal relationship but perhaps @lidakanari has an opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the configuration how the morphology has been reconstructed ? Usually we use the regular provenance pattern.
For instance, you have an entity for the "synthesis" configuration with an asset w/ all the parameters, and a derivation relationship between the 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only for synthesized morphologies? This seems to be general here. Perhaps it is the provenance, such as published in another dataset, DOI, or paper? @dkeller9 could you please clarify here?
For synthesized I agree with @jdcourcol but here it seems general and I see another one for synthesized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the general schema document: "[required] provenance (dict): synthesis inputs e.g., parameters, distributions, amorphology etc. This is from Eleftherios and required to reproduce. We should ask him what kind of object it is." @eleftherioszisis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, in circuits, provenance is papers and related artifacts (see below):

I had the impression that we want to have something similar for morphologies. For experimental morphologies that would be related to papers, or raw data if the cell has been produced by another neuron morphology. For synthesized morphologies, that would be papers, inputs etc. Is this what we need for provenance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aspects of provenance are already handled by the publication linkage table and the activity table, in the case of derived morphologies. I think what was originally meant by "provenance" as noted in the General Schema was the synthesis parameters. But apparently these are also handled in a separate table.
app/schemas/morphology.py
Outdated
""" | ||
|
||
protocol_design: str | ||
staining_method: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be constrained with a StrEnum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidakanari what staining methods should we give as options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should be Enum. I will compile a list of options. I am afraid it may be restricted to the staining I am aware of, so we should have a strategy to expand this in the future @eleftherioszisis
app/schemas/morphology.py
Outdated
Whether data has been corrected for shrinkage | ||
""" | ||
|
||
protocol_design: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be constrained with a StrEnum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidakanari what items should be in the protocol_design StrEnum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the protocol design is added when we have a respective document for this. We only have a full protocol document for LNMC, and the rest are from papers. There was a list shared with @jdcourcol last year, but I'm not sure what is best to do here, should we register protocols, link to existing papers?
|
||
# Attributes from ComputationallySynthesized | ||
|
||
provenance: Mapped[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming back to this, I don't understand what is that string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the parameters that go into the synthesis algorithm from @eleftherioszisis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get how one string can be a list of parameter values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need feedback from @eleftherioszisis on what this should be. Probably a JSON to represent a dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then I don't understand why this is not the input of an activity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If indeed it is tracked in activity, then we can remove it. Can you confirm @eleftherioszisis ?
app/db/model.py
Outdated
class ComputationallySynthesizedMorphologyMethod(MorphologyProtocol): | ||
__tablename__ = "computationally_synthesized_morphology_protocol" | ||
id: Mapped[uuid.UUID] = mapped_column(ForeignKey("morphology_protocol.id"), primary_key=True) | ||
method: Mapped[str] # This 'method' might need further typing based on your vocabulary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not "name" ? and why not a property on the base "MorphologyProtocol" class ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that this is particular to the SynthesizedMorphology class. But perhaps it could be changed to name, if @lidakanari thinks it is okay.
app/db/model.py
Outdated
class ModifiedMorphologyMethod(MorphologyProtocol): | ||
__tablename__ = "modified_morphology_protocol" | ||
id: Mapped[uuid.UUID] = mapped_column(ForeignKey("morphology_protocol.id"), primary_key=True) | ||
method: Mapped[MethodsType] # Assuming MethodsType is an Enum or similar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above it is a string. Here is it a enum ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is polymorphic because the methods can vary considerably.
app/db/model.py
Outdated
has_been_corrected_for_shrinkage: Mapped[bool | None] | ||
|
||
__mapper_args__ = { | ||
"polymorphic_identity": "experimental", # A string identifier for this type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"polymorphic_identity": "experimental" -> "polymorphic_identity": tablename
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this identifier be user-defined as has been done here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add "polymorphic_on": "type" which is the intended column to be used here for sqlalchemy to decide which hierarchy of tables to update when an operation is made.
In addition, in MorphologyProtocol the type must be constrained by a MorphologyProtocolType StreEnum the values of which should be the children tables, e.g.
class MorphologyProtocolType(StrEnum):
experimental_morphology_protocol = auto()
and then in MorphologyProtocol type: Mapped[MorphologyProtocolType]
, and in the children tables:
class ExperimentalMorphologyProtocol(MorphologyProtocol):
__tablename__ = MorphologyProtocolType.experimental_morphology_protocol.value
....
__mapper_args__ = {
"polymorphic_on": "type"
"polymorphic_identity": __tablename__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixed in the latest push.
__tablename__ = "morphology_protocol" | ||
protocol_document: Mapped[str | None] | ||
protocol_design: Mapped[str] | ||
type: Mapped[str] # Discriminator column for polymorphism |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need that ? @GianlucaFicarelli ?
|
||
class MorphologyProtocol(Identifiable): # Inherit from Identifiable for primary key and timestamps | ||
__tablename__ = "morphology_protocol" | ||
protocol_document: Mapped[str | None] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what that is ? a URL ? a DOI ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is the url pointing to the protocol document. But perhaps we should upload the document into the system and have this point to it? What do you think?
class MorphologyProtocol(Identifiable): # Inherit from Identifiable for primary key and timestamps | ||
__tablename__ = "morphology_protocol" | ||
protocol_document: Mapped[str | None] | ||
protocol_design: Mapped[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here ?
This branch includes a minimal set of commits to illustrate the data model, and is branched off of a more recent main. The primary files affected are: model.py, types.py, and morphology.py (although this is intended just to give context to the data model). "ReconstructionCellMorphology" is changed to "CellMorphology" because now the reconstruction morphologies are a subtype of cell morphology. Because the different sub-clases of morphology were not made programmatic subclasses of cell morphology for the sake of maintaining compatibility with how this structure is processed, the extra data needed for the subclasses is in the "extended_data" field. A text description of all the different morphology components is in: https://openbraininstitute.sharepoint.com/:w:/r/sites/OBI-Scientificstaff/_layouts/15/Doc.aspx?sourcedoc=%7BEEE86C08-A09D-469E-9849-CD85599E4026%7D&file=General_schema.docx&action=default&mobileredirect=true&DefaultItemOpen=1