Updated Biomarker ER diagram #11

Reeya123 · 2023-04-25T20:36:00Z

March 2, 2023

Dan:

I have updated our existing biomarker ER diagram, highlighting "mandatory" items in the model that form the core definition of a biomarker, based on the FDA/NIH Biomarker Working Group BEST definition (below).
biomarker_er_model_02272023 (1).pptx
BiomarkerDB_Summary.docx

Reeya123 · 2023-04-25T20:52:02Z

Feb 6, 2023

Darren:

In order for me to assess 'core' information, I had to align this with the ontology as designed so far. This necessitated many changes to names, relationships, and connections (see attached):

Name changes:

1. biomarker measurement -> biomarker
2. measure_of_entity -> indicated_by_<inc,dec,pres,abs> (shorthand for increased_level_of, decreased_level_of, presence_of, absence_of respectively)
3. disease_name -> medical condition (this allows for non-disease biomarkers)
4. detected_in -> sampled_from
5. specimen type -> biospecimen
6. biomarker_of -> provides_clinical_information_for (when biomarker is connecting to medical condition) or indicates_clinical_effect_of (see additions below); note that these are upper-level relations, and that curation needs to say which of the more-specific relations should be used: prognostic_for, diagnostic_for, monitors_status_of, indicates_risk_of_developing, predicts_effect_of, monitors_effect_of, indicates_response_to, and assesses_toxicity_of.
7. is_BEST_type -> is_a (there's no specific relation needed, as these are inferred from the relation between biomarker and either medical condition or chemical entity)
8. has_entity_type -> is_a (there's no specific relation needed, as these are inferred from the ontological hierarchy for the assessed biomarker entity)
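
The renames above can be sketched as a lookup table, which may help when migrating records curated against the earlier diagram. This is only an illustrative sketch, not part of the ontology tooling: the table names and the `rename_relation` helper are hypothetical, and the full names for the "dec", "pres", and "abs" relations are assumed by analogy with indicated_by_increased_level_of.

```python
from typing import Optional

# Hypothetical migration table for the one-to-one renames listed above.
SIMPLE_RENAMES = {
    "biomarker measurement": "biomarker",
    "disease_name": "medical condition",
    "detected_in": "sampled_from",
    "specimen type": "biospecimen",
    "is_BEST_type": "is_a",
    "has_entity_type": "is_a",
}

# measure_of_entity splits by indication. The three names after "inc"
# are assumed by analogy; they are not spelled out in the thread.
INDICATION_RELATIONS = {
    "inc": "indicated_by_increased_level_of",
    "dec": "indicated_by_decreased_level_of",
    "pres": "indicated_by_presence_of",
    "abs": "indicated_by_absence_of",
}

# biomarker_of splits by what the biomarker connects to (upper-level relations).
UPPER_LEVEL_RELATIONS = {
    "medical condition": "provides_clinical_information_for",
    "chemical entity": "indicates_clinical_effect_of",
}


def rename_relation(old: str, indication: Optional[str] = None,
                    target_type: Optional[str] = None) -> str:
    """Return the new name for an old diagram label (KeyError if unknown)."""
    if old == "measure_of_entity":
        return INDICATION_RELATIONS[indication]
    if old == "biomarker_of":
        return UPPER_LEVEL_RELATIONS[target_type]
    return SIMPLE_RENAMES[old]
```

Curation would still need to replace the upper-level relation with one of the more specific ones (prognostic_for, diagnostic_for, and so on); the sketch stops at the upper level.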

Connection removals:

associated_with (between assessed biomarker entity and what is now medical condition). The entity itself says nothing, only the biomarker does.
assayed_in (redundant with sampled_from, which is between the biomarker and the specimen, but see 'dotted line' note below)
occurs_in (between medical condition and biosample). The medical condition need not be located where a sample was taken from.
is_BEST_type_for (between BEST biomarker type and either medical condition or chemical entity). Somewhat superfluous, as this information is captured a different way.

Connection additions:

indicates_clinical_effect_of (between biomarker and chemical entity; some BEST types have to do with exposures)
brackets for literature evidence (not sure if a relation is used for this; will look into); we might be able to target specific statements about biomarkers, like "has_LOINC some LOINC code [PMID:123456789]"

Moved:

assessed entity type (original line looked like it might be coming from biomarker instead of assessed biomarker entity)

To possibly revamp:

Things like blood pressure or heart rate are measured but not biospecimens per se. Will need to add a class for these (characteristic?), and probably also a new relation (I think I have these somewhere, but they are not yet in the ontology file).
Currently the 'sampled_from' relation is between biomarker and biospecimen, but need to consider making the relation between the assessed entity and the biospecimen.

THE CORE:

1. assessed biomarker entity
2. the medical condition or chemical entity that the biomarker intends to inform on
3. the indication; this is what gets built in to 'biomarker' (eg, 'increased level of')
4. the type of relation between the biomarker and medical condition/chemical entity (this will be a BEST-ish relation, like prognostic_for; see note for #6 under 'changes')
5. biospecimen? I imagine this would be important for some types of diseases, or even stages of disease

Believe it or not, we don't have to capture the biomarker itself, as this gets built/inferred from #1 and #3 of the core list above.
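
A minimal sketch of how the biomarker could be derived rather than stored, assuming a flat record with one field per core item. The class name, field names, and the glucose example values are hypothetical and purely illustrative; they are not taken from the ontology file and make no clinical claim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoreBiomarkerRecord:
    assessed_entity: str                # core item 1
    target: str                         # core item 2: medical condition or chemical entity
    indication: str                     # core item 3, e.g. "increased level of"
    relation: str                       # core item 4, e.g. "prognostic_for"
    biospecimen: Optional[str] = None   # core item 5 (listed with a question mark)

    @property
    def biomarker(self) -> str:
        # Built from items 1 and 3 only, so it is never stored separately.
        return f"{self.indication} {self.assessed_entity}"


# Illustrative values only.
record = CoreBiomarkerRecord(
    assessed_entity="glucose",
    target="diabetes mellitus",
    indication="increased level of",
    relation="diagnostic_for",
    biospecimen="blood",
)
print(record.biomarker)
```

Keeping `biospecimen` optional mirrors the question mark in the list: it can be filled where it matters and left empty elsewhere.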

Reeya123 · 2023-04-25T20:59:57Z

Feb 6, 2023

Dan:

Darren, thank you for these notes and modifications.

We’re also working on a summary document of some pilot work to automate acquisition of biomarker data from public resources and populate the data model that has been manually curated (and delivered for OBCI). Could you provide comments on the document at BiomarkerDB_Summary.docx?

We’ve done a pretty good job so far mapping the external data to our model as a basis for automation, but best_biomarker_type has been a challenge (absent in most sources examined). Any ideas on how to infer best_biomarker_type data from these sources, based on rules (maybe), would be a great help.

Reeya123 · 2023-04-25T21:07:55Z

Feb 10, 2023

Raja:

I still think we need a reference box. What is the normal range? Filling it in will be optional but most biomarkers can get a normal range.

Dan - I think for CFDE work we need another diagram which strictly deals with molecular biomarkers. We need to scope our effort.
Dan, for all of the boxes we need example ID types or the ontology we are going to use. For medical condition we can say, e.g., DO, HPO.

Darren:

This seems like a can of worms to me. Strictly speaking, 'normal' is relative to an individual. I also don't think there's a way to actually use the information.
Every time I talk about this work, people ask me why there are no reference ranges for things that obviously have them. If you want to make this work clinically relevant, it might also be useful to have a box (dotted is fine). If you look at MarkerDB, they have reference ranges. In the real world, clinicians rely on normal ranges. If we are writing a proposal, this is a point that will be discussed and that might weaken our proposal. Also, for some of the biomarkers we may be able to get this data by mining EHR data. We can keep this out for now if both of you disagree. But if more people ask for it, we need to address this.

Raja:

Darren - I am not sure I understand what chemical entity is supposed to mean. Isn't the chemical entity the same as the assessed biomarker entity?

Darren:

No. Some of the BEST biomarker types deal with exposures to (broadly stated) chemicals. In such cases the assessed biomarker entity is what indicates that a person has been exposed to the chemical entity. See Response and Safety biomarkers.

Raja:

Let's rename it, then, to environmental_exposure_entity.

Reeya123 · 2023-04-25T21:13:05Z

Feb 10, 2023

Darren:

The reason I said it will be a can of worms is because once you introduce these ranges, it'll become an expectation to have them, and not one you'll be able to fill easily (I imagine it will involve a LOT of manual curation). If you're okay with that, then go for it. Indeed, I have zero hesitation about including this information in a database. But bear in mind that if we're talking about the ontology, these will be fully useless, which is to say there's no way to use them for reasoning or classification. But I definitely see that including them would be a selling point.

As for 'chemical entity' I only used that term because it is what it will be in the ontology. It's the top-level term in CHEBI. All of the things I put in that revised figure are the actual names used in the ontology. 'environmental exposure entity' will, at best, be defined in terms of chemical entities anyway.

Reeya123 · 2023-04-26T14:19:28Z

Feb 10, 2023

Dan:

Agreed for curation of references; a can of worms on a practical level, but the question comes up a lot in discussions and proposal reviews.
So some sort of response is needed (perhaps depending on context). For example, perhaps in a proposal use a well-defined description and terminology, but for a 1-page summary use a somewhat less strict representation.

Darren:

I would opt for the less strict representation all around. A well-defined description I imagine would require relations like 'has_normal_upper_bound' and 'has_normal_lower_bound' and another for the measured units, but since these can't be used for anything other than information that seems like overkill. As mere information to be read by a human, a property value like 'has_normal_range' " to " will be easier for humans to process.

Dan:

Regarding chemical entities (strictly speaking), would this cover viruses, bacteria, and so forth?

Darren:

That's a good question. For sure 'chemical entity' does not cover organisms, but then again I'm not sure that's what's meant for the relevant BEST categories. These all refer to exposure "to a medical product or an environmental agent". We'll have to see what is meant by 'environmental agent'. I suspect these don't include organisms, though upon reflection these probably would include non-chemicals like radiation. We'd have to add something like that, so perhaps we could indeed use 'environmental agent' as the upper level, and this would include chemical entities from CHEBI plus those non-chemical agents. Then again, the real upper level would also have to include 'medical products', so 'environmental agent' would still be too restrictive.

Okay, I found this: https://www.niehs.nih.gov/health/topics/agents/index.cfm

I'd say that it DOES include organisms, at least on the surface. Dust mites and mold are listed, for example. That means we'll need to craft a definition for an upper level term that allows for medical products, chemicals, and organisms (though we can get away with a definition that just says the upper level term includes medical products and environmental agents, and then define environmental agents separately from a definition of medical products).

Reeya123 · 2023-04-26T14:21:17Z

Feb 10, 2023

Dan:

I’m working on getting our discussions into a github repo and I’ve added a comment to the figure legend about reference ranges.

For a formal representation of biomarkers, based on the FDA/NIH definition, I think we will need some tweaks to Darren’s model (below, and attached).

For example, some ‘biomarker’ (a measure) sampled_from some ‘biospecimen’ would have a different semantics than some ‘assessed entity’ (an object) sampled_from some ‘biospecimen’. I think we (and FDA/NIH) intend the meaning to be some , not some . And the FDA/NIH definition says nothing about a ‘biospecimen’ thing; so, I think it also would be a very important contextual annotation, but not a “core” element.

Also, some ‘biomarkers’ are an indicator of some biological process (of which ‘medical condition’ is_a child), while some ‘biomarkers’ are an indicator of response to some ‘agent’ (more general of course than ‘environmental agent‘).

Some of these entities and relations may be too broad, which raises the question of scope; further modifications of the figure may be needed if the scope is to be limited to molecular biomarkers, for example.

Reeya123 · 2023-04-26T14:25:22Z

Feb 10, 2023

Darren:

A biomarker is not a measurement, and the FDA/NIH doesn't say that it is. That's why I changed that relation, as it implies such. Rather, it is what we learn from measurement. For example, blood pressure is a measurement; increased blood pressure is a biomarker. Glucose concentration is a measurement; increased glucose concentration is a biomarker. For the sampled_from relation, the connection was made to biomarker as opposed to assessed biomarker entity because the latter has no context. You can't assert glucose sampled_from blood, because there are many places glucose can be sampled from, including for purposes wholly unrelated to clinical measurements. That being said, I agree that saying biomarker sampled_from biospecimen sounds odd, likely because I used that as a kind of shorthand. I can think of two possible fixes, the first of which keeps the notion of connecting the assessed entity to where the sample came from; it is rather complicated and breaks some reasoning. The second fix basically keeps the original design with a small modification; it is simple to implement:

1. Incorporate the sample into the biomarker definition (see below for why this might need to be done). So, instead of saying " = biomarker and indicated_by_increased_level_of and sampled_from " (which, effectively, can be separated into two statements, both about ), we'd say " = biomarker and indicated_by_increased_level_of ( sampled_from )". Perhaps that is what you meant, but the figure wasn't capturing that (nor can I figure out a way to represent that pictorially, hence why I made the connection how I did). Doing this prevents us from reasoning that the biomarker was assessed by sampling blood (I just tried it). This brings me to what I think is a better and simpler fix:
2. Change the name of the relation. Instead of 'sampled_from', we use something like 'assessed_by_sampling'. Indeed, we can have both, one for connecting to the assessed entity and the other for connecting to the biomarker. I suspect having both will be mildly confusing and perhaps needlessly complicating, but with some work it might do what is needed.

I also realize that we might not be addressing the same purpose. I think you're addressing specifically what the FDA/NIH says about biomarkers, in which case, yes, they don't mention specimen and it would be non-core (though see this article of interest: https://www.ncbi.nlm.nih.gov/books/NBK566059/). As always, I'm thinking of the ontology and what would be needed to define biomarkers, in which case I'd say that the biospecimen is important. Indeed, in some cases it is absolutely essential (for example, a finding of white blood cells in urine has different indications than, say, increased WBC in blood).

Reeya123 · 2023-04-26T14:27:41Z

Feb 13, 2023

Dan:

Yes, right. A biomarker is not a measurement of an assessed entity. I was sloppy there.
Would you agree that a biomarker (to paraphrase FNBWG) is an observable (measurable) different state (in a sample from a subject),

Darren:

Yes, always comparative. Although precisely defining the particular comparison can get tricky.

Reeya123 · 2023-04-26T14:45:20Z

Feb 13, 2023

Reeya123 · 2023-04-26T14:52:10Z

Feb 13, 2023

Darren:

And yes, different states of a biomarker for different diseases is a highly likely possibility, I think. Agreed, very difficult to define.

Dan:

Yes, I see your point about biosamples; agreed. So, we follow and also extend FNBWG, I guess.

Dan:

I wonder if that knowledge can be inferred (perhaps by a rule?) by the logical chain that:

biomarker <some relation*> assessed entity
assessed entity <some relation*> biosample
therefore biomarker <some relation*> biosample
*distinct relations

Darren:

That's what I tried the other day. Couldn't get it to work, but I might have done it incorrectly. Note that, even if it can be made to work, the chain must be within the context of the biomarker, not the assessed entity itself. That basically means that we can easily define a relation that directly connects between biomarker and biosample (for example, the aforementioned assessed_by_sampling) as meaning exactly that:
biomarker assessed_by_sampling biospecimen means that there is some assessed biomarker entity that was sampled from the given biospecimen.
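
The reading of assessed_by_sampling given above can be sketched over plain tuples. This is a toy illustration in Python, not OWL reasoning; the `BiomarkerContext` record, the helper function, and the example labels are assumptions made for the sketch. The entity-to-specimen pairing is held per biomarker, so nothing is asserted about the assessed entity in general.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class BiomarkerContext(NamedTuple):
    """Hypothetical record: a biomarker with the entity and specimen it was defined on."""
    biomarker: str
    assessed_entity: str
    biospecimen: str


def assessed_by_sampling(contexts: Iterable[BiomarkerContext]) -> Set[Tuple[str, str, str]]:
    # Emit biomarker -> biospecimen triples only; the assessed entity
    # never becomes the subject of a sampling statement.
    return {(c.biomarker, "assessed_by_sampling", c.biospecimen) for c in contexts}


contexts = [
    BiomarkerContext("increased WBC in blood", "white blood cell", "blood"),
    BiomarkerContext("presence of WBC in urine", "white blood cell", "urine"),
]
triples = assessed_by_sampling(contexts)
```

Both biomarkers share the same assessed entity but end up with different specimens, which is the case a direct entity-to-specimen assertion could not distinguish.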

Dan:

Hmm; it seems like the connection of a particular biomarker with a particular assessed entity is lost here (other than the notion that a biomarker (different state) was observed in a biospecimen for some unspecified entity and some condition). Does the reasoning work?

Darren:

I'm not sure what you mean here. We have a definite direct connection between biomarker and assessed entity. Do you mean that the connection between assessed entity and biospecimen is lost? In the modeling I have at the moment (subject to change as more examples are added), there is a connection between biomarker and assessed entity, between biomarker and biospecimen, and between biomarker and condition. The biomarker+entity connection is a matter of definition. In some cases, the definition might need to include the biospecimen (so, biomarker+entity+biospecimen). Under no circumstance would it be correct to make a connection between assessed entity+biospecimen outside the context of biomarker. That is to say, we can't say glucose sampled_from blood, because that would be asserting that glucose is found only in blood, which isn't true. We can, however, say 'blood glucose' sampled_from 'blood', and we can say 'blood glucose' is_a 'glucose', and 'glucose' is_a 'chemical entity'. I'll have to ruminate on the gains, losses, and potential complications of incorporating the sample into the biomarker definition.
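
The 'blood glucose' example can be sketched as follows. This is a simplified illustration that assumes a single-parent is_a hierarchy stored in a dictionary; the variable and function names are hypothetical, and a real ontology would allow multiple parents and use a reasoner.

```python
from typing import Dict, List

# is_a links from the example: 'blood glucose' is_a 'glucose' is_a 'chemical entity'.
IS_A: Dict[str, str] = {
    "blood glucose": "glucose",
    "glucose": "chemical entity",
}

# sampled_from is asserted only on the context-specific term, never on 'glucose' itself.
SAMPLED_FROM: Dict[str, str] = {
    "blood glucose": "blood",
}


def ancestors(term: str) -> List[str]:
    """Walk is_a upward and return the ancestors, nearest first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


print(ancestors("blood glucose"))
print(SAMPLED_FROM.get("glucose"))
```

The lookup on plain 'glucose' finds no specimen, which is the intended behavior: the specimen belongs to 'blood glucose', while the classification as a chemical entity is still inherited.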

Reeya123 · 2023-04-26T14:59:51Z

Feb 24, 2023

Dan:

Raja,
I've revised the biomarker ER diagram and legend based on recent discussions. The attached file has two versions of the diagram; with entity examples and without. I'll send it to Darren today for comments/edits. Do you want to have a look before I send?

Raja:

Why is specimen type green? I am sure that OpenTarget and GWAS and ClinVar do not have specimens.

Dan:

I believe Darren argued that in some (not all) circumstances specimen type is 'core' in distinguishing biomarkers.
I think his example was WBC in urine vs. blood.
I colored specimen type green to avoid complicating the figure with nuance.
Would you prefer some type of distinction for specimen type?

Reeya123 closed this as completed Apr 25, 2023

Reeya123 reopened this Apr 25, 2023

Reeya123 assigned danlymangw Apr 26, 2023

Reeya123 closed this as completed Apr 26, 2023

Reeya123 reopened this Apr 26, 2023

Reeya123 closed this as completed Jul 25, 2024
