Updated Biomarker ER diagram #11
Comments
Feb 6, 2023
Darren: To assess the 'core' information, I had to align this with the ontology as designed so far. This necessitated many changes to names, relationships, and connections (see attached):
Name changes:
Connection removals:
Connection additions:
Moved:
To possibly revamp:
THE CORE:
Believe it or not, we don't have to capture the biomarker itself, as this gets built/inferred from #1 and #3 of the core list above.
Feb 6, 2023
Dan: Darren, thank you for these notes and modifications. We’re also working on a summary document of some pilot work to automate acquisition of biomarker data from public resources and populate the data model that has been manually curated (and delivered for OBCI). Could you provide comments on the document at BiomarkerDB_Summary.docx? We’ve done a pretty good job so far mapping the external data to our model as a basis for automation, but best_biomarker_type has been a challenge (it is absent in most sources examined). Any ideas on how to infer best_biomarker_type data from these sources, perhaps based on rules, would be a great help.
Feb 10, 2023
Raja: I still think we need a reference box. What is the normal range? Filling it in will be optional, but most biomarkers can get a normal range. Dan, I think for CFDE work we need another diagram which strictly deals with molecular biomarkers. We need to scope our effort.
Darren: This seems like a can of worms to me. Strictly speaking, 'normal' is relative to an individual. I also don't think there's a way to actually use the information.
Raja: Darren, I am not sure I understand what Chemical entity is supposed to mean. Isn't the chemical entity the same as the assessed biomarker entity?
Darren: No. Some of the BEST biomarker types deal with exposures to (broadly stated) chemicals. In such cases the assessed biomarker entity is what indicates that a person has been exposed to the chemical entity. See Response and Safety biomarkers.
Raja: Let's rename it to environmental_exposure_entity, then.
Feb 10, 2023
Darren: The reason I said it will be a can of worms is because once you introduce these ranges, it'll become an expectation to have them, and not one you'll be able to fill easily (I imagine it will involve a LOT of manual curation). If you're okay with that, then go for it. Indeed, I have zero hesitation about including this information in a database. But bear in mind that if we're talking about the ontology, these will be fully useless, which is to say there's no way to use them for reasoning or classification. But I definitely see that including them would be a selling point.
As for 'chemical entity', I only used that term because it is what it will be in the ontology. It's the top-level term in CHEBI. All of the things I put in that revised figure are the actual names used in the ontology. 'environmental exposure entity' will, at best, be defined in terms of chemical entities anyway.
Feb 10, 2023
Dan: Agreed for curation of references; a can of worms on a practical level, but the question comes up a lot in discussions and proposal reviews.
Darren: I would opt for the less strict representation all around. A well-defined description I imagine would require relations like 'has_normal_upper_bound' and 'has_normal_lower_bound' and another for the measured units, but since these can't be used for anything other than information, that seems like overkill. As mere information to be read by a human, a property value like 'has_normal_range' " to " will be easier for humans to process.
Dan: Regarding chemical entities (strictly speaking), would this cover viruses, bacteria, and so forth?
Darren: That's a good question. For sure 'chemical entity' does not cover organisms, but then again I'm not sure that's what's meant for the relevant BEST categories. These all refer to exposure "to a medical product or an environmental agent". We'll have to see what is meant by 'environmental agent'. I suspect these don't include organisms, though upon reflection these probably would include non-chemicals like radiation. We'd have to add something like that, so perhaps we could indeed use 'environmental agent' as the upper level, and this would include chemical entities from CHEBI plus those non-chemical agents. Then again, the real upper level would also have to include 'medical products', so 'environmental agent' would still be too restrictive.
<spends time looking up 'environmental agent'>
Okay, I found this: https://www.niehs.nih.gov/health/topics/agents/index.cfm
I'd say that it DOES include organisms, at least on the surface. Dust mites and mold are listed, for example.
That means we'll need to craft a definition for an upper-level term that allows for medical products, chemicals, and organisms (though we can get away with a definition that just says the upper-level term includes medical products and environmental agents, and then define environmental agents separately from a definition of medical products).
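The hierarchy described above could be sketched roughly as follows. This is only an illustration in Python, not ontology code: the name 'exposure agent' is a hypothetical placeholder for the not-yet-named upper-level term, and placing 'chemical entity', 'organism', and 'radiation' under 'environmental agent' is an assumption drawn from this discussion rather than a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A minimal stand-in for an ontology class with is_a parents."""
    name: str
    parents: list["Term"] = field(default_factory=list)

    def is_a(self, other: "Term") -> bool:
        """True if `other` is this term or one of its ancestors."""
        if self is other:
            return True
        return any(parent.is_a(other) for parent in self.parents)


# Hypothetical placeholder for the upper-level term still to be defined.
exposure_agent = Term("exposure agent")

# The two branches named in the BEST wording.
medical_product = Term("medical product", [exposure_agent])
environmental_agent = Term("environmental agent", [exposure_agent])

# Assumed children of 'environmental agent' (illustrative only).
chemical_entity = Term("chemical entity", [environmental_agent])
organism = Term("organism", [environmental_agent])    # e.g. dust mites, mold
radiation = Term("radiation", [environmental_agent])  # a non-chemical agent

print(organism.is_a(exposure_agent))             # organisms are covered
print(medical_product.is_a(environmental_agent))  # kept as a separate branch
```

Defining the two branches separately mirrors the suggestion that the upper-level definition only needs to say it includes medical products and environmental agents.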
Feb 10, 2023
Darren: A biomarker is not a measurement, and the FDA/NIH doesn't say that it is. That's why I changed that relation, as it implies such. Rather, it is what we learn from measurement. For example, blood pressure is a measurement; increased blood pressure is a biomarker. Glucose concentration is a measurement; increased glucose concentration is a biomarker.
For the sampled_from relation, the connection was made to biomarker as opposed to assessed biomarker entity because the latter has no context. You can't assert glucose sampled_from blood, because there are many places glucose can be sampled from, including for purposes wholly unrelated to clinical measurements. That being said, I agree that saying biomarker sampled_from biospecimen sounds odd, likely because I used that as a kind of shorthand. I can think of two possible fixes, the first of which keeps the notion of connecting the assessed entity to where the sample came from; it is rather complicated and breaks some reasoning. The second fix basically keeps the original design with a small modification; it is simple to implement:
I also realize that we might not be addressing the same purpose. I think you're addressing specifically what the FDA/NIH says about biomarkers, in which case, yes, they don't mention specimen and it would be non-core (though see this article of interest: https://www.ncbi.nlm.nih.gov/books/NBK566059/). As always, I'm thinking of the ontology and what would be needed to define biomarkers, in which case I'd say that the biospecimen is important. Indeed, in some cases it is absolutely essential (for example, a finding of white blood cells in urine has different indications than, say, increased WBC in blood).
Feb 13, 2023
Dan: Yes, right. A biomarker is not a measurement of an assessed entity. I was sloppy there.
Darren: For sure, yes. That is why the relations between the biomarker and assessed entity include directionality or presence/absence.
Dan: So, in the ontology the directionality/presence/absence is expressed in the relation; in the data table, it's expressed in the data value.
Darren: By 'data value' do you refer to the biomarker name? In the ontology it is given both in the name of the biomarker and in the relation. The relation is the more important of the two for ontology purposes, but in the end it doesn't matter where it's kept in the data table because the ontology can make use of it. Indeed, if desired, the data table technically can be just as useful without the biomarker name, as long as the directionality and the entity are given.
Dan: Yes
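The point that the biomarker name is recoverable from directionality plus assessed entity can be illustrated with a small sketch. The `Direction` values and the `biomarker_label` helper are hypothetical names chosen for this example; they are not relation names from the ontology or columns from the actual data table.

```python
from enum import Enum


class Direction(Enum):
    """Illustrative directionality / presence values (assumed wording)."""
    INCREASED = "increased"
    DECREASED = "decreased"
    PRESENT = "presence of"
    ABSENT = "absence of"


def biomarker_label(direction: Direction, assessed_entity: str) -> str:
    """Build a biomarker name from the two fields the table would store."""
    return f"{direction.value} {assessed_entity}"


# Rows carry only directionality and entity; the name is derived on demand.
rows = [
    (Direction.INCREASED, "blood pressure"),
    (Direction.INCREASED, "glucose concentration"),
]

for direction, entity in rows:
    print(biomarker_label(direction, entity))
```

With the two examples from the thread, this prints "increased blood pressure" and "increased glucose concentration".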
Feb 13, 2023
Dan: compared to a (population) norm/reference value, of an assessed entity consistently associated with some particular circumstance/condition/process (e.g., disease)?
Darren: As an approximation, yes. I say approximation because really one should compare against what is normal for self. This of course doesn't work for congenital issues. For all else a comparison to self would be the standard (even clinically I imagine), with the population average as a fall-back (for example, if a patient didn't have a baseline on record).
Dan: Yes, but it may not necessarily work, I imagine, for an individual, strictly speaking; perhaps the "normal" state of an entity for an individual may change over time and differ at 60 yrs old vs. 25 yrs (e.g., blood pressure). This (ontology of biomarker) can get very complicated.
Darren: Yep! Well, like I said, self-comparison is the gold standard in my opinion, fully aware it won't always be achievable. Not including these ranges removes all complications. In my initial modeling I went with a use case where the physician makes the judgment as to whether or not the assessed entity level is normal, using the ontology to help figure out the indication. I wasn't thinking that the physician would make a measurement and use the ontology to figure out if the measurement constituted a biomarker (above normal, below normal, present, absent).
Dan: Agreed, I'm all for removing complications whenever possible. My concern is that, in some circumstances (not necessarily this one), information may be lost which might affect reasoning (good or bad). I raise the point because this seems to be a common comment from reviewers and others whenever talking about ontology modeling.
Darren: As mentioned previously, we can include these ranges, though to me it seems more appropriate to capture these in the database since they can't be used in any way in the ontology.
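If reference ranges were kept in the database rather than the ontology, the "compare to self, fall back to population" idea might look something like the sketch below. Everything here is an assumption for illustration: the function name, the labels it returns, and especially the numbers, which are made up and carry no clinical meaning.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound), same units as value


def classify(value: float,
             self_baseline: Optional[Range],
             population_range: Optional[Range]) -> str:
    """Label a measurement against the individual's own baseline when one
    is on record, otherwise against a population reference range."""
    reference = self_baseline if self_baseline is not None else population_range
    if reference is None:
        return "unknown"  # nothing to compare against
    low, high = reference
    if value < low:
        return "decreased"
    if value > high:
        return "increased"
    return "within reference"


# Made-up numbers in arbitrary units, purely to exercise the logic.
print(classify(95.0, None, (70.0, 100.0)))          # population fall-back
print(classify(95.0, (60.0, 80.0), (70.0, 100.0)))  # self baseline wins
```

The same value is labelled differently depending on which reference is available, which is the complication raised in the discussion above.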
Feb 13, 2023
Dan: This is one of the reasons why I (very mildly) objected to the notion of 'normal range'. I'm also wary of anything that can be used to make clinical interpretations. Including these ranges crosses that line, and we'd have to put disclaimers on every entity (this is what UniProt had to do). We might have to do that anyway once we connect biomarkers to disease.
Darren: Agreed. I have heard some essentially suggest that a biomarker is not a biomarker if not used in a clinical setting; personally, I don't agree with that view.
Dan: I don't have a strong opinion either way, but the wording of the FDA/NIH definition seems to agree that a clinical setting would be involved. I guess it comes down to what's considered a clinical setting. If 'clinical setting' is restricted to hospital or doctor's office or requires a doctor in some way, then I agree with you. I would consider an elevated temperature taken by my own thermometer just as valid as one used in the doctor's office.
Darren: Agreed, but this idea of 'clinical setting' for a biomarker seems to come up a lot from others. A strict interpretation of clinical setting would exclude home tests and even bench research, for example.
Dan: I personally think it makes more sense to talk about 'clinical use' as opposed to 'clinical setting'. Either way, I don't see how this would affect our work.
Feb 13, 2023
Dan: Investigators interpret/infer the observed different state to signify the potential or actual existence of some particular status/process. If so, generally, the relation of a biomarker-assessed entity is something like .
Darren: I see where you're going with it, but that would be somewhat imprecise. In our treatment, biomarker has already built in the notion of different-ness. A biomarker is indicated by the difference between a current observed state of some entity vs some previous observed state of that same entity (or, depending on the biomarker and as you point out above, between the current observed state vs some accepted standard). In all cases a biomarker is a comparative thing.
Dan: Yes, always comparative. Although precisely defining the particular comparison can get tricky.
Feb 13, 2023
Dan: Although, I don’t think that’s very satisfactory; it doesn't really express the “significance” of a biomarker.
Darren: Unclear what you mean by 'significance', unless you mean something like 'slightly above normal' vs 'greatly above normal'? That raises a point we haven't yet discussed: the possibility that slightly above and greatly above could have different indications (that is, point to different diseases). I hesitate to include such nuances because they are terribly difficult to define.
Dan: I meant that asserting that the state of some entity differs from another state does not necessarily or inherently convey a notion of "biomarkerness".
Darren: I assume here you mean that sometimes the difference is well within a normal range (whether population average or self average). In such cases these are still considered biomarkers though, per FDA definition "...measured as an indicator of normal biological processes...". I tend to think of biomarkers as anything that can be used--when abnormal--to indicate a potential medical issue, so if it is normal, then that's an indication that there's no issue.
Dan: A relation observed_state_differs_from seems vague to me.
Darren: We don't have such a relation currently, though it could be useful solely as an upper-level parent term for the more specific relations we currently have. It could be marked as 'do not use for annotation'; there are a number of relations marked as such in the Relations Ontology.
Feb 13, 2023
Darren: And yes, different states of a biomarker for different diseases is a highly likely possibility, I think. Agreed, very difficult to define.
Dan: Yes, I see your point about biosamples; agreed. So, we follow and also extend FNBWG, I guess. I wonder if there is a way to infer the location of (a specific biomarker) from its relation to an assessed entity, which is “sampled_in” a biospecimen?
Darren: Can you elaborate? I don't think biomarkers have locations per se. That sampled_from (or, better, assessed_by_sampling) was intended as a kind of shortcut to say, "the biomarker (e.g., increased level of ) was determined by comparing levels of in samples from ". In any case, I don't think an inference can be made with respect to biospecimen; it is either known, or not. I suppose there could be cases where an assessed entity is ONLY found in some particular place, and even if not stated we'd know what that place is. Is that what you mean here?
Dan: I gather you want to assert some relation of a biomarker with a biosample; more specifically, that the marker is observed/measured in some sample (an object obtained from some tissue/location for contextual knowledge)?
Darren: Not necessarily for contextual knowledge. One of the use cases I can imagine is when a doctor has a vial of blood. What measurements can be taken from this sample? In another case, the same entity measured as abnormal in one sample might mean something different than that same entity measured as abnormal in a different sample.
Dan: Interesting. Have we compiled a well-defined list of use cases yet? It might help bound the parameters (provide a common understanding of the imagined uses) of the ontology and its development.
Darren: More work needs to be done in this area. Once I clear out some time-sensitive issues on my end, this is what I plan on tackling next.
Dan: I wonder if that knowledge can be inferred (perhaps by a rule?) by the logical chain that: biomarker assessed entity
Darren: That's what I tried the other day. Couldn't get it to work, but I might have done it incorrectly. Note that, even if it can be made to work, the chain must be within the context of the biomarker, not the assessed entity itself. That basically means that we can easily define a relation that directly connects between biomarker and biosample (for example, the aforementioned assessed_by_sampling) as meaning exactly that:
Dan: Hmm; seems like the connection of a particular biomarker with a particular assessed entity is lost here (other than the notion that a biomarker (different state) was observed in a biospecimen for some unspecified entity and some condition)? Does the reasoning work?
Darren: I'm not sure what you mean here. We have a definite direct connection between biomarker and assessed entity. Do you mean that the connection between assessed entity and biospecimen is lost? In the modeling I have at the moment (subject to change as more examples are added), there is a connection between biomarker and assessed entity, between biomarker and biospecimen, and between biomarker and condition. The biomarker+entity connection is a matter of definition. In some cases, the definition might need to include the biospecimen (so, biomarker+entity+biospecimen). Under no circumstance would it be correct to make a connection between assessed entity+biospecimen outside the context of biomarker. That is to say, we can't say glucose sampled_from blood, because that would be asserting that glucose is found only in blood, which isn't true. We can, however, say 'blood glucose' sampled_from 'blood', and we can say 'blood glucose' is_a 'glucose', and 'glucose' is_a 'chemical entity'. I'll have to ruminate on the gains, losses, and potential complications of incorporating the sample into the biomarker definition.
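The 'blood glucose' example above can be written out as a tiny set of assertions to show why sampled_from is attached to the context-specific term and not to glucose itself. This is a plain-Python sketch, not OWL; the dictionaries and helper functions are hypothetical and only mirror the three statements given in the comment.

```python
# is_a assertions exactly as stated in the discussion (child -> parent).
IS_A = {
    "blood glucose": "glucose",
    "glucose": "chemical entity",
}

# sampled_from is asserted only on the context-specific term.
SAMPLED_FROM = {
    "blood glucose": "blood",
}


def ancestors(term: str) -> list:
    """Follow is_a links upward and return the chain of parent terms."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def sampled_from(term: str):
    """Return the asserted specimen for a term, or None.

    Deliberately not propagated along is_a: 'glucose' must not pick up
    'blood' just because 'blood glucose' is sampled from blood.
    """
    return SAMPLED_FROM.get(term)


print(ancestors("blood glucose"))     # glucose, then chemical entity
print(sampled_from("blood glucose"))  # blood
print(sampled_from("glucose"))        # None: no context, no specimen
```

Keeping the specimen assertion on 'blood glucose' leaves 'glucose' free to appear in other contexts, which is the constraint described above.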
Feb 24, 2023
Dan: Raja,
Raja: Why is specimen type green? I am sure that OpenTarget and GWAS and ClinVar do not have specimens.
Dan: I believe Darren argued that in some (not all) circumstances specimen type is 'core' in distinguishing biomarkers.
March 2, 2023
Dan:
I have updated our existing biomarker ER diagram, highlighting “mandatory” items in the model that form the core definition of a biomarker, based on the FDA/NIH Biomarker Working Group BEST definition (below).
biomarker_er_model_02272023 (1).pptx
BiomarkerDB_Summary.docx