Biomarker Knowledge Graph on AWS #149

Open
DaniallMasood opened this issue Jul 1, 2024 · 4 comments

@DaniallMasood
Member

Documentation for setting up instances of the KG on AWS: https://github.com/clinical-biomarkers/biomarker-partnership/tree/main/supplementary_files/documentation

Goal: Create knowledge graphs and Cypher queries based on biological questions and use cases.

@rykahsay

Specific Aim 1: Create a capable and secure API interface for executing queries on the DD KG and accessing their results. Develop API endpoints that address the most common use cases. Test the API's support for queries and data extraction for specific use cases.

Specific Aim 2: Develop and implement a community-accessible website to query data in the DD graph that supports selected use cases from the DCCs. Tune the DD database to be ready for website data delivery, including importing new datasets to support website-driven use cases. This will include ingestion of necessary supporting data to extend the current KG functionality.

Specific Aim 3: Create algorithms and protocols to show how machine learning can be applied to the KG database, including link prediction, community detection, and knowledge cross-validation.
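
To make the link prediction part of Aim 3 concrete, here is a minimal sketch of one simple baseline: scoring unconnected node pairs by the Jaccard overlap of their neighbors. The edge list is a made-up toy example, not data from the DD KG, and the function names are hypothetical. A real protocol would pull edges from the graph and would likely use a stronger method.

```python
from itertools import combinations

# Toy, made-up edges for illustration only; NOT data from the DD KG.
EDGES = [
    ("geneA", "disease1"),
    ("geneA", "disease2"),
    ("geneB", "disease1"),
    ("geneB", "disease2"),
    ("geneC", "disease2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbor nodes."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def jaccard_scores(adjacency):
    """Score every unconnected node pair by neighbor overlap (Jaccard index)."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, nothing to predict
        union = adjacency[u] | adjacency[v]
        if union:
            scores[(u, v)] = len(adjacency[u] & adjacency[v]) / len(union)
    return scores


adjacency = build_adjacency(EDGES)
scores = jaccard_scores(adjacency)
# Highest-scoring pairs are the candidate links.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In a gene-disease graph like this toy one, neighbor overlap mostly surfaces similar genes or similar diseases, so it works better as a similarity signal than as a direct gene-disease predictor.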

@jeet-vora

The above aims are good, but they are for the future. Right now, Raja wants you to:

  • Explore the Knowledge Graph already installed on AWS and see how it works (Miguel has installed it - Documentation)
  • From the existing data in the knowledge graph, come up with some use cases and new queries
  • Evaluate whether we can implement a knowledge graph in GlyGen and how it compares with the supersearch. Can we also support Cypher queries, similar to SPARQL?
  • Avi's LINCS team is working on an interface that can be implemented in Biomarker

Below is the email sent by Raja:

Hi Robel,
Jeet is coordinating our overall involvement with Data Distillery, which is a Knowledge Graph (KG) project. We have downloaded the KG, and we have tutorials on how to query it using the Cypher query language. The KG is accessible from AWS. We need the following, where we can use your help/input:

  • devise some queries and outputs that can be included in our biomarker paper or a GlyGen paper (Daniall is leading this effort with help from others)
  • integrate the KG app from Avi into the GlyGen or biomarker interface (Sujeet and Sean are leading this)

Your work on API structure and data sites has already been very helpful. Once you have some familiarity with the KG, maybe we can meet. The project ends soon, so we would like to have some work done by the end of August.

Jeet and Daniall - can you please meet with Robel and discuss? Also, send Robel the proposal.

@MiguelMazumder
Contributor

MiguelMazumder commented Jul 23, 2024

The Knowledge Graph has been set up on the AWS server. Running the Docker container requires docker group access and a VPN connection. Navigate to the data/KnowledgeGraph directory and run bash ./run_container.sh. The Neo4j user interface will then be available at aws.glygen.org/neo4j. I will create more detailed documentation about this process.
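
Once the container is running, the graph could also be reached from a script instead of the browser UI. The following is a rough sketch using the official `neo4j` Python driver. The Bolt URI, the environment variable names, and the credentials are all assumptions, since this thread does not say which port or account the container exposes. It only lists node labels and relationship types, which is a sensible first step before writing any use-case queries.

```python
import os

# Built-in Neo4j procedures that list what the graph contains.
SCHEMA_QUERIES = {
    "labels": "CALL db.labels()",
    "relationship_types": "CALL db.relationshipTypes()",
}


def connection_settings(env=os.environ):
    """Read connection details from environment variables.

    The variable names and the default URI are hypothetical placeholders;
    replace them with whatever the AWS container actually exposes.
    """
    return {
        "uri": env.get("KG_BOLT_URI", "bolt://localhost:7687"),
        "user": env.get("KG_USER", "neo4j"),
        "password": env.get("KG_PASSWORD", ""),
    }


def inspect_schema(settings):
    """Return the node labels and relationship types present in the graph."""
    # Third-party dependency, imported lazily: pip install neo4j
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(
        settings["uri"], auth=(settings["user"], settings["password"])
    )
    try:
        driver.verify_connectivity()
        with driver.session() as session:
            return {
                name: [record[0] for record in session.run(query)]
                for name, query in SCHEMA_QUERIES.items()
            }
    finally:
        driver.close()
```

With the VPN connected and the variables set, `inspect_schema(connection_settings())` should return the labels and relationship types to build queries against.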

Use case and query ideas:

  1. Gene-Disease Associations
    Query: Find all diseases associated with a particular gene.

  2. Protein-Protein Interactions
    Query: Find all proteins that interact with a particular protein.

  3. Disease-Chemical Compound Relationships
    Query: Find all chemical compounds associated with a particular disease.

  4. Hierarchical Relationships within Body Structures
    Query: Find all body structures that are subparts of a particular structure.

  5. Gene-Gene Product Relationships
    Query: Find all gene products produced by a particular gene.

  6. Body Structure-Disease Relationships
    Query: Find diseases that affect specific body structures.

  7. Synonym Relationships (Cross-references)
    Query: Find all equivalent identifiers for a specific entity.

  8. Multi-domain Relationships
    Query: Find relationships that span multiple domains, such as genes in one ontology linked to diseases in another.

  9. Entity-Annotation Relationships
    Query: Find all annotations associated with a particular entity.

  10. Biomarker-Drug Relationships (@DaniallMasood)
    * Applicable if IDG has any specific information on entity-drug relationships.
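
A few of the ideas above can be sketched as parameterized Cypher templates. Every node label, relationship type, and property name below (`Gene`, `ASSOCIATED_WITH`, `symbol`, and so on) is a hypothetical placeholder. The actual KG schema is not described in this thread, so the names must be replaced with whatever the graph really uses. The small Python helper only checks that the required parameters are supplied. It does not run anything against the database.

```python
import re

# Hypothetical schema: labels, relationship types and properties are
# placeholders, NOT the actual names used in the KG.
QUERY_TEMPLATES = {
    # Idea 1: diseases associated with a gene
    "gene_disease": (
        "MATCH (g:Gene {symbol: $symbol})-[:ASSOCIATED_WITH]->(d:Disease) "
        "RETURN DISTINCT d.name AS disease"
    ),
    # Idea 2: proteins interacting with a protein (direction ignored)
    "protein_interactions": (
        "MATCH (p:Protein {id: $protein_id})-[:INTERACTS_WITH]-(other:Protein) "
        "RETURN DISTINCT other.id AS partner"
    ),
    # Idea 4: all direct and indirect subparts of a body structure
    "body_structure_subparts": (
        "MATCH (part:BodyStructure)-[:PART_OF*1..]->"
        "(whole:BodyStructure {name: $name}) "
        "RETURN DISTINCT part.name AS part"
    ),
}


def build_query(use_case, **params):
    """Return (cypher_text, parameters) after checking required parameters."""
    template = QUERY_TEMPLATES[use_case]
    required = set(re.findall(r"\$(\w+)", template))
    missing = required - set(params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    # Pass only the parameters the template actually uses.
    return template, {name: params[name] for name in required}


query, parameters = build_query("gene_disease", symbol="BRCA1")
```

Passing values as `$parameters` rather than formatting them into the query string avoids quoting problems and lets Neo4j reuse the query plan. The resulting pair can be handed to a driver session as `session.run(query, parameters)`.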

@jeet-vora jeet-vora added the DD label Aug 2, 2024