Similar Patient Discovery #116

inodb · 2024-03-26T13:05:43Z

Background:
cBioPortal is an open-source platform that provides a web interface for exploring, visualizing, and analyzing cancer genomics data, and it is widely used by researchers and clinicians worldwide. The current interface provides comprehensive tools for exploring individual patient data, including mutations, copy number variations, and clinical information, as well as tools for cohort exploration, analytics, and cohort comparison. A user can find similar patients by using the interface to look for patients that, for example, have the same cancer type, have similar mutations, or received the same treatment. However, cBioPortal does not currently propose similar patients automatically; finding them requires many manual steps. Here, we propose to develop a new web service that recommends similar patients for a user to explore, given a patient's molecular and clinical profile. In oncology, where genetic mutations and biomarkers play critical roles in determining the most effective treatments, the ability to easily find and compare similar patient cases is invaluable. A patient similarity function within cBioPortal would also help users make more effective use of the large amount of data available in the portal: with similarity search built in, users could identify cohorts of patients based on specific criteria, compare their genomic landscapes, and analyze their treatment outcomes.

Goal:
Develop a REST API that provides patient similarity information given a patient's molecular and clinical profile. For the similarity scoring we will use an existing algorithm.
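
The issue does not name the existing algorithm, so the following is only an illustrative stand-in for what "similarity scoring" could look like. It assumes each patient is reduced to a set of mutated genes and uses Jaccard similarity; the patient IDs, gene sets, and the `rank_similar` helper are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0  # two empty profiles carry no evidence of similarity
    return len(a & b) / len(a | b)


def rank_similar(
    query_id: str, profiles: Dict[str, Set[str]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k other patients, ordered by descending similarity."""
    query = profiles[query_id]
    scored = [
        (pid, jaccard(query, genes))
        for pid, genes in profiles.items()
        if pid != query_id
    ]
    # Sort by score (highest first); break ties by patient ID for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical toy data: patient ID -> set of mutated genes.
PROFILES: Dict[str, Set[str]] = {
    "P1": {"TP53", "KRAS", "EGFR"},
    "P2": {"TP53", "KRAS"},
    "P3": {"BRAF"},
    "P4": {"TP53", "EGFR", "PIK3CA"},
}

if __name__ == "__main__":
    for patient_id, score in rank_similar("P1", PROFILES):
        print(f"{patient_id}\t{score:.2f}")
```

A real model would also need to account for clinical attributes such as cancer type and treatment, which this sketch ignores.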

Approach:
We will develop a backend web service for an existing Python-based algorithm that generates a model for identifying similar patients. This web service will provide a RESTful API that lets the cBioPortal frontend communicate with the patient similarity model. The endpoints will be designed to handle real-time data exchange, using JSON as the data format. To keep the patient similarity model up to date whenever new cBioPortal data is added to the system, we propose to use event-driven triggers: when new data enters the system, we rerun the pipeline to regenerate the model and redeploy the backend web service. Whenever a user visits the frontend page, it will use this newly deployed backend web service. This ensures that the frontend displays the most current data when users explore patient similarities. Additionally, to maintain system efficiency and prevent overload, it is crucial to tune the payload size and update frequency to user interaction and system capabilities.
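
As a rough sketch of the kind of endpoint described above, the following uses only the Python standard library (a plain WSGI callable) to return similar patients as JSON. The route `/similar-patients/<id>`, the response field names, and the in-memory `MODEL` dictionary are assumptions made for illustration; the issue does not specify the API shape or the framework.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical precomputed model output: patient ID -> ranked similar patients.
# In the proposed design this would be regenerated when new data arrives.
MODEL: Dict[str, List[dict]] = {
    "P1": [
        {"patientId": "P2", "score": 0.67},
        {"patientId": "P4", "score": 0.5},
    ],
}

ROUTE_PREFIX = "/similar-patients/"  # assumed route, not from the issue


def app(environ, start_response):
    """WSGI application serving GET /similar-patients/<patient_id> as JSON."""
    path = environ.get("PATH_INFO", "")
    patient_id = path[len(ROUTE_PREFIX):] if path.startswith(ROUTE_PREFIX) else ""

    if environ.get("REQUEST_METHOD") != "GET" or not patient_id:
        status, payload = "404 Not Found", {"error": "unknown route"}
    elif patient_id not in MODEL:
        status, payload = "404 Not Found", {"error": f"no entry for {patient_id}"}
    else:
        status = "200 OK"
        payload = {"patientId": patient_id, "similar": MODEL[patient_id]}

    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call(path: str) -> Tuple[str, dict]:
    """Invoke the app in-process (no socket) and return (status, parsed JSON)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)
```

For local experimentation, `app` could be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()`; a production service would more likely use a dedicated web framework and server.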

Needed skills:
Understanding of RESTful APIs, familiarity with Python

Possible mentors:
@Thahmina

domgor11 · 2024-03-27T13:04:35Z

Hi @inodb,
I'm a backend software engineer with 2 years of experience, and I will be starting a master's in Health Data Science this fall.
I have experience building cloud backend services with APIs that focus on processing large volumes of incoming data.
The described approach sounds fairly straightforward to me, and I'm confident that I can build this backend service.

LinkedIn: https://www.linkedin.com/in/dominika-gorgosz
Email: [email protected]

It would be great to have a call, or discuss your precise expectations over email.

DininduChamikara · 2024-03-29T16:11:13Z

Hi @inodb,
I am a fresh graduate and would like to contribute to this project as a participant in GSoC 2024. I have worked on some Natural Language Processing tasks, including work with clustering models. I would like to clarify some details about this project.
* Is this approach based on clustering or classification?
* "We will develop a backend web service for an existing Python-based algorithm that generates a model for identifying similar patients." What does this mean? Does it mean that there is an existing algorithm and no need to implement it?
* Do we need to complete some machine learning part here?
* If not, is the only part to implement the following: when a patient enters some data, find the relevant cluster they are in and update the data set with the newly entered data?

Email - [email protected]

mdkintu · 2024-03-29T23:46:33Z

Hi @inodb,
where do I submit the proposal?

inodb added the labels Python, Size: Large (350h), Difficulty: Hard, GSoC-2024, GSoC 2024 Candidate Projects on Mar 26, 2024
