This repository was archived by the owner on Aug 22, 2023. It is now read-only.

final adjustments for handin
jannis-baum committed Jul 22, 2022
1 parent c691447 commit 442462b
Showing 11 changed files with 149 additions and 155 deletions.
30 changes: 15 additions & 15 deletions chapters/1-introduction.md
@@ -13,10 +13,10 @@ reactions, adoption in real-world treatment of patients has been slow
With the goal of accelerating the adoption of \glsa{pgx} into treatment of
patients, we have worked on the service and research project *PharMe* in a team
of eight students from Hasso Plattner Institute at Prof. Böttinger's chair
Digital Health - Personalized Medicine. The implementation of PharMe I will be
referring to in this thesis can be found in its GitHub repository[^repo].
Digital Health - Personalized Medicine. The implementation of PharMe I refer to
in this thesis can be found in its GitHub repository[^repo].

[^repo]: [https://github.com/hpi-dhc/PharMe/tree/673c469](https://github.com/hpi-dhc/PharMe/tree/673c469)
[^repo]: [https://github.com/hpi-dhc/PharMe/tree/bbb9595](https://github.com/hpi-dhc/PharMe/tree/bbb9595)

PharMe is directed specifically at patients, i.e. people without professional
medical education, and aims to (1) be an educational resource by introducing its
@@ -29,11 +29,11 @@ available data given by multiple sources and display it to its users. The
cumulation and provision of this data is the main task of our backend system,
the Annotation Server.

Aside from data the Annotation Server fetches from preexisting \glspl{api} of
external medical organizations that is directed at field professionals, it needs
to provide guidelines that are comprehensible to patients. These guidelines are
created by \glsa{pgx} experts at the Icahn School of Medicine at Mount Sinai who
manually curate them.
The Annotation Server fetches data that is directed at field professionals from
existing \glspl{api} of external medical organizations. Aside from this data, it
also needs to provide guidelines that are comprehensible to patients. These
guidelines are created by \glsa{pgx} experts at the Icahn School of Medicine at
Mount Sinai who manually curate them.

In the current implementation of the Annotation Server as we have built it in
shared effort, some problems have remained:
@@ -47,14 +47,14 @@ shared effort, some problems have remained:
in turn also need to be matched with the external data.
- There is no infrastructure to provide guidelines in multiple languages.

\noindent To attend these problems, I propose, implement and test a new
component to the PharMe system in this thesis: the Annotation Interface. The
Annotation Interface is a web application directed at \glsa{pgx} experts and
aims to facilitate the process of researching, curating and administering
patient-oriented, multilingual \glsa{pgx} information.
\noindent To address these problems, this thesis researches a new component of
the PharMe system: the Annotation Interface. The Annotation Interface is a web
application directed at \glsa{pgx} experts and aims to facilitate the process of
researching, curating and administering patient-oriented, multilingual
\glsa{pgx} information.

In this thesis, I give an overview of the Annotation Server's implementation.
After explaining the before mentioned problems in more detail, I propose,
In this thesis, I give an overview of the Annotation Server's implementation and
explain the aforementioned problems in more detail. Subsequently, I propose,
implement and test the Annotation Interface as a solution to these problems.
Finally, I discuss future work on administration of information for PharMe based
on the presented findings.
57 changes: 27 additions & 30 deletions chapters/2-annotation-server.md
@@ -6,14 +6,14 @@ consists of two main groups of information:
1. Drugs -- users should be able to search for and find any drug
they want to consult PharMe about; regardless of whether there actually are
any \glsa{pgx} findings for this drug.
2. \Glspl{guideline} - for drugs that do have \glsa{pgx} findings, users should
2. \Glspl{guideline} -- for drugs that do have \glsa{pgx} findings, users should
be presented with
a) simplified information that is comprehensible for them and
b) more detailed and complete information they can show to their doctor.

\noindent To provide information on drugs (1.), we use a database called
\gls{drugbank}, which is one of the world's most used resources for drug
information [@wishart_drugbank_2018]. Using their academic license, we can
information [@wishart_drugbank_2018]. With their academic license, we can
download an XML file from \gls{drugbank} that contains all the information we
use from them. Additionally, \glsa{pgx} experts annotate drugs that have
\gls{pgx} relevance with descriptive texts that are comprehensible for patients.
@@ -32,8 +32,8 @@ implementation is English.
## Technical overview

The Annotation Server is a web application built with NestJS, "a framework for
building efficient, scalable [...] server-side applications" with support for
TypeScript [@kamil_mysliwiec_nestjs_nodate], the programming language the
building efficient, scalable [...] server-side applications [with support for
TypeScript]" [@kamil_mysliwiec_nestjs_nodate], the programming language the
majority of PharMe's backend is built with. For storage of data, the Annotation
Server relies on PostgreSQL, "the world's most advanced open source relational
database" [@the_postgresql_global_development_group_postgresql_2022] and TypeORM
@@ -52,24 +52,22 @@ ER-Diagram of the database structure corresponding with these modules.
database\label{er-diagram}](images/as-database.pdf)

The `medications` module is responsible for all data regarding drugs. It stores
this data by maintaining a drug table in the Annotation Server's database. This
table is initialized by loading and saving all relevant data from the
\gls{drugbank} XML file. This data consists of drug names, descriptions,
synonyms and \glspl{rxcui}, which are unique identifiers given by the National
Library of Medicine's standardized drug nomenclature RxNorm [@liu_rxnorm_2005]
Once the drug table is initialized, the Annotation Server provides GET endpoints
to retrieve information for one or multiple drugs along with the option of
applying filters.
drug data by maintaining a table in the Annotation Server's database. This table
is initialized by loading and saving all relevant data from the \gls{drugbank}
XML file. The saved data consists of drug names, descriptions, synonyms and
\glspl{rxcui}, which are unique identifiers given by the National Library of
Medicine's standardized drug nomenclature RxNorm [@liu_rxnorm_2005]. Once the
drug table is initialized, the Annotation Server provides GET endpoints to
retrieve information for one or multiple drugs along with the option of applying
filters.
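
As a rough illustration of what retrieving drugs with filters could look like, the following TypeScript sketch filters an in-memory list. The `Drug` and `DrugFilter` shapes, their property names and the `filterDrugs` function are assumptions made for this example; the text does not specify the actual endpoint parameters or entity definitions.

```typescript
// Hypothetical record shape: the fields mirror the data named in the text
// (name, description, synonyms, RxCUI); the property names are assumptions.
interface Drug {
  name: string;
  description: string;
  synonyms: string[];
  rxcui: string | null;
}

// Hypothetical filter options; the real endpoint parameters are not given.
interface DrugFilter {
  search?: string; // case-insensitive substring match on name or synonyms
  onlyWithRxcui?: boolean; // keep only drugs that carry an RxCUI
}

// Returns the drugs that satisfy every filter option that is set.
function filterDrugs(drugs: Drug[], filter: DrugFilter): Drug[] {
  const needle = filter.search?.trim().toLowerCase();
  return drugs.filter((drug) => {
    if (filter.onlyWithRxcui && drug.rxcui === null) return false;
    if (!needle) return true; // no search term: nothing more to check
    const candidates = [drug.name, ...drug.synonyms];
    return candidates.some((c) => c.toLowerCase().includes(needle));
  });
}
```

In an actual NestJS controller, options like these would presumably arrive as query parameters of the GET request and be applied in the database query rather than in memory.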

The `phenotypes` module maintains all the phenotypes \gls{cpic} offers
guidelines for in its phenotype table. These phenotypes are defined by a gene
symbol such as *CYP2D6* and the effect variants with this phenotype have on the
gene, i.e. a gene result such as *Normal metabolizer*. Aside from these
identifying properties, some additional data \gls{cpic} provides about
phenotypes is also stored. The `phenotypes` module exposes no dedicated
endpoints as it is only used in relation to the `guidelines` module. The loading
of phenotype data from \gls{cpic}'s \gls{api} is invoked by the `guidelines`
module when it initializes its own data.
identifying properties, some additional \gls{cpic} data about phenotypes is also
stored. The `phenotypes` module exposes no dedicated endpoints as it is only
used in relation to the `guidelines` module.

The `guidelines` module keeps \glspl{guideline} in relation to phenotypes and
drugs in its guidelines table. This data is initialized by loading all of
@@ -92,29 +90,28 @@ experts. These \glspl{annotation} consist of
\gls{implication}'s consequences, and a \gls{warnl}, expressing the severity
of the \gls{recommendation} as one of three tiers, for \glspl{guideline}.

\noindent In the Annotation Server's current implementation, the \glsa{pgx}
experts provide this data through a shared online Google Sheet. On request, the
In the Annotation Server's current implementation, the \glsa{pgx} experts
provide this data through a shared online Google Sheet. On request, the
Annotation Server automatically downloads and processes this Google Sheet to
annotate all data that matches the existing external data. Here, matching data
is determined by the drug names, gene symbols and gene results the \glsa{pgx}
experts manually write into the Google Sheet being equal to the analogous data
from \gls{cpic} and \gls{drugbank} stored on the Annotation Server. Errors
resulting from mismatches are, again, stored as `GuidelineError`s.
annotate all data that matches the existing external data. Here, a data match is
given when the drug names, gene symbols and gene results the \glsa{pgx} experts
manually write into the Google Sheet are equal to the analogous data from
\gls{cpic} and \gls{drugbank} stored on the Annotation Server. Errors resulting
from mismatches are, again, stored as `GuidelineError`s.
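
The matching step described here can be sketched as follows. This is a minimal TypeScript illustration under stated assumptions, not the Annotation Server's actual code: the `SheetRow` and `StoredGuideline` shapes, the `annotateGuidelines` function and the fields of `GuidelineError` are hypothetical. Only the matching rule itself (exact equality of drug name, gene symbol and gene result) is taken from the text.

```typescript
// Hypothetical shape of one row the experts write into the Google Sheet.
interface SheetRow {
  drugName: string;
  geneSymbol: string;
  geneResult: string;
  annotation: string;
}

// Hypothetical shape of a guideline stored from external sources.
interface StoredGuideline {
  drugName: string;
  geneSymbol: string;
  geneResult: string;
  annotation?: string;
}

// Hypothetical error record; the real `GuidelineError` fields are not given.
interface GuidelineError {
  row: SheetRow;
  reason: string;
}

// Builds a lookup key from the three identifying properties.
function matchKey(drugName: string, geneSymbol: string, geneResult: string): string {
  return [drugName, geneSymbol, geneResult].join("\u0000");
}

// Annotates every stored guideline that exactly matches a sheet row and
// collects rows without a match as errors.
function annotateGuidelines(
  rows: SheetRow[],
  guidelines: StoredGuideline[],
): { annotated: StoredGuideline[]; errors: GuidelineError[] } {
  const byKey = new Map<string, StoredGuideline>();
  for (const g of guidelines) {
    byKey.set(matchKey(g.drugName, g.geneSymbol, g.geneResult), g);
  }
  const annotated: StoredGuideline[] = [];
  const errors: GuidelineError[] = [];
  for (const row of rows) {
    const match = byKey.get(matchKey(row.drugName, row.geneSymbol, row.geneResult));
    if (match === undefined) {
      errors.push({ row, reason: "no stored guideline with equal drug name, gene symbol and gene result" });
      continue;
    }
    annotated.push({ ...match, annotation: row.annotation });
  }
  return { annotated, errors };
}
```

Because the comparison is exact string equality, a row that differs from the stored data only in capitalization or whitespace ends up in `errors`. This illustrates why manually typed identifiers are fragile.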

## Data administration process and shortcomings \label{as-eval}

The implementation of the Annotation Server relies on two parties to
initialize its data and keep it up-to-date:

- **A curating party** with sufficient \gls{pgx} expertise to curate
patient-oriented \glspl{annotation} from data they manually research from
sources such as \gls{cpic}. This party manually writes their
\glspl{annotation} into the Google Sheet, initially without any feedback
regarding if and how well they match the Annotation Server's existing data
from external sources.
patient-oriented \glspl{annotation} from data they research from sources such
as \gls{cpic}. This party manually writes their \glspl{annotation} into the
Google Sheet, initially without any feedback regarding if and how well they
match the Annotation Server's existing data from external sources.
- **A maintaining party** with sufficient technical knowledge to invoke the
requests to fetch data from external sources and the Google
Sheet. This party also oversees the before mentioned `GuidelineErrors` and
  Sheet. This party also oversees the aforementioned `GuidelineError`s and
acts accordingly, which usually results in notifying the curating party to
make necessary adjustments.

12 changes: 6 additions & 6 deletions chapters/3-annotation-interface.md
@@ -4,13 +4,13 @@ This chapter describes the conceptualization, implementation, and expert testing
of a new component of the PharMe system: the Annotation Interface. The
Annotation Interface aims to solve the issues discussed in Chapter \ref{as-eval}
by improving the overall experience and efficiency of the process of
researching, curating \glspl{annotation} and maintaining the Annotation Server's
data.
researching and curating \glspl{annotation} and maintaining the Annotation
Server's data.

The core idea of the Annotation Interface is to give full control over data to
the curating party, eliminating the communication overhead and the need for a
second party involved in maintenance of data. On top of this, the Annotation
Interface conceptualizes and partly implements automation into the curating
party's research and curation process to further increase efficiency. Finally,
it implements an approach towards operating PharMe with support for multiple
second party involved in data maintenance. Moreover, the Annotation Interface
conceptualizes and partly implements automation into the curating party's
research and curation process to further increase efficiency. Finally, it
implements an approach towards operating PharMe with support for multiple
languages by modularizing \glspl{annotation}.
4 changes: 2 additions & 2 deletions chapters/3_1-concept.md
@@ -11,7 +11,7 @@ that are used to create and maintain \glspl{annotation}.
The second major feature of the concept for the Annotation Interface revolves
around modularizing \glspl{annotation}. This modularization is achieved by
strictly limiting the creation of \glspl{annotation} to combinations of
\glspl{brick}: predefined textual prototypes or templates that adjust to the
\glspl{brick}: predefined textual components or templates that adjust to the
\gls{annotation} they are used in. One such \gls{brick} might be

> `#drug.name` may not be the right medication for you.
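
To make the idea of a template that adjusts to its annotation concrete, the following TypeScript sketch resolves placeholders of the form `#entity.field` against a context object. The `BrickContext` type and the functions `resolveBrick` and `composeAnnotation` are hypothetical names introduced for this illustration. Only the `#drug.name` placeholder appears in the text, and the general placeholder syntax is an assumption extrapolated from it.

```typescript
// Hypothetical context: maps an entity name (e.g. "drug") to its fields.
type BrickContext = Record<string, Record<string, string>>;

// Replaces every `#entity.field` placeholder with the matching context value.
// Placeholders without a value are left untouched so gaps stay visible.
function resolveBrick(template: string, context: BrickContext): string {
  return template.replace(
    /#(\w+)\.(\w+)/g,
    (placeholder: string, entity: string, field: string): string => {
      const value: string | undefined = context[entity]?.[field];
      return value === undefined ? placeholder : value;
    },
  );
}

// An annotation is a combination of bricks: resolve each, then join them.
function composeAnnotation(bricks: string[], context: BrickContext): string {
  return bricks.map((brick) => resolveBrick(brick, context)).join(" ");
}
```

With a context of `{ drug: { name: "X" } }`, resolving the brick above yields "X may not be the right medication for you.". Since an annotation only stores which bricks it combines, supporting another language would amount to translating the bricks rather than every annotation.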
@@ -33,5 +33,5 @@ clicks on the first \gls{brick} to get a visual explanation of why it was
suggested.

![Conceptualized suggestion of \glspl{brick} based on \gls{cpic} guideline
(left) and visual explanation of why on of the \glspl{brick} was suggested
(left) and visual explanation of why one of the \glspl{brick} was suggested
(right) \label{nlp-mockup}](images/nlp-mockup.pdf)