Figure 1: AIPatient Structure
Motivation: Traditional medical education faces many challenges, including limited access to diverse clinical experiences, inconsistent training, and the high cost and limited standardization of recruiting volunteers to serve as simulated patients. Integrating new technologies such as Large Language Models (LLMs) can enhance the learning experience and improve training outcomes.
Overview: In this project, we developed AIPatient, an LLM-powered simulated patient based on Electronic Health Records (EHR) data. Leveraging the MIMIC III dataset, which includes over 46k patients (see note 1), we began by extracting relevant medical entities and their relationships to construct a comprehensive knowledge graph (KG). Next, we designed a multi-agent system and proposed the Reasoning RAG framework to accurately represent the information within the KG, minimizing hallucination and keeping answers factually grounded. By incorporating personalities, AIPatient can mimic real-life interactions, responding to questions and presenting symptoms in a manner similar to actual patients. In future iterations, we aim to incorporate an evaluator agent that provides feedback on user performance, potentially enhancing medical training and ultimately improving patient care outcomes.
1: For the current iteration, AIPatient contains 56 unique cases. We plan to scale the patient pool in the future.
Figure 2: AIPatient Multi-Agent
Figure 2 presents the multi-agent system, designed with the Reasoning RAG framework (Retrieval, Reasoning, Generation). In each round, the agents interact to ensure accurate data retrieval and realistic generation. The system also preserves memory across rounds to support multi-round conversations.
Figure 3: Knowledge Graph Construction with Electronic Health Records
Figure 3 provides an example of a KG (right) constructed from EHR data (left). In the current KG, we focus on 12 node types (e.g., Admission, Symptom) and 11 relationships (e.g., HAS_SYMPTOM). The rich and diverse notes data in MIMIC III presents opportunities for mining additional medical entities and relationships.
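To make the node and relationship structure concrete, here is a minimal in-memory sketch of how an Admission node might link to Symptom nodes through HAS_SYMPTOM. It is an illustration only, not the project's KG code: the `Node` and `KnowledgeGraph` classes, the identifier `admission-001`, and the symptom values are hypothetical placeholders and are not taken from MIMIC III.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    label: str  # node type, e.g. "Admission" or "Symptom"
    name: str   # hypothetical identifier or display value


class KnowledgeGraph:
    """Tiny in-memory graph stored as adjacency lists (illustrative only)."""

    def __init__(self) -> None:
        # Maps (source node, relationship name) to a list of target nodes.
        self._edges = defaultdict(list)

    def add_edge(self, source: Node, relationship: str, target: Node) -> None:
        self._edges[(source, relationship)].append(target)

    def neighbors(self, source: Node, relationship: str) -> list:
        # .get avoids creating empty entries for unknown keys.
        return list(self._edges.get((source, relationship), []))


# Placeholder values for illustration; not real patient data.
admission = Node("Admission", "admission-001")
kg = KnowledgeGraph()
kg.add_edge(admission, "HAS_SYMPTOM", Node("Symptom", "chest pain"))
kg.add_edge(admission, "HAS_SYMPTOM", Node("Symptom", "shortness of breath"))

# Look up every symptom attached to this admission.
symptoms = [node.name for node in kg.neighbors(admission, "HAS_SYMPTOM")]
print(symptoms)
```

A query about a patient's symptoms then reduces to following HAS_SYMPTOM edges from the relevant Admission node; a graph database would serve the same purpose at scale.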
Figure 4: AIPatient Interaction Example (single round)
In Figure 4, we present one round of user-AIPatient interaction. Beginning with the user's natural language query, the AIPatient backend engages the various agents and updates the session state accordingly. Finally, the Simulated Patient answers the user's query in natural language.
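The round described above can be sketched as a small pipeline of retrieval, reasoning, and generation that updates a session state. This is a simplified stand-in under stated assumptions: in AIPatient these stages are LLM-backed agents querying the KG, whereas here they are plain functions over a hypothetical `FACTS` dictionary, and names such as `SessionState` and `run_round` are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical session state carried across rounds."""
    history: list = field(default_factory=list)    # (query, answer) pairs
    retrieved: dict = field(default_factory=dict)  # facts surfaced so far


# Placeholder facts standing in for KG content; not real patient data.
FACTS = {"symptom": ["chest pain"], "duration": ["two days"]}


def retrieve(query: str) -> dict:
    """Retrieval: keep facts whose key appears in the query text."""
    lowered = query.lower()
    return {key: values for key, values in FACTS.items() if key in lowered}


def reason(facts: dict) -> bool:
    """Reasoning: decide whether the retrieved facts can support an answer."""
    return bool(facts)


def generate(facts: dict, answerable: bool) -> str:
    """Generation: answer only from retrieved facts, never from guesses."""
    if not answerable:
        return "I'm not sure about that."
    return "; ".join(f"{key}: {', '.join(values)}" for key, values in facts.items())


def run_round(state: SessionState, query: str) -> str:
    """Run one round and record it in the session state."""
    facts = retrieve(query)
    answer = generate(facts, reason(facts))
    state.retrieved.update(facts)
    state.history.append((query, answer))
    return answer


state = SessionState()
print(run_round(state, "What symptom do you have?"))
print(run_round(state, "What is your favorite color?"))
```

Refusing to answer when nothing is retrieved is one simple way to keep responses tied to the record; the session history is what lets later rounds build on earlier ones.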
conda create --name aipatient python=3.9
conda activate aipatient
git clone https://github.com/huiziy/AIPatient.git
The AIPatient interface is built with Streamlit. To run the app locally:
cd AIPatient_Interface
streamlit run AIPatient_Interface.py
The source code of AIPatient is licensed under Apache 2.0. The intended purpose is solely for research use.
AIPatient is powered by Anthropic's Claude 3.5 Sonnet via Amazon Bedrock to conform with the Responsible Data Use Agreement. We confirm that the data is not shared with third parties, whether by sending it through APIs or by using it on online platforms.
This repo contains code for the agents and the QA interface; some data cleaning and knowledge graph creation code is omitted and will be made public after the paper is published.



