Unlocking Electronic Health Records: A Hybrid Graph RAG Approach to Safe Clinical AI for Patient QA
- URL: http://arxiv.org/abs/2602.00009v1
- Date: Thu, 27 Nov 2025 16:08:22 GMT
- Title: Unlocking Electronic Health Records: A Hybrid Graph RAG Approach to Safe Clinical AI for Patient QA
- Authors: Samuel Thio, Matthew Lewis, Spiros Denaxas, Richard JB Dobson,
- Abstract summary: Large Language Models offer transformative potential for data processing, but face limitations in clinical settings.<n>Current solutions typically isolate retrieval methods focusing on structured data (Text2Cypher) or unstructured semantic search but fail to integrate both simultaneously.<n>This work presents MediGRAF, a novel hybrid Graph RAG system that bridges this gap.
- Score: 1.9615061725959186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electronic health record (EHR) systems present clinicians with vast repositories of clinical information, creating a significant cognitive burden where critical details are easily overlooked. While Large Language Models (LLMs) offer transformative potential for data processing, they face significant limitations in clinical settings, particularly regarding context grounding and hallucinations. Current solutions typically isolate retrieval methods focusing either on structured data (SQL/Cypher) or unstructured semantic search but fail to integrate both simultaneously. This work presents MediGRAF (Medical Graph Retrieval Augmented Framework), a novel hybrid Graph RAG system that bridges this gap. By uniquely combining Neo4j Text2Cypher capabilities for structured relationship traversal with vector embeddings for unstructured narrative retrieval, MediGRAF enables natural language querying of the complete patient journey. Using 10 patients from the MIMIC-IV dataset (generating 5,973 nodes and 5,963 relationships), we generated enough nodes and data for patient level question answering (QA), and we evaluated this architecture across varying query complexities. The system demonstrated 100\% recall for factual queries which means all relevant information was retrieved and in the output, while complex inference tasks achieved a mean expert quality score of 4.25/5 with zero safety violations. These results demonstrate that hybrid graph-grounding significantly advances clinical information retrieval, offering a safer, more comprehensive alternative to standard LLM deployments.
Related papers
- A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis.<n>Most medical LLMs are trained on data from a single institution, which faces limitations in generalizability and safety in heterogeneous systems.<n>We introduce the model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z) - Patient-level Information Extraction by Consistent Integration of Textual and Tabular Evidence with Bayesian Networks [4.993537302467224]
We propose a method for multi-modal patient-level information extraction using an expert-informed Bayesian network.<n>We propose the use of virtual evidence augmented with a consistency node to provide an interpretable, probabilistic fusion of the models' predictions.<n>We show the potential of our method on the SimSUM dataset.
arXiv Detail & Related papers (2025-11-21T08:59:42Z) - OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive [50.468138755368805]
Opioid crisis represents a significant moment in public health.<n>Data and documents disclosed in the UCSF-JHU Opioid Industry Documents Archive (OIDA)<n>In this paper, we tackle this challenge by organizing the original dataset according to document attributes.
arXiv Detail & Related papers (2025-11-13T03:27:32Z) - HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways [12.855316833585908]
HealthBranches is a novel benchmark dataset for medical Question-Answering (Q&A)<n>This dataset is generated through a semi-automated pipeline that transforms explicit decision pathways from medical source into realistic patient cases with associated questions and answers.<n> Covering 4,063 case studies across 17 healthcare topics, each data point is based on clinically validated reasoning chains.
arXiv Detail & Related papers (2025-08-10T11:45:34Z) - HySemRAG: A Hybrid Semantic Retrieval-Augmented Generation Framework for Automated Literature Synthesis and Methodological Gap Analysis [55.2480439325792]
HySemRAG is a framework that combines Extract, Transform, Load (ETL) pipelines with Retrieval-Augmented Generation (RAG)<n>System addresses limitations in existing RAG architectures through a multi-layered approach.
arXiv Detail & Related papers (2025-08-01T20:30:42Z) - Give me Some Hard Questions: Synthetic Data Generation for Clinical QA [13.436187152293515]
This paper explores generating Clinical QA data using large language models (LLMs) in a zero-shot setting.<n>We find that naive prompting often results in easy questions that do not reflect the complexity of clinical scenarios.<n>Experiments on two Clinical QA datasets demonstrate that our method generates more challenging questions, significantly improving fine-tuning performance over baselines.
arXiv Detail & Related papers (2024-12-05T19:35:41Z) - Enhanced Electronic Health Records Text Summarization Using Large Language Models [0.0]
This project builds on prior work by creating a system that generates clinician-preferred, focused summaries.
The proposed system leverages the Flan-T5 model to generate tailored EHR summaries based on clinician-specified topics.
arXiv Detail & Related papers (2024-10-12T19:36:41Z) - emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information [2.2083091880368855]
The emrQA-msquad dataset was developed to address the intricacies of medical terminology.
A dedicated medical dataset for the Span extraction task was introduced, reinforcing the system's robustness.
The fine-tuning of models such as BERT, RoBERTa, and Tiny RoBERTa significantly improved response accuracy within the F1-score range of 0.75 to 1.00.
arXiv Detail & Related papers (2024-04-18T10:06:00Z) - Development and validation of a natural language processing algorithm to
pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z) - DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment
Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment.
DeepEnroll is a cross-modal inference learning model to jointly encode enrollment criteria (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.