Related papers: LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation

LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation

URL: http://arxiv.org/abs/2503.13281v3
Date: Mon, 24 Mar 2025 19:32:25 GMT
Title: LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation
Authors: Xiaodi Li, Shaika Chowdhury, Chung Il Wi, Maria Vassilaki, Xiaoke Liu, Terence T Sio, Owen Garrick, Young J Juhn, James R Cerhan, Cui Tao, Nansu Zong,
Abstract summary: Patient matching is the process of linking patients to appropriate clinical trials by accurately identifying and matching their medical records with trial eligibility criteria.<n>We propose LLM-Match, a novel framework for patient matching leveraging fine-tuned open-source large language models.<n>We evaluated it on four open datasets - n2c2, SIGIR, TREC 2021, and TREC 2022 - using open-source models, comparing it against TrialGPT, Zero-Shot, and GPT-4-based closed models.
Score: 6.4073053466465835
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Patient matching is the process of linking patients to appropriate clinical trials by accurately identifying and matching their medical records with trial eligibility criteria. We propose LLM-Match, a novel framework for patient matching leveraging fine-tuned open-source large language models. Our approach consists of four key components. First, a retrieval-augmented generation (RAG) module extracts relevant patient context from a vast pool of electronic health records (EHRs). Second, a prompt generation module constructs input prompts by integrating trial eligibility criteria (both inclusion and exclusion criteria), patient context, and system instructions. Third, a fine-tuning module with a classification head optimizes the model parameters using structured prompts and ground-truth labels. Fourth, an evaluation module assesses the fine-tuned model's performance on the testing datasets. We evaluated LLM-Match on four open datasets - n2c2, SIGIR, TREC 2021, and TREC 2022 - using open-source models, comparing it against TrialGPT, Zero-Shot, and GPT-4-based closed models. LLM-Match outperformed all baselines.

Related papers

Leveraging Language Models for Automated Patient Record Linkage [0.5461938536945723]
This study investigates the feasibility of leveraging language models for automated patient record linkage. We utilize real-world healthcare data from the Missouri Cancer Registry and Research Center.
arXiv Detail & Related papers (2025-04-21T17:41:15Z)
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs [1.8524621910043437]
Large Language Models (LLMs) enable semantic question answering (QA) over medical data.<n> ensuring privacy and compliance requires edge and private deployments of LLMs.<n>We evaluate privately hosted, fine-tuned LLMs against benchmark models such as GPT-4 and GPT-4o.
arXiv Detail & Related papers (2025-01-23T14:13:56Z)
LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts [14.72366043711941]
Current radiology report generation models are constrained to a fixed task paradigm.<n>We propose a novel large language model (LLM) based RRG framework, namely LLM-RG4.<n>We show that our model has minimal input-agnostic hallucinations, whereas current open-source models commonly suffer from this problem.
arXiv Detail & Related papers (2024-12-16T17:29:51Z)
CLINICSUM: Utilizing Language Models for Generating Clinical Summaries from Patient-Doctor Conversations [2.77462589810782]
ClinicSum is a framework designed to automatically generate clinical summaries from patient-doctor conversations.<n>It is evaluated through both automatic metrics (e.g., ROUGE, BERTScore) and expert human assessments.
arXiv Detail & Related papers (2024-12-05T15:34:02Z)
Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, comprised of candidate generation, refinement and confidence scoring. Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations. Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
arXiv Detail & Related papers (2024-10-31T16:34:03Z)
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic Large Language Models scaling guideline.<n>We benchmark existing scaling techniques, especially selective merging, and variants of mixture.<n>We then formulate an optimal strategy for the selection and aggregation of a heterogeneous model zoo.<n>Our methodology involves the clustering of mergeable models and optimal merging strategy selection, and the integration of clusters.
arXiv Detail & Related papers (2024-10-07T15:55:55Z)
Development and Validation of a Dynamic-Template-Constrained Large Language Model for Generating Fully-Structured Radiology Reports [9.504087246178221]
Current LLMs for creating fully-structured reports face the challenges of formatting errors, content hallucinations, and privacy leakage issues when uploading data to external servers. We aim to develop an open-source, accurate LLM for creating fully-structured and standardized LCS reports from varying free-text reports across institutions.
arXiv Detail & Related papers (2024-09-26T21:59:11Z)
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA) We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity. We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z)
Distilling Large Language Models for Matching Patients to Clinical Trials [3.4068841624198942]
The recent success of large language models (LLMs) has paved the way for their adoption in the high-stakes domain of healthcare. This study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5, and GPT-4) and open-source LLMs (LLAMA 7B,13B, and 70B) for the task of patient-trial matching. Our findings reveal that open-source LLMs, when fine-tuned on this limited and synthetic dataset, demonstrate performance parity with their proprietary counterparts.
arXiv Detail & Related papers (2023-12-15T17:11:07Z)
ExtractGPT: Exploring the Potential of Large Language Models for Product Attribute Value Extraction [52.14681890859275]
E-commerce platforms require structured product data in the form of attribute-value pairs. BERT-based extraction methods require large amounts of task-specific training data. This paper explores using large language models (LLMs) as a more training-data efficient and robust alternative.
arXiv Detail & Related papers (2023-10-19T07:39:00Z)
An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians. Recent studies have achieved promising results in automatic impression generation using large-scale medical text data. These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously. We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework. The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment. DeepEnroll is a cross-modal inference learning model to jointly encode enrollment criteria (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.