HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation
- URL: http://arxiv.org/abs/2412.11070v1
- Date: Sun, 15 Dec 2024 06:04:16 GMT
- Title: HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation
- Authors: Tengfei Liu, Jiapu Wang, Yongli Hu, Mingjie Li, Junfei Yi, Xiaojun Chang, Junbin Gao, Baocai Yin
- Abstract summary: We propose a novel Historical-Constrained Large Language Models (HC-LLM) framework for radiology report generation.
Our approach extracts both time-shared and time-specific features from longitudinal chest X-rays and diagnostic reports to capture disease progression.
Notably, our approach performs well even without historical data during testing and can be easily adapted to other multimodal large models.
- Score: 89.3260120072177
- Abstract: Radiology report generation (RRG) models typically focus on individual exams, often overlooking the integration of historical visual or textual data, which is crucial for patient follow-ups. Traditional methods usually struggle with long sequence dependencies when incorporating historical information, but large language models (LLMs) excel at in-context learning, making them well-suited for analyzing longitudinal medical data. In light of this, we propose a novel Historical-Constrained Large Language Models (HC-LLM) framework for RRG, empowering LLMs with longitudinal report generation capabilities by constraining the consistency and differences between longitudinal images and their corresponding reports. Specifically, our approach extracts both time-shared and time-specific features from longitudinal chest X-rays and diagnostic reports to capture disease progression. Then, we ensure consistent representation by applying intra-modality similarity constraints and aligning various features across modalities with multimodal contrastive and structural constraints. These combined constraints effectively guide the LLMs in generating diagnostic reports that accurately reflect the progression of the disease, achieving state-of-the-art results on the Longitudinal-MIMIC dataset. Notably, our approach performs well even without historical data during testing and can be easily adapted to other multimodal large models, enhancing its versatility.
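For intuition only, the following is a minimal PyTorch sketch of how the three constraint families named in the abstract (intra-modality similarity, multimodal contrastive, structural) might be combined into one training objective. The specific loss forms (cosine similarity, InfoNCE, an MSE "offset matching" term), the function names, and the feature layout are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (batch, dim) features."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def hc_style_loss(img_shared_prev, img_shared_curr, img_spec_curr,
                  txt_shared_prev, txt_shared_curr, txt_spec_curr):
    """Combine the three constraint families described in the abstract.
    All inputs are (batch, dim) features; the decomposition into
    time-shared vs. time-specific parts is assumed to happen upstream."""
    # 1) Intra-modality similarity: time-shared features should stay
    #    consistent between the prior and current exam in each modality.
    l_intra = ((1 - F.cosine_similarity(img_shared_prev, img_shared_curr)).mean()
               + (1 - F.cosine_similarity(txt_shared_prev, txt_shared_curr)).mean())
    # 2) Cross-modal contrastive alignment: match image and report
    #    features of the same kind across modalities.
    l_contrast = (info_nce(img_shared_curr, txt_shared_curr)
                  + info_nce(img_spec_curr, txt_spec_curr))
    # 3) A structural term, here read as requiring the shared-to-specific
    #    "offset" to agree across modalities.
    l_struct = F.mse_loss(img_spec_curr - img_shared_curr,
                          txt_spec_curr - txt_shared_curr)
    return l_intra + l_contrast + l_struct
```

In practice the three terms would carry tunable weights and be added to the LLM's report-generation loss; they are left unweighted here purely for brevity.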
Related papers
- Unlocking Multimodal Integration in EHRs: A Prompt Learning Framework for Language and Time Series Fusion [27.70300880284899]
Large language models (LLMs) have shown remarkable performance in vision-language tasks, but their application in the medical field remains underexplored.
We introduce ProMedTS, a novel self-supervised multimodal framework that employs prompt-guided learning to unify data types.
We evaluate ProMedTS on disease diagnosis tasks using real-world datasets, and the results demonstrate that our method consistently outperforms state-of-the-art approaches.
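ProMedTS itself learns prompts in a self-supervised fashion; as a far simpler stand-in for the general idea of unifying a clinical note and lab time series in one LLM input, a hand-written textual serialization could look like the sketch below. Every field name and the output format are invented for illustration.

```python
def series_to_text(name, values, unit):
    """Serialize one lab time series into a compact textual descriptor."""
    trend = "rising" if values[-1] > values[0] else "stable or falling"
    return (f"{name}: last={values[-1]}{unit}, "
            f"range={min(values)}-{max(values)}{unit}, trend={trend}")

def build_fused_prompt(note, series):
    """Fuse a clinical note and several lab series into a single prompt."""
    labs = "\n".join(series_to_text(n, v, u) for n, v, u in series)
    return f"Patient note:\n{note}\n\nLab summaries:\n{labs}\n\nDiagnosis:"

prompt = build_fused_prompt(
    "Progressive dyspnea over two weeks.",
    [("creatinine", [1.0, 1.3, 1.8], " mg/dL"),
     ("hemoglobin", [13.2, 12.1, 10.9], " g/dL")])
```

ProMedTS's learned prompts replace this kind of hand-written template, which is what makes the approach self-supervised rather than feature-engineered.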
arXiv Detail & Related papers (2025-02-19T07:56:48Z)
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
- HERGen: Elevating Radiology Report Generation with Longitudinal Data [18.370515015160912]
We propose a novel History Enhanced Radiology Report Generation (HERGen) framework to efficiently integrate longitudinal data across patient visits.
Our approach not only allows for comprehensive analysis of varied historical data but also improves the quality of generated reports through an auxiliary contrastive objective.
The extensive evaluations across three datasets demonstrate that our framework surpasses existing methods in generating accurate radiology reports and effectively predicting disease progression from medical images.
arXiv Detail & Related papers (2024-07-21T13:29:16Z)
- EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling [22.94521527609479]
EMERGE is a Retrieval-Augmented Generation (RAG)-driven framework aimed at enhancing multimodal EHR predictive modeling.
Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models.
The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses.
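As a loose illustration of the flow this summary describes (extract entities, retrieve knowledge, summarize), here is a minimal Python sketch. The function names are hypothetical, and `llm` and `retrieve` are assumed callables, not EMERGE's API.

```python
from typing import Callable, List

def emerge_style_pipeline(llm: Callable[[str], str],
                          retrieve: Callable[[List[str]], List[str]],
                          notes: str, ts_findings: str) -> str:
    """RAG-style sketch: extract entities from both data sources,
    retrieve knowledge for them, then summarize the patient's status."""
    # 1) Prompt the LLM to extract clinical entities from both sources.
    entities = llm("List the clinical entities, one per line, in:\n"
                   + notes + "\n" + ts_findings).splitlines()
    entities = [e.strip() for e in entities if e.strip()]
    # 2) Retrieve task-relevant knowledge for the extracted entities.
    snippets = retrieve(entities)
    # 3) Ask the LLM for a task-relevant summary of the health status.
    return llm("Summarize this patient's health status for prediction.\n"
               "Entities: " + ", ".join(entities) + "\n"
               "Knowledge:\n" + "\n".join(snippets))
```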
arXiv Detail & Related papers (2024-05-27T10:53:15Z)
- Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data [7.815738943706123]
Large Language Models (LLMs) are traditionally tailored for natural language processing.
This research investigates the adaptability of LLMs, like GPT-4, to EHR data.
To address the longitudinal, sparse, and knowledge-infused nature of EHR data, our prompting approach explicitly accounts for these characteristics.
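A prompt builder reflecting those three characteristics might look like the sketch below: visits are ordered by time (longitudinal), only observed labs are serialized (sparse), and reference ranges inject domain knowledge. The layout and field names are invented, not the paper's prompt.

```python
def build_ehr_prompt(visits, reference_ranges):
    """Assemble a zero-shot prediction prompt from sparse, longitudinal
    EHR records. Each visit dict holds only the labs actually observed."""
    lines = []
    for v in sorted(visits, key=lambda v: v["date"]):  # longitudinal order
        obs = "; ".join(f"{k}={val}" for k, val in v["labs"].items())
        lines.append(f"{v['date']}: {obs}")
    ranges = "; ".join(f"{k} normal {lo}-{hi}"
                       for k, (lo, hi) in reference_ranges.items())
    return ("Visits (sparse, only observed labs shown):\n" +
            "\n".join(lines) +
            f"\nReference ranges: {ranges}\n"
            "Question: will this patient be readmitted within 30 days? "
            "Answer yes or no with a brief rationale.")
```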
arXiv Detail & Related papers (2024-01-25T20:14:50Z)
- C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network [67.97926983664676]
We propose C^2M-DoT, a cross-modal consistent multi-view medical report generation framework with a domain transfer network.
C^2M-DoT substantially outperforms state-of-the-art baselines on all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
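The core move here, prompting an LLM to emit structured fields that are then parsed into a table, can be sketched in a few lines. The schema and prompt below are invented for illustration; TEMED-LLM's actual fields and prompts may differ.

```python
import json

EXTRACTION_PROMPT = (
    "Extract the following fields from the report as JSON with keys "
    "age, tumor_size_mm, lymph_nodes_positive, margin_status. "
    "Use null for anything not stated.\n\nReport:\n{report}"
)

def report_to_row(llm, report: str) -> dict:
    """Turn one free-text report into a tabular row via an LLM.
    Assumes the model returns well-formed JSON; production code
    would validate and retry on parse failures."""
    raw = llm(EXTRACTION_PROMPT.format(report=report))
    return json.loads(raw)
```

The resulting rows can then feed a transparent downstream model such as logistic regression or a decision tree, which is what makes the diagnostics interpretable.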
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
However, these models typically require substantial amounts of medical text data for training and generalize poorly.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
- Cross-Modal Causal Intervention for Medical Report Generation [109.83549148448469]
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance.
Due to the spurious correlations within image-text data induced by visual and linguistic biases, it is challenging to generate accurate reports reliably describing lesion areas.
We propose a novel Visual-Linguistic Causal Intervention (VLCI) framework for MRG, which consists of a visual deconfounding module (VDM) and a linguistic deconfounding module (LDM).
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
- MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis [10.133715767542386]
We propose a knowledge-driven and data-driven framework for lung disease diagnosis.
We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data.
A multimodal fusion module combining text and image data is designed to infer the marginal probability of lung disease.
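One simple way to pool guideline rules with learned weights into a disease probability is a logistic combination over binary rule activations, as sketched below. Logistic pooling is an assumption here, not necessarily MMLN's formulation; the rule set and weights are invented.

```python
import numpy as np

def rule_marginal(rule_hits: np.ndarray, weights: np.ndarray,
                  bias: float) -> np.ndarray:
    """Marginal disease probability from binary rule activations.
    A guideline rule such as "opacity on X-ray AND fever" fires as 1/0;
    the weights would be learned from text data."""
    return 1.0 / (1.0 + np.exp(-(rule_hits @ weights + bias)))

# Two patients, three rules; the weights favor the first two rules.
p = rule_marginal(np.array([[1, 1, 0], [0, 0, 1]], dtype=float),
                  np.array([1.5, 0.8, 0.3]), bias=-1.0)
```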
arXiv Detail & Related papers (2022-02-09T04:12:30Z)