LLaMA-XR: A Novel Framework for Radiology Report Generation using LLaMA and QLoRA Fine Tuning
- URL: http://arxiv.org/abs/2506.03178v1
- Date: Thu, 29 May 2025 12:21:18 GMT
- Title: LLaMA-XR: A Novel Framework for Radiology Report Generation using LLaMA and QLoRA Fine Tuning
- Authors: Md. Zihad Bin Jahangir, Muhammad Ashad Kabir, Sumaiya Akter, Israt Jahan, Minh Chau
- Abstract summary: We present LLaMA-XR, a novel framework that integrates LLaMA 3.1 with DenseNet-121-based image embeddings and Quantized Low-Rank Adaptation (QLoRA) fine-tuning. LLaMA-XR achieves improved coherence and clinical accuracy while maintaining computational efficiency.
- Score: 0.807790317232093
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated radiology report generation holds significant potential to reduce radiologists' workload and enhance diagnostic accuracy. However, generating precise and clinically meaningful reports from chest radiographs remains challenging due to the complexity of medical language and the need for contextual understanding. Existing models often struggle with maintaining both accuracy and contextual relevance. In this paper, we present LLaMA-XR, a novel framework that integrates LLaMA 3.1 with DenseNet-121-based image embeddings and Quantized Low-Rank Adaptation (QLoRA) fine-tuning. LLaMA-XR achieves improved coherence and clinical accuracy while maintaining computational efficiency. This efficiency is driven by an optimization strategy that enhances parameter utilization and reduces memory overhead, enabling faster report generation with lower computational resource demands. Extensive experiments conducted on the IU X-ray benchmark dataset demonstrate that LLaMA-XR outperforms a range of state-of-the-art methods. Our model achieves a ROUGE-L score of 0.433 and a METEOR score of 0.336, establishing new performance benchmarks in the domain. These results underscore LLaMA-XR's potential as an effective and efficient AI system for automated radiology reporting, offering enhanced clinical utility and reliability.
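The ROUGE-L score quoted in the abstract measures overlap between a generated report and a reference report through their longest common subsequence (LCS) of tokens. The following is a minimal sketch of that idea, not the paper's evaluation code: it assumes whitespace tokenization, lowercasing, and the balanced F1 variant, and the two sentences are hypothetical examples rather than IU X-ray data. Published evaluations typically rely on an established toolkit, whose tokenization and F-measure weighting may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programming, keeping only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as a balanced F1 over LCS-based precision and recall."""
    # Assumption: simple lowercase whitespace tokenization.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of generated tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical report sentences, not taken from the IU X-ray dataset.
generated = "the heart size is normal and the lungs are clear"
reference = "heart size is normal the lungs are clear"
print(round(rouge_l_f1(generated, reference), 3))
```

Because the score depends on word order as well as word choice, a report that lists the right findings in a different sequence scores lower than one that follows the reference phrasing.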
Related papers
- Revolutionizing Radiology Workflow with Factual and Efficient CXR Report Generation [0.0]
This paper introduces CXR-PathFinder, a novel Large Language Model (LLM)-centric foundation model specifically engineered for automated chest X-ray (CXR) report generation. We propose a unique training paradigm, Clinician-Guided Adversarial Fine-Tuning (CGAFT), which meticulously integrates expert clinical feedback into an adversarial learning framework. Our experiments demonstrate that CXR-PathFinder significantly outperforms existing state-of-the-art medical vision-language models across various quantitative metrics.
arXiv Detail & Related papers (2025-06-01T18:47:49Z)
- Look & Mark: Leveraging Radiologist Eye Fixations and Bounding boxes in Multimodal Large Language Models for Chest X-ray Report Generation [2.821158017021184]
Look & Mark (L&M) is a novel grounding fixation strategy that integrates radiologist eye fixations (Look) and bounding box annotations (Mark). General-purpose models also benefit from L&M combined with in-context learning, with LLaVA-OV achieving an 87.3% clinical average performance (C.AVG), the highest among all models.
arXiv Detail & Related papers (2025-05-28T10:54:40Z)
- ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification [57.22053411719822]
ChestX-Reasoner is a radiology diagnosis MLLM designed to leverage process supervision mined directly from clinical reports. Our two-stage training framework combines supervised fine-tuning and reinforcement learning guided by process rewards to better align model reasoning with clinical standards.
arXiv Detail & Related papers (2025-04-29T16:48:23Z)
- A Cascaded Dilated Convolution Approach for Mpox Lesion Classification [0.0]
Mpox virus presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Deep learning-based approaches for skin lesion classification offer a promising alternative. This study introduces the Cascaded Atrous Group Attention framework to address these challenges.
arXiv Detail & Related papers (2024-12-13T12:47:30Z)
- LSM-YOLO: A Compact and Effective ROI Detector for Medical Detection [8.812471041082105]
We propose a novel model named Lightweight Shunt Matching-YOLO (LSM-YOLO), with Lightweight Adaptive Extraction (LAE) and Multipath Shunt Feature Matching (MSFM).
Experimental results demonstrate that LSM-YOLO achieves 48.6% AP on a private dataset of pancreatic tumors, 65.1% AP on the BCCD blood cell detection public dataset, and 73.0% AP on the Br35h brain tumor detection public dataset.
arXiv Detail & Related papers (2024-08-26T08:16:58Z)
- DALL-M: Context-Aware Clinical Data Augmentation with LLMs [13.827368628263997]
We introduce DALL-M, a framework that enhances clinical datasets by generating contextual synthetic data. It integrates structured patient data with contextual knowledge extracted from radiology reports and domain-specific resources. Using large language models (LLMs), it generates both contextual synthetic values for existing clinical features and entirely new, clinically relevant features.
arXiv Detail & Related papers (2024-07-11T07:01:50Z)
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
- Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning [16.849933628738277]
Radiology report generation (RRG) has attracted significant attention due to its potential to reduce the workload of radiologists.
This paper introduces a novel RRG method, LM-RRG, that integrates large models (LMs) with clinical quality reinforcement learning.
Experiments on the MIMIC-CXR and IU-Xray datasets demonstrate the superiority of our method over the state of the art.
arXiv Detail & Related papers (2024-03-11T13:47:11Z)
- End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding [47.360760580820966]
We present RO-LMM, a comprehensive large multimodal model (LMM) tailored for the field of radiation oncology. This model effectively manages a series of tasks within the clinical workflow, including clinical context summarization, radiation treatment plan suggestion, and plan-guided target volume segmentation. We present a novel Consistency Embedding Fine-Tuning (CEFTune) technique, which boosts LMM's robustness to noisy inputs while preserving the consistency of handling clean inputs.
arXiv Detail & Related papers (2023-11-27T14:49:06Z)
- LLM-driven Multimodal Target Volume Contouring in Radiation Oncology [46.23891509553877]
Large language models (LLMs) can facilitate the integration of textual information and images.
We present a novel LLM-driven multimodal AI, namely LLMSeg, that is applicable to the challenging task of target volume contouring for radiation therapy.
We demonstrate that the proposed model exhibits markedly improved performance compared to conventional unimodal AI models.
arXiv Detail & Related papers (2023-11-03T13:38:42Z)
- Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
A brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis.
The hierarchical transformers in the generator are designed to estimate the noise at multiple scales.
Evaluations on the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z)
- EMT-NET: Efficient multitask network for computer-aided diagnosis of breast cancer [58.720142291102135]
We propose an efficient and lightweight learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification are 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z)
- Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Magnetoencephalography (MEG) and electroencephalography (EEG) are promising non-invasive techniques.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.