Related papers: CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation

CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation

URL: http://arxiv.org/abs/2601.15408v1
Date: Wed, 21 Jan 2026 19:19:41 GMT
Title: CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation
Authors: Pablo Messina, Andrés Villa, Juan León Alcázar, Karen Sánchez, Carlos Hinojosa, Denis Parra, Álvaro Soto, Bernard Ghanem,
Abstract summary: CURE is an error-aware curriculum learning framework for medical vision-language models.<n>It fine-tunes a multimodal instructional model on phrase grounding, grounded report generation, and anatomy-grounded report generation.<n>CURE improves grounding accuracy by +0.37 IoU, boosts report quality by +0.188 CXRFEScore, and reduces hallucinations by 18.6%.
Score: 46.0800756149113
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Medical vision-language models can automate the generation of radiology reports but struggle with accurate visual grounding and factual consistency. Existing models often misalign textual findings with visual evidence, leading to unreliable or weakly grounded predictions. We present CURE, an error-aware curriculum learning framework that improves grounding and report quality without any additional data. CURE fine-tunes a multimodal instructional model on phrase grounding, grounded report generation, and anatomy-grounded report generation using public datasets. The method dynamically adjusts sampling based on model performance, emphasizing harder samples to improve spatial and textual alignment. CURE improves grounding accuracy by +0.37 IoU, boosts report quality by +0.188 CXRFEScore, and reduces hallucinations by 18.6%. CURE is a data-efficient framework that enhances both grounding accuracy and report reliability. Code is available at https://github.com/PabloMessina/CURE and model weights at https://huggingface.co/pamessina/medgemma-4b-it-cure

Related papers

Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports [6.758140018768657]
We present a novel phrase-grounded fact-checking model (FC model) that detects errors in findings and their indicated locations in automatically generated chest radiology reports.<n>We simulate the errors in reports through a large synthetic dataset derived by perturbing findings and their locations in ground truth reports to form real and fake findings-location pairs with images.
arXiv Detail & Related papers (2025-09-20T04:33:44Z)
Leveraging Language Models for Automated Patient Record Linkage [0.5461938536945723]
This study investigates the feasibility of leveraging language models for automated patient record linkage.<n>We utilize real-world healthcare data from the Missouri Cancer Registry and Research Center.
arXiv Detail & Related papers (2025-04-21T17:41:15Z)
RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment [10.67889367763112]
RadAlign is a novel framework that combines the predictive accuracy of vision-language models with the reasoning capabilities of large language models.<n>Our framework maintains strong clinical interpretability while reducing hallucinations, advancing automated medical imaging and report analysis through integrated predictive and generative AI.
arXiv Detail & Related papers (2025-01-13T17:55:32Z)
Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation [20.173287130474797]
generative medical Vision Large Language Models (VLLMs) are prone to hallucinations and can produce inaccurate diagnostic information.<n>We introduce a novel Semantic Consistency-Based Uncertainty Quantification framework that provides both report-level and sentence-level uncertainties.<n>Our approach improves factuality scores by $10$%, achieved by rejecting $20$% of reports using the textttRadialog model on the MIMIC-CXR dataset.
arXiv Detail & Related papers (2024-12-05T20:43:39Z)
Text2Data: Low-Resource Data Generation with Textual Control [100.5970757736845]
Text2Data is a novel approach that utilizes unlabeled data to understand the underlying data distribution.<n>It undergoes finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
Automatic Radiology Report Generation by Learning with Increasingly Hard Negatives [23.670280341513795]
This paper proposes a novel framework to learn discriminative image and report features. It distinguishes them from their closest peers, i.e., hard negatives. It can serve as a plug-in to readily improve existing medical report generation models.
arXiv Detail & Related papers (2023-05-11T23:12:13Z)
Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation [92.73584302508907]
We propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning. In detail, the fundamental structure of our graph is pre-constructed from general knowledge. Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z)
mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsistent summaries. We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios. Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.