PriorRG: Prior-Guided Contrastive Pre-training and Coarse-to-Fine Decoding for Chest X-ray Report Generation
- URL: http://arxiv.org/abs/2508.05353v1
- Date: Thu, 07 Aug 2025 13:02:20 GMT
- Title: PriorRG: Prior-Guided Contrastive Pre-training and Coarse-to-Fine Decoding for Chest X-ray Report Generation
- Authors: Kang Liu, Zhuoqi Ma, Zikang Fang, Yunan Li, Kun Xie, Qiguang Miao,
- Abstract summary: PriorRG is a chest X-ray report generation framework that emulates real-world clinical workflows via a two-stage training pipeline. In Stage 1, we introduce a prior-guided contrastive pre-training scheme that leverages clinical context to guide spatiotemporal feature extraction. In Stage 2, we present a prior-aware coarse-to-fine decoding for report generation that progressively integrates patient-specific prior knowledge with the vision encoder's hidden states.
- Score: 12.860257420677122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chest X-ray report generation aims to reduce radiologists' workload by automatically producing high-quality preliminary reports. A critical yet underexplored aspect of this task is the effective use of patient-specific prior knowledge -- including clinical context (e.g., symptoms, medical history) and the most recent prior image -- which radiologists routinely rely on for diagnostic reasoning. Most existing methods generate reports from single images, neglecting this essential prior information and thus failing to capture diagnostic intent or disease progression. To bridge this gap, we propose PriorRG, a novel chest X-ray report generation framework that emulates real-world clinical workflows via a two-stage training pipeline. In Stage 1, we introduce a prior-guided contrastive pre-training scheme that leverages clinical context to guide spatiotemporal feature extraction, allowing the model to align more closely with the intrinsic spatiotemporal semantics in radiology reports. In Stage 2, we present a prior-aware coarse-to-fine decoding for report generation that progressively integrates patient-specific prior knowledge with the vision encoder's hidden states. This decoding allows the model to align with diagnostic focus and track disease progression, thereby enhancing the clinical accuracy and fluency of the generated reports. Extensive experiments on MIMIC-CXR and MIMIC-ABN datasets demonstrate that PriorRG outperforms state-of-the-art methods, achieving a 3.6% BLEU-4 and 3.8% F1 score improvement on MIMIC-CXR, and a 5.9% BLEU-1 gain on MIMIC-ABN. Code and checkpoints will be released upon acceptance.
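As a rough illustration of the contrastive objective that Stage 1 builds on, the sketch below shows a generic symmetric InfoNCE loss over paired image and report embeddings. The prior-guided conditioning on clinical context and the prior image is not modeled here; the function name and temperature value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def info_nce(image_feats, report_feats, temperature=0.07):
    """Symmetric InfoNCE loss over L2-normalized image/report embeddings.

    A generic contrastive pre-training objective; matched image-report
    pairs sit on the diagonal of the similarity matrix and are pulled
    together, while mismatched pairs are pushed apart.
    """
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    rep = report_feats / np.linalg.norm(report_feats, axis=1, keepdims=True)
    logits = img @ rep.T / temperature      # (B, B) cosine similarities
    labels = np.arange(len(logits))         # matched pairs on the diagonal

    def xent(l):
        # cross-entropy of the softmax over each row, target = diagonal
        l = l - l.max(axis=1, keepdims=True)            # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average both directions: image-to-report and report-to-image
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned embeddings drive the loss toward zero, while unrelated embeddings keep it high, which is what the pre-training stage exploits.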
Related papers
- CLARIFID: Improving Radiology Report Generation by Reinforcing Clinically Accurate Impressions and Enforcing Detailed Findings [1.515687944002438]
We propose CLARIFID, a novel framework that directly optimizes diagnostic correctness by mirroring the two-step workflow of experts. CLARIFID learns the logical flow from Findings to Impression through section-aware pretraining. We show that our method achieves superior clinical efficacy and outperforms existing baselines on both standard NLG metrics and clinically aware scores.
arXiv Detail & Related papers (2025-07-23T05:57:59Z) - Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation [15.257119888131609]
We propose enhanced contrastive learning with Multi-view Longitudinal data to facilitate chest X-ray Report Generation, named MLRG. Specifically, we introduce a multi-view longitudinal contrastive learning method that integrates spatial information from current multi-view images and temporal information from longitudinal data. We present a tokenized absence encoding technique to handle missing patient-specific prior knowledge, allowing the model to produce more accurate radiology reports based on available prior knowledge.
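The tokenized absence encoding idea can be illustrated with a minimal sketch: when a patient has no prior study, a placeholder "absence" vector is substituted so downstream fusion always receives a fixed-shape input. The zero vector and function name below are assumptions for illustration; in the paper the placeholder would be a learned embedding.

```python
import numpy as np

# Hypothetical placeholder for a missing prior study. In practice this
# would be a trainable embedding; a fixed vector is used here to keep
# the sketch self-contained.
ABSENT_TOKEN = np.zeros(16)

def encode_prior(prior_embedding):
    """Return the prior-study embedding, or the absence token when the
    patient has no prior study, so fusion always sees a fixed shape."""
    if prior_embedding is None:
        return ABSENT_TOKEN
    return np.asarray(prior_embedding, dtype=float)
```

This keeps batch shapes uniform whether or not prior knowledge is available, instead of branching the architecture on missing inputs.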
arXiv Detail & Related papers (2025-02-27T12:59:04Z) - EVOKE: Elevating Chest X-ray Report Generation via Multi-View Contrastive Learning and Patient-Specific Knowledge [21.596462896333733]
EVOKE is a novel chest X-ray report generation framework that incorporates multi-view contrastive learning and patient-specific knowledge. We present a knowledge-guided report generation module that integrates available patient-specific indications. Our proposed EVOKE surpasses recent state-of-the-art methods across multiple datasets.
arXiv Detail & Related papers (2024-11-15T14:38:13Z) - Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z) - Instrumental Variable Learning for Chest X-ray Classification [52.68170685918908]
We propose an interpretable instrumental variable (IV) learning framework to eliminate the spurious association and obtain accurate causal representation.
Our approach's performance is demonstrated using the MIMIC-CXR, NIH ChestX-ray 14, and CheXpert datasets.
arXiv Detail & Related papers (2023-05-20T03:12:23Z) - Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation [92.73584302508907]
We propose a knowledge graph with dynamic structure and nodes to facilitate medical report generation with contrastive learning.
In detail, the fundamental structure of our graph is pre-constructed from general knowledge.
Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z) - Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors [1.1110995501996481]
We propose two methods to remove references to priors in radiology reports: a GPT-3-based few-shot approach that rewrites medical reports without references to priors, and a BioBERT-based token classification approach that directly removes words referring to priors.
We find that our re-trained model, which we call CXR-ReDonE, outperforms previous report generation methods on clinical metrics, achieving an average BERTScore of 0.2351 (2.57% absolute improvement).
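The removal step of the BioBERT-based approach can be sketched as a simple filter over per-token flags. Here the classifier's outputs are assumed to be given directly, and the function name is illustrative rather than taken from the paper's code.

```python
def remove_prior_references(tokens, is_prior_ref):
    """Drop tokens flagged as referring to a non-existent prior study.

    In the paper the per-token flags come from a BioBERT token
    classifier; here they are supplied directly to illustrate only
    the removal step.
    """
    return [tok for tok, flag in zip(tokens, is_prior_ref) if not flag]
```

For example, a flag on "again" in "Effusion again noted on the left." would strip the implicit comparison to a prior study while keeping the finding itself.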
arXiv Detail & Related papers (2022-09-27T00:44:41Z) - Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation [116.87918100031153]
We propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG).
CGT injects clinical relation triples into the visual features as prior knowledge to drive the decoding procedure.
Experiments on the large-scale FFA-IR benchmark demonstrate that the proposed CGT is able to outperform previous benchmark methods.
arXiv Detail & Related papers (2022-06-04T13:16:30Z) - Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation [55.00308939833555]
The PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE), and Multi-domain Knowledge Distiller (MKD).
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
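K-nearest neighbor smoothing of this kind can be sketched as averaging each sample's predicted disease probabilities with those of its nearest neighbors in feature space. This is a minimal illustration under assumed choices (Euclidean distance, uniform weights, self included); the paper's exact scheme may differ.

```python
import numpy as np

def knn_smooth(features, probs, k=3):
    """Smooth per-sample disease probabilities by averaging each
    sample's predictions with those of its k nearest neighbors in
    feature space (the sample itself is included in the average)."""
    feats = np.asarray(features, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # pairwise squared Euclidean distances between feature vectors
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    smoothed = np.empty_like(probs)
    for i in range(len(feats)):
        nn = np.argsort(d2[i])[:k + 1]   # self plus k nearest neighbors
        smoothed[i] = probs[nn].mean(axis=0)
    return smoothed
```

The effect is that noisy per-image predictions are pulled toward the consensus of visually similar images, which is the intuition behind using KNNS on top of a single model.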
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.