Automatic Radiology Report Generation by Learning with Increasingly Hard
Negatives
- URL: http://arxiv.org/abs/2305.07176v2
- Date: Mon, 7 Aug 2023 10:09:21 GMT
- Title: Automatic Radiology Report Generation by Learning with Increasingly Hard
Negatives
- Authors: Bhanu Prakash Voutharoja and Lei Wang and Luping Zhou
- Abstract summary: This paper proposes a framework that learns discriminative image and report features by
distinguishing them from their closest peers, i.e., hard negatives.
The framework can serve as a plug-in to improve existing medical report generation models.
- Score: 23.670280341513795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic radiology report generation is challenging as medical images or
reports are usually similar to each other due to the common content of anatomy.
This makes it hard for a model to capture the uniqueness of individual images and leaves it
prone to producing undesired generic or mismatched reports. This situation
calls for learning more discriminative features that could capture even
fine-grained mismatches between images and reports. To achieve this, this paper
proposes a novel framework to learn discriminative image and report features by
distinguishing them from their closest peers, i.e., hard negatives. In particular,
to attain more discriminative features, we gradually raise the difficulty of
this learning task by creating increasingly hard negative reports for each
image in the feature space during training. By treating the
increasingly hard negatives as auxiliary variables, we formulate this process
as a min-max alternating optimisation problem. At each iteration, conditioned
on a given set of hard negative reports, image and report features are learned
as usual by minimising the loss functions related to report generation. After
that, a new set of harder negative reports will be created by maximising a loss
reflecting image-report alignment. By solving this optimisation, we attain a
model that can generate more specific and accurate reports. It is noteworthy
that our framework enhances discriminative feature learning without introducing
extra network weights. Also, in contrast to the existing way of generating hard
negatives, our framework extends beyond the granularity of the dataset by
generating harder samples that lie outside the training set. An experimental study on
benchmark datasets verifies the efficacy of our framework and shows that it can
serve as a plug-in to readily improve existing medical report generation
models.
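The alternating min-max procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the dot-product similarity, the hinge-style alignment loss, the linear image projection `W`, and every function name below are assumptions made for illustration. In the paper the min step minimises report generation losses; here the hinge loss stands in for them to keep the example short.

```python
import numpy as np


def unit(x):
    """Scale a vector to unit L2 norm."""
    return x / np.linalg.norm(x)


def alignment_loss(v, r_pos, r_neg, margin=0.2):
    """Hinge loss: the true report should outscore the negative by `margin`."""
    return max(0.0, margin + float(v @ r_neg) - float(v @ r_pos))


def harden_negative(v, r_neg, step=0.5):
    """Max step (assumed form): move the negative report feature towards
    the image feature v, then project it back onto the unit sphere."""
    return unit(r_neg + step * unit(v))


def min_step(W, x, r_pos, r_neg, lr=0.1, margin=0.2):
    """Min step: one gradient-descent update of the image projection W,
    with the current negative held fixed."""
    v = W @ x
    if alignment_loss(v, r_pos, r_neg, margin) > 0.0:
        # Gradient of the active hinge with respect to W.
        W = W - lr * np.outer(r_neg - r_pos, x)
    return W


# Toy data: one raw image vector, its paired report feature, and an
# initial negative report feature (all random, purely illustrative).
rng = np.random.default_rng(0)
x = rng.normal(size=8)
W = rng.normal(size=(4, 8))
r_pos = unit(rng.normal(size=4))
r_neg = unit(rng.normal(size=4))

losses = []
for _ in range(5):
    # Minimise over the model with the current hard negative fixed.
    W = min_step(W, x, r_pos, r_neg)
    # Maximise over the negative to create a harder one in feature space.
    r_neg = harden_negative(W @ x, r_neg)
    losses.append(alignment_loss(W @ x, r_pos, r_neg))
```

Because the hardened negative is a point in feature space and not a report drawn from the dataset, the sketch mirrors the abstract's claim that harder samples can lie outside the training set. It also adds no weights beyond `W`, which already belongs to the model.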
Related papers
- Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [83.30609465252441]
We propose a CounterFactual Explanations-based framework (CoFE) for radiology report generation.
Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios.
Experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports.
arXiv Detail & Related papers (2024-07-19T17:24:25Z)
- MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks [11.190146577567548]
We propose a novel model that leverages the available information in two distinct datasets.
Our model, named MedRAT, surpasses previous state-of-the-art methods.
arXiv Detail & Related papers (2024-07-04T13:31:47Z)
- MAIRA-2: Grounded Radiology Report Generation [39.7576903743788]
Radiology reporting is a complex task that requires detailed image understanding, integration of multiple inputs, and precise language generation.
Here, we extend report generation to include the localisation of individual findings on the image - a task we call grounded report generation.
We introduce MAIRA-2, a large multimodal model combining a radiology-specific image encoder with a LLM, and trained for the new task of grounded report generation on chest X-rays.
arXiv Detail & Related papers (2024-06-06T19:12:41Z)
- MedCycle: Unpaired Medical Report Generation via Cycle-Consistency [11.190146577567548]
We introduce an innovative approach that eliminates the need for consistent labeling schemas.
This approach is based on cycle-consistent mapping functions that transform image embeddings into report embeddings.
It outperforms state-of-the-art results in unpaired chest X-ray report generation, demonstrating improvements in both language and clinical metrics.
arXiv Detail & Related papers (2024-03-20T09:40:11Z)
- Medical Report Generation based on Segment-Enhanced Contrastive Representation Learning [39.17345313432545]
We propose MSCL (Medical image with Contrastive Learning) to segment organs, abnormalities, bones, etc.
We introduce a supervised contrastive loss that assigns more weight to reports that are semantically similar to the target while training.
Experimental results demonstrate the effectiveness of our proposed model, where we achieve state-of-the-art performance on the IU X-Ray public dataset.
arXiv Detail & Related papers (2023-12-26T03:33:48Z)
- Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation [92.73584302508907]
We propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning.
In detail, the fundamental structure of our graph is pre-constructed from general knowledge.
Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z)
- Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Lesion Guided Explainable Few Weak-shot Medical Report Generation [25.15493013683396]
We propose a lesion guided explainable few weak-shot medical report generation framework.
It learns correlation between seen and novel classes through visual and semantic feature alignment.
It aims to generate medical reports for diseases not observed in training.
arXiv Detail & Related papers (2022-11-16T07:47:29Z)
- Variational Topic Inference for Chest X-Ray Report Generation [102.04931207504173]
Report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice.
Recent work has shown that deep learning models can successfully caption natural images.
We propose variational topic inference for automatic report generation.
arXiv Detail & Related papers (2021-07-15T13:34:38Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.