Variational Topic Inference for Chest X-Ray Report Generation
- URL: http://arxiv.org/abs/2107.07314v1
- Date: Thu, 15 Jul 2021 13:34:38 GMT
- Title: Variational Topic Inference for Chest X-Ray Report Generation
- Authors: Ivona Najdenkoska, Xiantong Zhen, Marcel Worring and Ling Shao
- Abstract summary: Report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice.
Recent work has shown that deep learning models can successfully caption natural images.
We propose variational topic inference for automatic report generation.
- Score: 102.04931207504173
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automating report generation for medical imaging promises to reduce workload
and assist diagnosis in clinical practice. Recent work has shown that deep
learning models can successfully caption natural images. However, learning from
medical data is challenging due to the diversity and uncertainty inherent in
the reports written by different radiologists with discrepant expertise and
experience. To tackle these challenges, we propose variational topic inference
for automatic report generation. Specifically, we introduce a set of topics as
latent variables to guide sentence generation by aligning image and language
modalities in a latent space. The topics are inferred in a conditional
variational inference framework, with each topic governing the generation of a
sentence in the report. Further, we adopt a visual attention module that
enables the model to attend to different locations in the image and generate
more informative descriptions. We conduct extensive experiments on two
benchmarks, namely Indiana U. Chest X-rays and MIMIC-CXR. The results
demonstrate that our proposed variational topic inference method can generate
novel reports rather than mere copies of reports used in training, while still
achieving comparable performance to state-of-the-art methods in terms of
standard language generation criteria.
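The abstract describes topics as latent variables, inferred by conditional variational inference, with one topic per generated sentence. The following is a minimal sketch of what such an objective could look like. It is not the authors' implementation. It assumes diagonal Gaussian topic distributions, a prior conditioned on the image, and a posterior conditioned on the image and the reference sentence. All function names are hypothetical, and the per-sentence negative log-likelihoods are assumed to come from some sentence decoder that is not shown.

```python
import math
import random

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for two diagonal Gaussians given as lists of means and log-variances."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total

def sample_topic(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def negative_elbo(sentence_nlls, posteriors, priors):
    """Sum over sentences of decoder NLL plus KL(posterior || prior) for that sentence's topic.

    posteriors and priors are lists of (mu, logvar) pairs, one pair per sentence.
    """
    return sum(
        nll + kl_diag_gaussians(*q, *p)
        for nll, q, p in zip(sentence_nlls, posteriors, priors)
    )
```

Under these assumptions, training would minimize `negative_elbo`, while at test time a topic for each sentence would be drawn from the image-conditioned prior with `sample_topic` and passed to the decoder.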
Related papers
- Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [83.30609465252441]
We propose a CounterFactual Explanations-based framework (CoFE) for radiology report generation.
Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios.
Experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports.
arXiv Detail & Related papers (2024-07-19T17:24:25Z) - Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis [53.809054774037214]
This paper proposes leveraging vision-language pretraining on bone X-rays paired with French reports.
It is the first study to integrate French reports to shape the embedding space devoted to bone X-Rays representations.
arXiv Detail & Related papers (2024-05-14T19:53:20Z) - Dynamic Traceback Learning for Medical Report Generation [12.746275623663289]
This study proposes a novel multi-modal dynamic traceback learning framework (DTrace) for medical report generation.
We introduce a traceback mechanism to supervise the semantic validity of generated content and a dynamic learning strategy to adapt to various proportions of image and text input.
The proposed DTrace framework outperforms state-of-the-art methods for medical report generation.
arXiv Detail & Related papers (2024-01-24T07:13:06Z) - Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z) - Lesion Guided Explainable Few Weak-shot Medical Report Generation [25.15493013683396]
We propose a lesion guided explainable few weak-shot medical report generation framework.
It learns correlation between seen and novel classes through visual and semantic feature alignment.
It aims to generate medical reports for diseases not observed in training.
arXiv Detail & Related papers (2022-11-16T07:47:29Z) - A Medical Semantic-Assisted Transformer for Radiographic Report Generation [39.99216295697047]
We propose a memory-augmented sparse attention block to capture the higher-order interactions between the input fine-grained image features.
We also introduce a novel Medical Concepts Generation Network (MCGN) to predict fine-grained semantic concepts and incorporate them into the report generation process as guidance.
arXiv Detail & Related papers (2022-08-22T14:38:19Z) - Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation [3.3978173451092437]
Radiology report generation aims at generating descriptive text from radiology images automatically.
A typical setting consists of training encoder-decoder models on image-report pairs with a cross entropy loss.
We propose a novel weakly supervised contrastive loss for medical report generation.
arXiv Detail & Related papers (2021-09-25T00:06:23Z) - Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition [142.42920413017163]
Because of dataset bias, current methods often generate the most common sentences regardless of the individual case.
We propose a novel framework that unifies template retrieval and sentence generation to handle both common and rare abnormalities.
arXiv Detail & Related papers (2021-01-09T04:33:27Z) - Learning Visual-Semantic Embeddings for Reporting Abnormal Findings on Chest X-rays [6.686095511538683]
This work focuses on reporting abnormal findings on radiology images.
We propose a method to identify abnormal findings from the reports in addition to grouping them with unsupervised clustering and minimal rules.
We demonstrate that our method is able to retrieve abnormal findings and outperforms existing generation models on both clinical correctness and text generation metrics.
arXiv Detail & Related papers (2020-10-06T04:18:18Z) - Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z)