Reinforced Medical Report Generation with X-Linear Attention and
Repetition Penalty
- URL: http://arxiv.org/abs/2011.07680v1
- Date: Mon, 16 Nov 2020 01:44:47 GMT
- Title: Reinforced Medical Report Generation with X-Linear Attention and
Repetition Penalty
- Authors: Wenting Xu, Chang Qi, Zhenghua Xu and Thomas Lukasiewicz
- Abstract summary: We propose a reinforced medical report generation solution with x-linear attention and repetition penalty mechanisms.
X-linear attention modules are used to explore high-order feature interactions and achieve multi-modal reasoning.
ReMRG-XR greatly outperforms the state-of-the-art baselines in terms of all metrics.
- Score: 46.51332238677608
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To reduce doctors' workload, deep-learning-based automatic medical report
generation has recently attracted more and more research efforts, where
attention mechanisms and reinforcement learning are integrated with the classic
encoder-decoder architecture to enhance the performance of deep models.
However, these state-of-the-art solutions mainly suffer from two shortcomings:
(i) their attention mechanisms cannot utilize high-order feature interactions,
and (ii) due to the use of TF-IDF-based reward functions, these methods are
prone to generating repeated terms. Therefore, in this work, we propose a
reinforced medical report generation solution with x-linear attention and
repetition penalty mechanisms (ReMRG-XR) to overcome these problems.
Specifically, x-linear attention modules are used to explore high-order feature
interactions and achieve multi-modal reasoning, while repetition penalty is
used to apply penalties to repeated terms during the model's training process.
Extensive experimental studies have been conducted on two public datasets, and
the results show that ReMRG-XR greatly outperforms the state-of-the-art
baselines in terms of all metrics.
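
The abstract does not give the form of the repetition penalty, so the following is only a minimal sketch of the general idea: a sequence-level reward is scaled down as the generated report repeats itself. The function names, the use of n-gram repetition, and the `weight` parameter are all assumptions made for illustration, not the formulation used in ReMRG-XR.

```python
from collections import Counter
from typing import Sequence


def repeated_ngram_ratio(tokens: Sequence[str], n: int = 2) -> float:
    """Fraction of n-grams in `tokens` that duplicate an earlier n-gram."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as one repetition.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


def penalized_reward(
    base_reward: float,
    tokens: Sequence[str],
    n: int = 2,
    weight: float = 1.0,
) -> float:
    """Hypothetical penalty: shrink the reward in proportion to repetition.

    `base_reward` stands in for whatever metric-based reward the
    reinforcement learning stage uses; it is not computed here.
    """
    penalty = weight * repeated_ngram_ratio(tokens, n)
    return base_reward * max(0.0, 1.0 - penalty)


if __name__ == "__main__":
    clean = "the lungs are clear".split()
    looped = "the lungs are clear the lungs are clear".split()
    print(penalized_reward(1.0, clean))
    print(penalized_reward(1.0, looped))
```

Under this sketch a report with no repeated bigrams keeps its full reward, while a report that loops over the same phrase is rewarded less, which is the behaviour a repetition penalty is meant to encourage during training.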
Related papers
- A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification [11.105845244103506]
This paper proposes the Multi-Resolution Mutual Learning Network (MRM-Net).
MRM-Net includes a dual-resolution attention architecture and a feature complementary mechanism.
It significantly outperforms existing methods in multi-label ECG classification performance.
arXiv Detail & Related papers (2024-06-12T13:40:03Z)
- Bag of Tricks for Long-Tailed Multi-Label Classification on Chest X-Rays [40.11576642444264]
This report presents a brief description of our solution in the ICCV CVAMD 2023 CXR-LT Competition.
We empirically explored how effectively several advanced designs can be integrated for CXR diagnosis.
Our framework finally achieves 0.349 mAP on the competition test set, ranking in the top five.
arXiv Detail & Related papers (2023-08-17T08:25:55Z)
- Learning Through Guidance: Knowledge Distillation for Endoscopic Image Classification [40.366659911178964]
Endoscopy plays a major role in identifying any underlying abnormalities within the gastrointestinal (GI) tract.
Deep learning, specifically Convolutional Neural Networks (CNNs), which are designed to perform automatic feature learning without any prior feature engineering, has recently shown great benefits for GI endoscopy image analysis.
We investigate three KD-based learning frameworks, response-based, feature-based, and relation-based mechanisms, and introduce a novel multi-head attention-based feature fusion mechanism to support relation-based learning.
arXiv Detail & Related papers (2023-08-17T02:02:11Z)
- Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading [72.45699658852304]
This paper proposes a novel approach to train a generative Diffusion Autoencoder model as an unsupervised feature extractor.
We model fracture grading as a continuous regression, which is more reflective of the smooth progression of fractures.
Importantly, the generative nature of our method allows us to visualize different grades of a given vertebra, providing interpretability and insight into the features that contribute to automated grading.
arXiv Detail & Related papers (2023-03-21T17:16:01Z)
- Cross-Modal Causal Intervention for Medical Report Generation [109.83549148448469]
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance.
Due to the spurious correlations within image-text data induced by visual and linguistic biases, it is challenging to generate accurate reports reliably describing lesion areas.
We propose a novel Visual-Linguistic Causal Intervention (VLCI) framework for MRG, which consists of a visual deconfounding module (VDM) and a linguistic deconfounding module (LDM).
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
- Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty [45.92216112110279]
We propose a hybrid reinforced medical report generation method with m-linear attention and repetition penalty mechanism.
Specifically, a hybrid reward with different weights is employed to remedy the limitations of single-metric-based rewards.
We also propose a search algorithm with linear complexity to approximate the best weight combination.
arXiv Detail & Related papers (2022-10-14T15:27:34Z)
- Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation.
The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristics across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z)
- Learning Hierarchical Attention for Weakly-supervised Chest X-Ray Abnormality Localization and Diagnosis [28.747482895051103]
Deep learning has driven much recent progress in medical imaging, but many clinical challenges are not fully addressed.
One potential way to address this problem is to further train these models to localize abnormalities in addition to just classifying them.
In this work, we take a step towards addressing these issues by means of a new attention-driven weakly supervised algorithm.
arXiv Detail & Related papers (2021-12-23T04:12:51Z)
- Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z)