Understanding Transfer Learning for Chest Radiograph Clinical Report
Generation with Modified Transformer Architectures
- URL: http://arxiv.org/abs/2205.02841v1
- Date: Thu, 5 May 2022 03:08:05 GMT
- Title: Understanding Transfer Learning for Chest Radiograph Clinical Report
Generation with Modified Transformer Architectures
- Authors: Edward Vendrow, Ethan Schonfeld
- Abstract summary: We train a series of modified transformers to generate clinical reports from chest radiograph image input.
We use BLEU(1-4), ROUGE-L, CIDEr, and the clinical CheXbert F1 scores to validate our models and demonstrate scores competitive with state-of-the-art models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The image captioning task is increasingly prevalent in artificial
intelligence applications for medicine. One important application is clinical
report generation from chest radiographs. The clinical writing of unstructured
reports is time-consuming and error-prone. An automated system would improve
standardization, reduce errors, save clinicians' time, and broaden access to care.
In this paper we demonstrate the importance of domain-specific pre-training and
propose a modified transformer architecture for the medical image captioning
task. To accomplish this, we train a series of modified transformers to
generate clinical reports from chest radiograph image input. These modified
transformers include: a meshed-memory augmented transformer architecture with
visual extractor using ImageNet pre-trained weights, a meshed-memory augmented
transformer architecture with visual extractor using CheXpert pre-trained
weights, and a meshed-memory augmented transformer whose encoder is passed the
concatenated embeddings using both ImageNet pre-trained weights and CheXpert
pre-trained weights. We use BLEU(1-4), ROUGE-L, CIDEr, and the clinical
CheXbert F1 scores to validate our models and demonstrate scores competitive
with state-of-the-art models. We provide evidence that ImageNet pre-training is
ill-suited for the medical image captioning task, especially for less frequent
conditions (e.g., enlarged cardiomediastinum, lung lesion, pneumothorax).
Furthermore, we demonstrate that the double-feature model improves performance
for specific medical conditions (edema, consolidation, pneumothorax, support
devices) and overall CheXbert F1 score, and should be developed further in
future work. Such a double-feature model, combining ImageNet pre-training with
domain-specific pre-training, could be used in a wide range of image
captioning models in medicine.
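The double-feature idea can be sketched in a few lines. The extractors below are random-vector stubs standing in for the ImageNet- and CheXpert-pre-trained backbones (the function names and the 2048-dim outputs are assumptions of this sketch, not the paper's code); the point is only the data flow, where both embeddings are concatenated before reaching the transformer encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two visual extractors. In the paper these are CNN
# backbones with ImageNet- and CheXpert-pre-trained weights; here they
# are random-vector stubs used only to illustrate the data flow.
def imagenet_features(image, dim=2048):
    return rng.standard_normal(dim)

def chexpert_features(image, dim=2048):
    return rng.standard_normal(dim)

def double_feature_embedding(image):
    # Concatenate both extractors' embeddings; the combined vector is
    # what the transformer encoder would consume.
    return np.concatenate([imagenet_features(image), chexpert_features(image)])

image = np.zeros((224, 224, 3))  # placeholder radiograph
emb = double_feature_embedding(image)
print(emb.shape)  # (4096,)
```

Concatenation keeps both representations intact and leaves it to the encoder to weight general-purpose versus domain-specific features per condition.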
Related papers
- Automatic Report Generation for Histopathology images using pre-trained
Vision Transformers [1.2781698000674653]
We show that an existing pre-trained Vision Transformer can be used in a two-step process: first to encode 4096x4096-sized patches of the Whole Slide Image (WSI), and then as the encoder, paired with an LSTM decoder, for report generation.
We also use representations from an existing powerful pre-trained hierarchical vision transformer and show their usefulness not just for zero-shot classification but also for report generation.
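The two-step process described above can be sketched as follows; the `vit_encode` helper, its 768-dim output, and the tiny stand-in patches are assumptions of this sketch, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Step 1: a pre-trained ViT encodes each WSI patch into a fixed-size
# embedding. Stubbed here with random vectors.
def vit_encode(patch, dim=768):
    return rng.standard_normal(dim)

def encode_wsi(patches):
    # Stack per-patch embeddings into one sequence for the decoder.
    return np.stack([vit_encode(p) for p in patches])

# Step 2: an LSTM decoder would consume this (n_patches, dim) sequence
# to generate the report token by token (decoder omitted from sketch).
patches = [np.zeros((64, 64)) for _ in range(3)]  # small stand-in patches
seq = encode_wsi(patches)
print(seq.shape)  # (3, 768)
```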
arXiv Detail & Related papers (2023-11-10T16:48:24Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
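One way to picture the disruption step is below; the patch size, mask ratio, and Gaussian-noise perturbation are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(1)

def disrupt(volume, patch=4, mask_ratio=0.5, noise_std=0.1):
    # Illustrative disruption: zero out a random subset of local patches
    # (local masking), then add low-level Gaussian noise (perturbation).
    d, h, w = volume.shape
    out = volume.copy()
    for z in range(0, d, patch):
        for y in range(0, h, patch):
            for x in range(0, w, patch):
                if rng.random() < mask_ratio:
                    out[z:z + patch, y:y + patch, x:x + patch] = 0.0
    out += rng.normal(0.0, noise_std, out.shape)
    return out

vol = rng.random((8, 8, 8))  # toy 3D volume
corrupted = disrupt(vol)
# A reconstruction loss would then compare the model's output to `vol`.
mse = np.mean((corrupted - vol) ** 2)
```

The model never sees the clean volume as input; it must recover it from the corrupted version, which is what forces the representation to capture both structure and low-level detail.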
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Customizing General-Purpose Foundation Models for Medical Report
Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z) - A New Perspective to Boost Vision Transformer for Medical Image
Classification [33.215289791017064]
We propose a self-supervised learning approach specifically for medical image classification with the Transformer backbone.
Our BOLT consists of two networks, namely online and target branches, for self-supervised representation learning.
The experimental results validate the superiority of our BOLT for medical image classification, compared to ImageNet pretrained weights and state-of-the-art self-supervised learning approaches.
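Online/target two-branch setups of this kind are commonly implemented with an exponential-moving-average target, as in BYOL; a minimal sketch of that update rule follows (the EMA form and the momentum value are assumptions, not BOLT's published details).

```python
import numpy as np

def ema_update(target_params, online_params, momentum=0.99):
    # BYOL-style rule (an assumption here): the target branch tracks an
    # exponential moving average of the online branch's parameters.
    return [momentum * t + (1.0 - momentum) * o
            for t, o in zip(target_params, online_params)]

online = [np.ones(4)]   # toy online-branch parameters
target = [np.zeros(4)]  # toy target-branch parameters
target = ema_update(target, online)
print(target[0])  # ≈ [0.01 0.01 0.01 0.01]
```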
arXiv Detail & Related papers (2023-01-03T07:45:59Z) - Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstructing the informative patches according to the gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z) - Preservation of High Frequency Content for Deep Learning-Based Medical
Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
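A single level of the 2D Haar DWT, the simplest instance of the transform family this summary refers to, can be written directly in NumPy; this is a generic Haar decomposition for illustration, not the paper's specific encoding.

```python
import numpy as np

def haar_dwt2(img):
    # One level of a 2D Haar DWT: returns the approximation (LL) and
    # detail (LH, HL, HH) sub-bands, each half the input size. The HH
    # band carries the high-frequency content the paper aims to preserve.
    a = (img[0::2, :] + img[1::2, :]) / 2.0  # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0  # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2)
```

For a smooth image the detail bands are near zero, so edges and fine texture concentrate in LH/HL/HH, which is what makes the decomposition useful for identifying high-frequency visual information.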
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Self Pre-training with Masked Autoencoders for Medical Image
Classification and Segmentation [37.25161294917211]
Masked Autoencoder (MAE) has been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis.
We investigate a self pre-training paradigm with MAE for medical image analysis tasks.
arXiv Detail & Related papers (2022-03-10T16:22:38Z) - Class-Aware Generative Adversarial Transformers for Medical Image
Segmentation [39.14169989603906]
We present CA-GANformer, a novel type of generative adversarial transformers, for medical image segmentation.
First, we take advantage of the pyramid structure to construct multi-scale representations and handle multi-scale variations.
We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures.
arXiv Detail & Related papers (2022-01-26T03:50:02Z) - Pre-training and Fine-tuning Transformers for fMRI Prediction Tasks [69.85819388753579]
TFF employs a transformer-based architecture and a two-phase training approach.
Self-supervised training is applied to a collection of fMRI scans, where the model is trained for the reconstruction of 3D volume data.
Results show state-of-the-art performance on a variety of fMRI tasks, including age and gender prediction, as well as schizophrenia recognition.
arXiv Detail & Related papers (2021-12-10T18:04:26Z) - Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays [10.398175542736285]
We introduce an image-text pre-training framework that can learn from mixed data inputs.
We demonstrate the feasibility of pre-training across mixed data inputs.
We also illustrate the benefits of adopting such pre-trained models in 3 chest X-ray applications.
arXiv Detail & Related papers (2021-03-30T01:48:46Z) - Medical Transformer: Gated Axial-Attention for Medical Image
Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.