RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful
Representation from X-Ray Images
- URL: http://arxiv.org/abs/2211.00313v4
- Date: Sun, 21 May 2023 14:36:59 GMT
- Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
- Abstract summary: We present a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representation from X-ray images.
When using the entire training set, RGMIM outperformed other comparable methods, achieving a 0.962 lung disease detection accuracy.
- Score: 38.65823547986758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: Self-supervised learning has been gaining attention in the medical
field for its potential to improve computer-aided diagnosis. One popular method
of self-supervised learning is masked image modeling (MIM), which involves
masking a subset of input pixels and predicting the masked pixels. However,
traditional MIM methods typically use a random masking strategy, which may not
be ideal for medical images that often have a small region of interest for
disease detection. To address this issue, this work aims to improve MIM for
medical images and evaluate its effectiveness in an open X-ray image dataset.
Methods: In this paper, we present a novel method called region-guided masked
image modeling (RGMIM) for learning meaningful representation from X-ray
images. Our method adopts a new masking strategy that utilizes organ mask
information to identify valid regions for learning more meaningful
representations. The proposed method was contrasted with five self-supervised
learning techniques (MAE, SKD, Cross, BYOL, and SimSiam). We conducted
quantitative evaluations on an open lung X-ray image dataset as well as masking
ratio hyperparameter studies. Results: When using the entire training set,
RGMIM outperformed other comparable methods, achieving a 0.962 lung disease
detection accuracy. Specifically, RGMIM significantly improved performance in
small data volumes, such as 5% and 10% of the training set (846 and 1,693
images) compared to other methods, and achieved a 0.957 detection accuracy even
when only 50% of the training set was used. Conclusions: RGMIM can mask more
valid regions, facilitating the learning of discriminative representations and
the subsequent high-accuracy lung disease detection. RGMIM outperforms other
state-of-the-art self-supervised learning methods in experiments, particularly
when limited training data is used.
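The core idea of the masking strategy above — restricting the randomly masked patches to those overlapping an organ mask, rather than sampling uniformly over the whole image — can be sketched as follows. This is an illustrative reimplementation, not the authors' code; the patch size, masking ratio, and binary organ-mask format are assumptions.

```python
import numpy as np

def region_guided_mask(organ_mask, patch_size=16, mask_ratio=0.75, rng=None):
    """Choose patches to mask, restricted to patches overlapping the organ region.

    organ_mask: (H, W) binary array, 1 inside the organ (e.g. lung) region.
    Returns a boolean (H // patch_size, W // patch_size) grid; True = masked patch.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = organ_mask.shape
    gh, gw = h // patch_size, w // patch_size
    # A patch is "valid" if any of its pixels fall inside the organ region.
    patches = organ_mask[: gh * patch_size, : gw * patch_size]
    patches = patches.reshape(gh, patch_size, gw, patch_size)
    valid = patches.any(axis=(1, 3))            # (gh, gw) boolean grid
    valid_idx = np.flatnonzero(valid)
    # Sample the masking ratio only over valid patches, not the whole grid.
    n_mask = int(round(mask_ratio * valid_idx.size))
    chosen = rng.choice(valid_idx, size=n_mask, replace=False)
    mask = np.zeros(gh * gw, dtype=bool)
    mask[chosen] = True
    return mask.reshape(gh, gw)

# Example: a toy 64x64 image whose "organ" occupies the left half.
organ = np.zeros((64, 64), dtype=np.uint8)
organ[:, :32] = 1
mask = region_guided_mask(organ, patch_size=16, mask_ratio=0.5)
print(mask.sum(), "of", mask.size, "patches masked")  # 4 of 16, all in the left half
```

With a random strategy, half of the masked patches would on average fall on background that carries no diagnostic signal; restricting sampling to valid patches forces the reconstruction objective onto the region of interest, which is the mechanism the abstract credits for the gains in the low-data regimes.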
Related papers
- AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking [5.844539603252746]
Masked image modeling (MIM) has shown effectiveness by reconstructing randomly masked images to learn detailed representations.
We propose AnatoMask, a novel MIM method that leverages reconstruction loss to dynamically identify and mask out anatomically significant regions.
arXiv Detail & Related papers (2024-07-09T00:15:52Z)
- MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder [26.830574964308962]
We introduce MedFLIP, a Fast Language-Image Pre-training method for Medical analysis.
We explore MAEs for zero-shot learning with crossed domains, which enhances the model's ability to learn from limited data.
Lastly, we validate that using language improves zero-shot performance for medical image analysis.
arXiv Detail & Related papers (2024-03-07T16:11:43Z)
- MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts [63.30352394004674]
Multi-task Self-supervised Continual Learning (MUSCLE) is a novel self-supervised pre-training pipeline for medical imaging tasks.
MUSCLE aggregates X-rays collected from multiple body parts for representation learning, and adopts a well-designed continual learning procedure.
We evaluate MUSCLE using 9 real-world X-ray datasets with various tasks, including pneumonia classification, skeletal abnormality classification, lung segmentation, and tuberculosis (TB) detection.
arXiv Detail & Related papers (2023-10-03T12:19:19Z)
- DINO-CXR: A self-supervised method based on vision transformer for chest X-ray classification [0.9883261192383611]
We propose DINO-CXR, a novel adaptation of the self-supervised method DINO, based on a vision transformer for chest X-ray classification.
A comparative analysis is performed to show the effectiveness of the proposed method for both pneumonia and COVID-19 detection.
arXiv Detail & Related papers (2023-08-01T11:58:49Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- COVID-19 Detection Based on Self-Supervised Transfer Learning Using Chest X-Ray Images [38.65823547986758]
We propose a new learning scheme called self-supervised transfer learning for detecting COVID-19 from chest X-ray (CXR) images.
We provide quantitative evaluation on the largest open COVID-19 CXR dataset and qualitative results for visual inspection.
arXiv Detail & Related papers (2022-12-19T07:10:51Z)
- Self-Supervised-RCNN for Medical Image Segmentation with Limited Data Annotation [0.16490701092527607]
We propose an alternative deep learning training strategy based on self-supervised pretraining on unlabeled MRI scans.
Our pretraining approach first randomly applies different distortions to random areas of unlabeled images, and then predicts the type of distortion and the loss of information.
The effectiveness of the proposed method for segmentation tasks in different pre-training and fine-tuning scenarios is evaluated.
arXiv Detail & Related papers (2022-07-17T13:28:52Z)
- Intelligent Masking: Deep Q-Learning for Context Encoding in Medical Image Analysis [48.02011627390706]
We develop a novel self-supervised approach that occludes targeted regions to improve the pre-training procedure.
We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
arXiv Detail & Related papers (2022-03-25T19:05:06Z) - EMT-NET: Efficient multitask network for computer-aided diagnosis of
breast cancer [58.720142291102135]
We propose an efficient and light-weighted learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification are 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z) - A Multi-Stage Attentive Transfer Learning Framework for Improving
COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.