Advancing Radiograph Representation Learning with Masked Record Modeling
- URL: http://arxiv.org/abs/2301.13155v1
- Date: Mon, 30 Jan 2023 18:33:32 GMT
- Title: Advancing Radiograph Representation Learning with Masked Record Modeling
- Authors: Hong-Yu Zhou, Chenyu Lian, Liansheng Wang, Yizhou Yu
- Abstract summary: We formulate the self- and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM).
MRM reconstructs masked image patches and masked report tokens following a multi-task scheme to learn knowledge-enhanced semantic representations.
Specifically, we find that MRM offers superior performance in label-efficient fine-tuning.
- Score: 52.04899592688968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern studies in radiograph representation learning rely on either
self-supervision to encode invariant semantics or associated radiology reports
to incorporate medical expertise, while the complementarity between them is
barely noticed. To explore this, we formulate the self- and report-completion
as two complementary objectives and present a unified framework based on masked
record modeling (MRM). In practice, MRM reconstructs masked image patches and
masked report tokens following a multi-task scheme to learn knowledge-enhanced
semantic representations. With MRM pre-training, we obtain pre-trained models
that can be well transferred to various radiography tasks. Specifically, we
find that MRM offers superior performance in label-efficient fine-tuning. For
instance, MRM achieves 88.5% mean AUC on CheXpert using 1% labeled data,
outperforming previous R$^2$L methods with 100% labels. On NIH ChestX-ray, MRM
outperforms the best performing counterpart by about 3% under small labeling
ratios. Besides, MRM surpasses self- and report-supervised pre-training in
identifying the pneumonia type and the pneumothorax area, sometimes by large
margins.
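The multi-task scheme described in the abstract (reconstruct masked image patches, reconstruct masked report tokens, combine the two objectives) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the mask ratios, the choice of mean squared error and cross-entropy as the two reconstruction losses, and the `report_weight` parameter are all assumptions introduced here, since the abstract does not specify them.

```python
import math
import random


def random_mask(num_items, mask_ratio, rng):
    """Pick which positions to hide; returns a sorted list of masked indices."""
    num_masked = int(round(num_items * mask_ratio))
    return sorted(rng.sample(range(num_items), num_masked))


def masked_mse(pred_patches, true_patches, masked_idx):
    """Mean squared error computed over the masked image patches only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(pred_patches[i], true_patches[i]):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


def masked_cross_entropy(token_probs, true_tokens, masked_idx):
    """Mean negative log-likelihood of the true token at masked positions."""
    if not masked_idx:
        return 0.0
    nll = [-math.log(token_probs[i][true_tokens[i]]) for i in masked_idx]
    return sum(nll) / len(nll)


def multitask_loss(image_loss, report_loss, report_weight=1.0):
    """Weighted sum of the two reconstruction objectives (weight is assumed)."""
    return image_loss + report_weight * report_loss


# Toy run with hypothetical sizes and ratios (assumptions, not from the abstract).
rng = random.Random(0)
patch_mask = random_mask(num_items=16, mask_ratio=0.75, rng=rng)
token_mask = random_mask(num_items=10, mask_ratio=0.5, rng=rng)

true_patches = [[0.1 * i, 0.2 * i] for i in range(16)]
pred_patches = [[v + 0.05 for v in patch] for patch in true_patches]
token_probs = [[0.7, 0.2, 0.1] for _ in range(10)]  # stand-in model outputs
true_tokens = [0] * 10

loss = multitask_loss(
    masked_mse(pred_patches, true_patches, patch_mask),
    masked_cross_entropy(token_probs, true_tokens, token_mask),
)
print(loss)
```

In a real pre-training setup the predictions would come from an image encoder and a report decoder trained jointly; here they are fixed toy values so that only the masking and loss combination are shown.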
Related papers
- Multi-modal Masked Siamese Network Improves Chest X-Ray Representation Learning [46.674521557701816]
We propose to incorporate EHR data during self-supervised pretraining with a Masked Siamese Network (MSN) to enhance the quality of chest X-ray representations.
Our work highlights the potential of EHR-enhanced self-supervised pre-training for medical imaging.
arXiv Detail & Related papers (2024-07-05T12:04:12Z)
- MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis [1.2903829793534272]
Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions.
Efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records.
This paper introduces MedPromptX, the first model to integrate multimodal large language models (MLLMs), few-shot prompting (FP) and visual grounding (VG)
Results demonstrate the SOTA performance of MedPromptX, achieving an 11% improvement in F1-score compared to the baselines.
arXiv Detail & Related papers (2024-03-22T19:19:51Z)
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis [6.712251433139412]
Pretraining vision transformers (ViT) with attention-guided masked image modeling (MIM) has been shown to increase downstream accuracy for natural image analysis.
We developed a co-distilled Swin transformer that combines a noisy momentum updated teacher to guide selective masking for MIM.
arXiv Detail & Related papers (2023-10-02T13:53:55Z)
- Learning to Generalize towards Unseen Domains via a Content-Aware Style Invariant Model for Disease Detection from Chest X-rays [2.2835858158799405]
Performance degradation due to distribution discrepancy is a longstanding challenge in intelligent imaging.
Recent studies have demonstrated that CNNs are biased toward styles rather than content.
We employ the novel on-the-fly style randomization modules at both image (SRM-IL) and feature (SRM-FL) levels to create rich style perturbed features.
arXiv Detail & Related papers (2023-02-27T17:30:00Z)
- From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.
To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data.
PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
arXiv Detail & Related papers (2022-12-09T10:21:56Z)
- RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representation from X-Ray Images [38.65823547986758]
We present a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representation from X-ray images.
When using the entire training set, RGMIM outperformed other comparable methods, achieving a 0.962 lung disease detection accuracy.
arXiv Detail & Related papers (2022-11-01T07:41:03Z)
- Hierarchies of Reward Machines [75.55324974788475]
Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine.
We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs.
arXiv Detail & Related papers (2022-05-31T12:39:24Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)