Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
- URL: http://arxiv.org/abs/2311.17597v2
- Date: Thu, 30 Nov 2023 02:06:13 GMT
- Title: Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
- Authors: Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia
- Abstract summary: Self-supervised learning is an efficient pre-training method for medical image analysis.
We propose MedCoSS, a continual self-supervised learning approach for multi-modal medical data.
We conduct continual self-supervised pre-training on a large-scale multi-modal unlabeled dataset.
- Score: 36.33882718631217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning is an efficient pre-training method for medical
image analysis. However, current research is mostly confined to
specific-modality data pre-training, consuming considerable time and resources
without achieving universality across different modalities. A straightforward
solution is to combine all modality data for joint self-supervised pre-training,
but this poses practical challenges. Firstly, our experiments reveal conflicts in
representation learning as the number of modalities increases. Secondly,
multi-modal data collected in advance cannot cover all real-world scenarios. In
this paper, we reconsider versatile self-supervised learning from the
perspective of continual learning and propose MedCoSS, a continual
self-supervised learning approach for multi-modal medical data. Unlike joint
self-supervised learning, MedCoSS assigns different modality data to different
training stages, forming a multi-stage pre-training process. To balance modal
conflicts and prevent catastrophic forgetting, we propose a rehearsal-based
continual learning method. We introduce a k-means sampling strategy to retain
data from previous modalities and rehearse it when learning new modalities.
Instead of executing the pretext task on buffer data, a feature distillation
strategy and an intra-modal mixup strategy are applied to these data for
knowledge retention. We conduct continual self-supervised pre-training on a
large-scale multi-modal unlabeled dataset, including clinical reports, X-rays,
CT scans, MRI scans, and pathological images. Experimental results demonstrate
MedCoSS's exceptional generalization ability across nine downstream datasets
and its significant scalability in integrating new modality data. Code and
pre-trained weights are available at https://github.com/yeerwen/MedCoSS.
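The k-means rehearsal sampling described above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' released code (see the repository linked above for that): it presumes per-sample encoder features have already been extracted, and it keeps the single sample nearest each centroid, whereas the paper's exact sampling may differ.

```python
# Hedged sketch of k-means rehearsal sampling: after pre-training on one
# modality, keep a small, diverse buffer of its samples for later rehearsal.
import numpy as np
from sklearn.cluster import KMeans

def build_rehearsal_buffer(features: np.ndarray, buffer_size: int) -> np.ndarray:
    """Return indices of `buffer_size` representative samples.

    features: (n_samples, dim) matrix of encoder features for the modality
    whose pre-training stage has just finished.
    """
    kmeans = KMeans(n_clusters=buffer_size, n_init=10).fit(features)
    indices = []
    for c, centroid in enumerate(kmeans.cluster_centers_):
        members = np.where(kmeans.labels_ == c)[0]
        # Keep the member closest to its centroid as the cluster exemplar,
        # so the buffer covers the modality's feature distribution.
        dists = np.linalg.norm(features[members] - centroid, axis=1)
        indices.append(members[np.argmin(dists)])
    return np.asarray(indices)
```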
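Similarly, here is a hedged sketch of the two knowledge-retention strategies applied to buffer data instead of the pretext task: feature distillation against a frozen copy of the previous-stage encoder, and intra-modal mixup of two buffer batches from the same modality. The MSE objective, the Beta(alpha, alpha) mixing distribution, and all names are illustrative assumptions; mixup as written applies to continuous inputs such as images.

```python
# Hedged sketch: knowledge retention on buffer data via intra-modal mixup
# plus feature distillation against the frozen previous-stage encoder.
import torch
import torch.nn.functional as F

def retention_loss(encoder, frozen_encoder, x_a, x_b, alpha: float = 0.4):
    """x_a, x_b: two batches of buffer data from the same previous modality."""
    # Intra-modal mixup: convex combination of two same-modality batches.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x_a + (1.0 - lam) * x_b

    with torch.no_grad():  # the old encoder acts as a frozen teacher
        target = frozen_encoder(x_mix)
    student = encoder(x_mix)

    # Feature distillation: keep the new encoder's features close to the
    # old ones, mitigating catastrophic forgetting on earlier modalities.
    return F.mse_loss(student, target)
```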
Related papers
- Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification [2.5091334993691206]
Development of a robust deep-learning model for retinal disease diagnosis requires a substantial dataset for training.
The capacity to generalize effectively on smaller datasets remains a persistent challenge.
We combine a wide range of data sources to improve performance and generalization to new data.
arXiv Detail & Related papers (2024-09-17T17:22:35Z)
- Towards Precision Healthcare: Robust Fusion of Time Series and Image Data [8.579651833717763]
We introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information.
We also handle imbalanced datasets and use an uncertainty loss function, yielding improved results (see the sketch after this entry).
Our experiments show that our method is effective in improving multimodal deep learning for clinical applications.
arXiv Detail & Related papers (2024-05-24T11:18:13Z)
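A minimal sketch of the two-encoder design this entry describes, assuming (since the summary does not specify it) that the "uncertainty loss" is homoscedastic uncertainty weighting with a learned log-variance; the module names, concatenation fusion, and classification head are all illustrative, not the paper's code.

```python
# Hypothetical two-encoder fusion of image and time-series data with a
# learned-uncertainty-weighted loss (one plausible reading of the summary).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoderFusion(nn.Module):
    def __init__(self, image_encoder: nn.Module, series_encoder: nn.Module,
                 fused_dim: int, n_classes: int):
        super().__init__()
        self.image_encoder = image_encoder    # e.g. a CNN over images
        self.series_encoder = series_encoder  # e.g. a Transformer over time series
        self.head = nn.Linear(fused_dim, n_classes)
        # Learned log-variance that scales the task loss (uncertainty weighting).
        self.log_var = nn.Parameter(torch.zeros(()))

    def forward(self, image: torch.Tensor, series: torch.Tensor) -> torch.Tensor:
        # Each data type gets its own encoder; features are fused by concatenation.
        fused = torch.cat([self.image_encoder(image),
                           self.series_encoder(series)], dim=-1)
        return self.head(fused)

    def uncertainty_loss(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        ce = F.cross_entropy(logits, labels)
        # High predicted variance down-weights the loss; the +log_var term
        # stops the model from inflating the variance indefinitely.
        return torch.exp(-self.log_var) * ce + self.log_var
```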
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Learning Unseen Modality Interaction [54.23533023883659]
Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences.
We pose the problem of unseen modality interaction and introduce a first solution.
It exploits a module that projects the multi-dimensional features of different modalities into a common space while preserving rich information (see the sketch after this entry).
arXiv Detail & Related papers (2023-06-22T10:53:10Z)
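An illustrative sketch of the kind of common-space projection module this entry refers to: each modality's features are mapped by its own projector into one shared, normalized space, so that modality combinations unseen at training time can still be fused. The per-modality linear projectors and mean fusion are assumptions, not the paper's exact design.

```python
# Hypothetical common-space projection: heterogeneous per-modality features
# are projected into a single shared embedding space and fused there.
from typing import Dict
import torch
import torch.nn as nn

class CommonSpaceProjector(nn.Module):
    def __init__(self, in_dims: Dict[str, int], shared_dim: int):
        super().__init__()
        # One projector per modality, all mapping into the same shared space.
        self.projectors = nn.ModuleDict(
            {name: nn.Linear(dim, shared_dim) for name, dim in in_dims.items()})
        self.norm = nn.LayerNorm(shared_dim)

    def forward(self, feats: Dict[str, torch.Tensor]) -> torch.Tensor:
        # Whatever subset of modalities is present, project then average,
        # so unseen modality combinations remain valid inputs.
        shared = [self.norm(self.projectors[m](x)) for m, x in feats.items()]
        return torch.stack(shared, dim=0).mean(dim=0)
```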
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetics data from the Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information [77.80071279597665]
We propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training); see the sketch after this entry.
Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation.
arXiv Detail & Related papers (2022-11-17T18:59:49Z)
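The summary names the objective, maximizing multi-modal mutual information, but not its form; a standard trainable lower bound on mutual information is the InfoNCE loss sketched below. M3I's actual formulation unifies several pre-training objectives and may differ in detail.

```python
# Hedged sketch: InfoNCE, a common lower bound used to maximize mutual
# information between embeddings of two views/modalities of the same samples.
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """z_a, z_b: (batch, dim) paired embeddings; row i of each is the same sample."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature  # pairwise cosine similarities
    labels = torch.arange(z_a.size(0), device=z_a.device)
    # The matching pair sits on the diagonal; score it against all negatives.
    return F.cross_entropy(logits, labels)
```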
- Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions [66.40971096248946]
In this paper, we collect a series of MedISeg tricks for different model implementation phases.
We experimentally explore the effectiveness of these tricks on consistent baselines.
We also open-source a strong MedISeg repository in which each component is plug-and-play.
arXiv Detail & Related papers (2022-09-21T12:30:05Z)
- Decentralized Distributed Learning with Privacy-Preserving Data Synthesis [9.276097219140073]
In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data.
Recent privacy regulations hinder the possibility of sharing data and, consequently, of developing machine learning-based solutions that support diagnosis and prognosis.
We present a decentralized distributed method that integrates features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy.
arXiv Detail & Related papers (2022-06-20T23:49:38Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.