Related papers: Overcoming Dimensional Collapse in Self-supervised Contrastive Learning for Medical Image Segmentation

Overcoming Dimensional Collapse in Self-supervised Contrastive Learning for Medical Image Segmentation

URL: http://arxiv.org/abs/2402.14611v2
Date: Tue, 27 Feb 2024 16:39:35 GMT
Title: Overcoming Dimensional Collapse in Self-supervised Contrastive Learning for Medical Image Segmentation
Authors: Jamshid Hassanpour, Vinkle Srivastav, Didier Mutter, Nicolas Padoy
Abstract summary: We investigate the application of contrastive learning to the domain of medical image analysis. Our findings reveal that MoCo v2, a state-of-the-art contrastive learning method, encounters dimensional collapse when applied to medical images. To address this, we propose two key contributions: local feature learning and feature decorrelation.
Score: 2.6764957223405657
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Self-supervised learning (SSL) approaches have achieved great success when the amount of labeled data is limited. Within SSL, models learn robust feature representations by solving pretext tasks. One such pretext task is contrastive learning, which involves forming pairs of similar and dissimilar input samples, guiding the model to distinguish between them. In this work, we investigate the application of contrastive learning to the domain of medical image analysis. Our findings reveal that MoCo v2, a state-of-the-art contrastive learning method, encounters dimensional collapse when applied to medical images. This is attributed to the high degree of inter-image similarity shared between the medical images. To address this, we propose two key contributions: local feature learning and feature decorrelation. Local feature learning improves the ability of the model to focus on the local regions of the image, while feature decorrelation removes the linear dependence among the features. Our experimental findings demonstrate that our contributions significantly enhance the model's performance in the downstream task of medical segmentation, both in the linear evaluation and full fine-tuning settings. This work illustrates the importance of effectively adapting SSL techniques to the characteristics of medical imaging tasks. The source code will be made publicly available at: https://github.com/CAMMA-public/med-moco

Related papers

LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model [55.80651780294357]
State-of-the-art medical multi-modal large language models (med-MLLM) leverage instruction-following data in pre-training. LoGra-Med is a new multi-graph alignment algorithm that enforces triplet correlations across image modalities, conversation-based descriptions, and extended captions. Our results show LoGra-Med matches LLAVA-Med performance on 600K image-text pairs for Medical VQA and significantly outperforms it when trained on 10% of the data.
arXiv Detail & Related papers (2024-10-03T15:52:03Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Causality-Driven One-Shot Learning for Prostate Cancer Grading from MRI [1.049712834719005]
We present a novel method to automatically classify medical images that learns and leverages weak causal signals in the image. Our framework consists of a convolutional neural network backbone and a causality-extractor module. Our findings show that causal relationships among features play a crucial role in enhancing the model's ability to discern relevant information.
arXiv Detail & Related papers (2023-09-19T16:08:33Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
Graph Self-Supervised Learning for Endoscopic Image Matching [1.8275108630751844]
We propose a novel self-supervised approach that combines Convolutional Neural Networks for capturing local visual appearance and attention-based Graph Neural Networks for modeling spatial relationships between key-points. Our approach is trained in a fully self-supervised scheme without the need for labeled data. Our approach outperforms state-of-the-art handcrafted and deep learning-based methods, demonstrating exceptional performance in terms of precision rate (1) and matching score (99.3%)
arXiv Detail & Related papers (2023-06-19T19:53:41Z)
Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Analysis [6.87667643104543]
Self-supervised learning (SSL) has recently achieved promising performance for 3D medical image analysis tasks. Most current methods follow existing SSL paradigm originally designed for photographic or natural images. We propose a new self-supervised learning framework, namely Alice, that explicitly fulfills Anatomical invariance modeling and semantic alignment.
arXiv Detail & Related papers (2023-02-11T06:36:20Z)
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities. We explicitly account for prior images and reports when available during both training and fine-tuning. Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
GraVIS: Grouping Augmented Views from Independent Sources for Dermatology Analysis [52.04899592688968]
We propose GraVIS, which is specifically optimized for learning self-supervised features from dermatology images. GraVIS significantly outperforms its transfer learning and self-supervised learning counterparts in both lesion segmentation and disease classification tasks.
arXiv Detail & Related papers (2023-01-11T11:38:37Z)
Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA) First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features. Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z)
Positional Contrastive Learning for Volumetric Medical Image Segmentation [13.086140606803408]
We propose a novel positional contrastive learning framework to generate contrastive data pairs. The proposed PCL method can substantially improve the segmentation performance compared to existing methods in both semi-supervised setting and transfer learning setting.
arXiv Detail & Related papers (2021-06-16T22:15:28Z)
Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation. We construct our few-shot image segmentor using a deep convolutional network trained episodically. We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
Improving Calibration and Out-of-Distribution Detection in Medical Image Segmentation with Convolutional Neural Networks [8.219843232619551]
Convolutional Neural Networks (CNNs) have shown to be powerful medical image segmentation models. We advocate for multi-task learning, i.e., training a single model on several different datasets. We show that not only a single CNN learns to automatically recognize the context and accurately segment the organ of interest in each context, but also that such a joint model often has more accurate and better-calibrated predictions.
arXiv Detail & Related papers (2020-04-12T23:42:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.