Related papers: Multi-level Asymmetric Contrastive Learning for Volumetric Medical Image Segmentation Pre-training

Multi-level Asymmetric Contrastive Learning for Volumetric Medical Image Segmentation Pre-training

URL: http://arxiv.org/abs/2309.11876v2
Date: Mon, 13 May 2024 07:35:42 GMT
Title: Multi-level Asymmetric Contrastive Learning for Volumetric Medical Image Segmentation Pre-training
Authors: Shuang Zeng, Lei Zhu, Xinliang Zhang, Qian Chen, Hangzhou He, Lujia Jin, Zifeng Tian, Qiushi Ren, Zhaoheng Xie, Yanye Lu,
Abstract summary: We propose a novel contrastive learning framework named MACL for volumetric medical image segmentation pre-training. Experiments on 12 medical image datasets indicate our MACL framework outperforms existing 11 contrastive learning strategies.
Score: 18.01020160596681
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Medical image segmentation is a fundamental yet challenging task due to the arduous process of acquiring large volumes of high-quality labeled data from experts. Contrastive learning offers a promising but still problematic solution to this dilemma. Because existing medical contrastive learning strategies focus on extracting image-level representation, which ignores abundant multi-level representations. And they underutilize the decoder either by random initialization or separate pre-training from the encoder, thereby neglecting the potential collaboration between the encoder and decoder. To address these issues, we propose a novel multi-level asymmetric contrastive learning framework named MACL for volumetric medical image segmentation pre-training. Specifically, we design an asymmetric contrastive learning structure to pre-train encoder and decoder simultaneously to provide better initialization for segmentation models. Moreover, we develop a multi-level contrastive learning strategy that integrates correspondences across feature-level, image-level, and pixel-level representations to ensure the encoder and decoder capture comprehensive details from representations of varying scales and granularities during the pre-training phase. Finally, experiments on 12 volumetric medical image datasets indicate our MACL framework outperforms existing 11 contrastive learning strategies. {\itshape i.e.} Our MACL achieves a superior performance with more precise predictions from visualization figures and 2.28\%, 1.32\%, 1.62\% and 1.60\% Average Dice higher than previous best results on CHD, MMWHS, CHAOS and AMOS, respectively. And our MACL also has a strong generalization ability among 5 variant U-Net backbones. Our code will be available at https://github.com/stevezs315/MACL.

Related papers

pyMEAL: A Multi-Encoder Augmentation-Aware Learning for Robust and Generalizable Medical Image Translation [0.0]
3D medical imaging still suffers from data scarcity and inconsistencies due to acquisition protocols, scanner differences, and patient motion.<n>Traditional augmentation uses a single pipeline for all transformations, disregarding the unique traits of each augmentation.<n>We propose a Multi-encoder Augmentation-Aware Learning framework that leverages four distinct augmentation variants processed through dedicated encoders.
arXiv Detail & Related papers (2025-05-30T10:01:23Z)
SuperCL: Superpixel Guided Contrastive Learning for Medical Image Segmentation Pre-training [17.920724846400585]
We propose a novel contrastive learning approach named SuperCL for medical image segmentation pre-training. Our SuperCL exploits the structural prior and pixel correlation of images by introducing two novel contrastive pairs generation strategies. Experiments on 8 medical image datasets indicate our SuperCL outperforms existing 12 methods.
arXiv Detail & Related papers (2025-04-20T20:57:03Z)
Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation [21.183229457060634]
We pre-train Hi-End-MAE on a large-scale dataset of 10K CT scans and evaluate its performance across seven public medical image segmentation benchmarks. Hi-End-MAE achieves superior transfer learning capabilities across various downstream tasks, revealing the potential of ViT in medical imaging applications.
arXiv Detail & Related papers (2025-02-12T12:14:02Z)
Large Language Models for Multimodal Deformable Image Registration [50.91473745610945]
We propose a novel coarse-to-fine MDIR framework,LLM-Morph, for aligning the deep features from different modal medical images. Specifically, we first utilize a CNN encoder to extract deep visual features from cross-modal image pairs, then we use the first adapter to adjust these tokens, and use LoRA in pre-trained LLMs to fine-tune their weights. Third, for the alignment of tokens, we utilize other four adapters to transform the LLM-encoded tokens into multi-scale visual features, generating multi-scale deformation fields and facilitating the coarse-to-fine MDIR task
arXiv Detail & Related papers (2024-08-20T09:58:30Z)
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training [103.72844619581811]
We build performant Multimodal Large Language Models (MLLMs) In particular, we study the importance of various architecture components and data choices. We demonstrate that for large-scale multimodal pre-training using a careful mix of image-caption, interleaved image-text, and text-only data.
arXiv Detail & Related papers (2024-03-14T17:51:32Z)
Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation [11.637738540262797]
This study introduces Semi-Mamba-UNet, which integrates a purely visual Mamba-based encoder-decoder architecture with a conventional CNN-based UNet into a semi-supervised learning framework. This innovative SSL approach leverages both networks to generate pseudo-labels and cross-supervise one another at the pixel level simultaneously. We introduce a self-supervised pixel-level contrastive learning strategy that employs a pair of projectors to enhance the feature learning capabilities further.
arXiv Detail & Related papers (2024-02-11T17:09:21Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images. We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations. The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$2$SNet) to finish diverse segmentation from medical image. Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
Multi-scale Transformer Network with Edge-aware Pre-training for Cross-Modality MR Image Synthesis [52.41439725865149]
Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from given ones. Existing (supervised learning) methods often require a large number of paired multi-modal data to train an effective synthesis model. We propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis.
arXiv Detail & Related papers (2022-12-02T11:40:40Z)
IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised Medical Image Segmentation [3.6748639131154315]
We extend the concept of metric learning to the segmentation task. We propose a simple convolutional projection head for obtaining dense pixel-level features. A bidirectional regularization mechanism involving two-stream regularization training is devised for the downstream task.
arXiv Detail & Related papers (2022-10-26T23:11:02Z)
Positional Contrastive Learning for Volumetric Medical Image Segmentation [13.086140606803408]
We propose a novel positional contrastive learning framework to generate contrastive data pairs. The proposed PCL method can substantially improve the segmentation performance compared to existing methods in both semi-supervised setting and transfer learning setting.
arXiv Detail & Related papers (2021-06-16T22:15:28Z)
Self-Ensembling Contrastive Learning for Semi-Supervised Medical Image Segmentation [6.889911520730388]
We aim to boost the performance of semi-supervised learning for medical image segmentation with limited labels. We learn latent representations directly at feature-level by imposing contrastive loss on unlabeled images. We conduct experiments on an MRI and a CT segmentation dataset and demonstrate that the proposed method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-05-27T03:27:58Z)
Modeling the Probabilistic Distribution of Unlabeled Data forOne-shot Medical Image Segmentation [40.41161371507547]
We develop a data augmentation method for one-shot brain magnetic resonance imaging (MRI) image segmentation. Our method exploits only one labeled MRI image (named atlas) and a few unlabeled images. Our method outperforms the state-of-the-art one-shot medical segmentation methods.
arXiv Detail & Related papers (2021-02-03T12:28:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.