Joint Self-Supervised Image-Volume Representation Learning with
Intra-Inter Contrastive Clustering
- URL: http://arxiv.org/abs/2212.01893v1
- Date: Sun, 4 Dec 2022 18:57:44 GMT
- Title: Joint Self-Supervised Image-Volume Representation Learning with
Intra-Inter Contrastive Clustering
- Authors: Duy M. H. Nguyen, Hoang Nguyen, Mai T. N. Truong, Tri Cao, Binh T.
Nguyen, Nhat Ho, Paul Swoboda, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag
- Abstract summary: Self-supervised learning (SSL) can overcome the lack of labeled training samples by learning feature representations from unlabeled data.
Most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes.
We propose a novel framework for unsupervised joint learning on 2D and 3D data modalities.
- Score: 31.52291149830299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collecting large-scale medical datasets with fully annotated samples for
training of deep networks is prohibitively expensive, especially for 3D volume
data. Recent breakthroughs in self-supervised learning (SSL) offer the ability
to overcome the lack of labeled training samples by learning feature
representations from unlabeled data. However, most current SSL techniques in
the medical field have been designed for either 2D images or 3D volumes. In
practice, this restricts the capability to fully leverage unlabeled data from
numerous sources, which may include both 2D and 3D data. Additionally, the use
of these pre-trained networks is constrained to downstream tasks with
compatible data dimensions. In this paper, we propose a novel framework for
unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D
images or 2D slices extracted from 3D volumes, we construct an SSL task based
on a 2D contrastive clustering problem for distinct classes. The 3D volumes are
exploited by computing a vector embedding for each slice and then assembling a
holistic feature through deformable self-attention mechanisms in a Transformer,
which incorporates long-range dependencies between slices inside 3D volumes.
These holistic features are further utilized to define a novel 3D
clustering-agreement-based SSL task and a masked embedding prediction task
inspired by pre-trained language models. Experiments on downstream tasks, such
as 3D brain segmentation, lung nodule detection, 3D heart structure
segmentation,
and abnormal chest X-ray detection, demonstrate the effectiveness of our joint
2D and 3D SSL approach. We improve on plain 2D Deep-ClusterV2 and SwAV by a
significant margin and also surpass various modern 2D and 3D SSL approaches.
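As a rough illustration of the pipeline the abstract describes, the sketch below uses a shared 2D encoder for slices, a standard Transformer layer standing in for the paper's deformable self-attention, and learnable prototypes for SwAV-style contrastive clustering. All module names and hyperparameters are hypothetical, not the authors' implementation.
```python
# Minimal, hypothetical sketch of the joint 2D/3D SSL pipeline described
# above; standard self-attention stands in for deformable self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSSLBackbone(nn.Module):
    def __init__(self, dim=128, n_prototypes=300, n_heads=4):
        super().__init__()
        # 2D encoder shared by standalone images and volume slices
        self.encoder2d = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )
        # Attention over slice embeddings yields one holistic volume
        # feature that captures long-range inter-slice dependencies
        self.slice_attn = nn.TransformerEncoderLayer(
            d_model=dim, nhead=n_heads, batch_first=True)
        # Learnable prototypes for the contrastive clustering task
        self.prototypes = nn.Linear(dim, n_prototypes, bias=False)

    def forward(self, volume):                    # volume: (B, S, 1, H, W)
        b, s = volume.shape[:2]
        z = self.encoder2d(volume.flatten(0, 1))  # per-slice embeddings
        h = self.slice_attn(z.view(b, s, -1))     # inter-slice attention
        h = F.normalize(h.mean(dim=1), dim=-1)    # holistic 3D feature
        return self.prototypes(h)                 # cluster-assignment scores

model = JointSSLBackbone()
scores = model(torch.randn(2, 16, 1, 64, 64))     # 2 volumes, 16 slices each
print(scores.shape)                               # torch.Size([2, 300])
```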
Related papers
- Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation [68.60747298865394] (arXiv, 2024-06-03)
We propose a new cross-dimensional SSL framework based on a pseudo-3D transformation (CDSSL-P3D).
Specifically, we introduce an image transformation based on the im2col algorithm, which converts 2D images into a format consistent with 3D data.
This transformation enables seamless integration of 2D and 3D data, and facilitates cross-dimensional self-supervised learning for 3D medical image analysis.
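As a toy illustration of the im2col idea (not the CDSSL-P3D code), PyTorch's unfold can extract every k x k neighbourhood of a 2D image and stack it along a new pseudo-depth axis, so the result is laid out like 3D data; all parameters here are illustrative.
```python
# Hedged sketch: an im2col-style transform that gives a 2D image a
# pseudo-depth axis so it matches the layout of 3D inputs.
import torch
import torch.nn.functional as F

def im2col_pseudo3d(img, k=3, stride=1):
    """img: (B, C, H, W) -> pseudo-3D tensor (B, C, k*k, H', W')."""
    b, c, h, w = img.shape
    cols = F.unfold(img, kernel_size=k, stride=stride)  # (B, C*k*k, L)
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    # Each k*k neighbourhood becomes a column along a new "depth" axis
    return cols.view(b, c, k * k, h_out, w_out)

x = torch.randn(4, 1, 32, 32)
print(im2col_pseudo3d(x).shape)  # torch.Size([4, 1, 9, 30, 30])
```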
- Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D Networks
for 3D Coherent Layer Segmentation of Retinal OCT Images with Full and Sparse
Annotations [32.69359482975795] (arXiv, 2023-12-04)
This work presents a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to obtain continuous 3D retinal layer surfaces from OCT volumes.
Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction.
- Leveraging Large-Scale Pretrained Vision Foundation Models for
Label-Efficient 3D Point Cloud Segmentation [67.07112533415116] (arXiv, 2023-11-03)
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
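A plain majority vote over the per-point predictions of several models is the simplest instance of such a fusion; the snippet below is an illustrative stand-in for the paper's strategy, with shapes chosen for the example.
```python
# Hedged sketch of majority-vote label fusion: several models' per-point
# predictions are combined into one pseudo label per point.
import numpy as np

def fuse_pseudo_labels(predictions, n_classes):
    """predictions: (n_models, n_points) int labels -> (n_points,) fused."""
    votes = np.zeros((predictions.shape[1], n_classes), dtype=np.int64)
    for pred in predictions:                      # accumulate one-hot votes
        votes[np.arange(pred.size), pred] += 1
    return votes.argmax(axis=1)                   # majority wins

preds = np.array([[0, 1, 2, 1], [0, 1, 1, 1], [2, 1, 2, 0]])
print(fuse_pseudo_labels(preds, n_classes=3))     # [0 1 2 1]
```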
- 3D Arterial Segmentation via Single 2D Projections and Depth Supervision in
Contrast-Enhanced CT Images [9.324710035242397] (arXiv, 2023-09-15)
Training 3D deep networks requires large amounts of manual 3D annotations from experts.
We propose a novel method to segment the 3D peripancreatic arteries solely from one annotated 2D projection.
We demonstrate that by annotating a single, randomly chosen projection for each training sample, we obtain comparable performance to annotating multiple 2D projections.
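One way to read this setup (illustrative, not necessarily the authors' exact formulation): a maximum-intensity projection gives the single 2D image to annotate, while the argmax index along the projection axis supplies a per-pixel depth cue.
```python
# Hedged sketch: a maximum-intensity projection of a 3D volume yields a 2D
# image for annotation, and the argmax index provides per-pixel depth cues.
import numpy as np

def project_with_depth(volume, axis=0):
    """volume: (D, H, W) -> (projection (H, W), depth map (H, W))."""
    projection = volume.max(axis=axis)   # 2D image shown to the annotator
    depth = volume.argmax(axis=axis)     # where along the ray it came from
    return projection, depth

vol = np.random.rand(32, 64, 64)
proj, depth = project_with_depth(vol)
print(proj.shape, depth.shape)           # (64, 64) (64, 64)
```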
- Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud
Recognition [108.07591240357306] (arXiv, 2023-08-18)
We tackle the data scarcity challenge in few-shot point cloud recognition of 3D objects by using a joint prediction from a conventional 3D model and a well-trained 2D model.
We find that the crux is the less effective training of the "joint hard samples", which have high-confidence predictions on different wrong labels.
Our proposed invariant training strategy, called InvJoint, not only emphasizes training on the hard samples, but also seeks invariance between the conflicting 2D and 3D ambiguous predictions.
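The notion of a joint hard sample can be made concrete in a few lines (threshold and names are illustrative, not the paper's definition): a sample is hard when both heads are confident, both are wrong, and they disagree.
```python
# Hedged sketch: flag samples where the 2D and 3D heads are both confident
# on different wrong labels ("joint hard samples").
import torch

def joint_hard_mask(logits2d, logits3d, labels, conf_thresh=0.8):
    p2d, p3d = logits2d.softmax(-1), logits3d.softmax(-1)
    c2d, y2d = p2d.max(-1)                   # confidence, predicted class
    c3d, y3d = p3d.max(-1)
    confident = (c2d > conf_thresh) & (c3d > conf_thresh)
    both_wrong = (y2d != labels) & (y3d != labels)
    return confident & both_wrong & (y2d != y3d)

logits2d, logits3d = torch.randn(8, 10) * 5, torch.randn(8, 10) * 5
labels = torch.randint(0, 10, (8,))
print(joint_hard_mask(logits2d, logits3d, labels))
```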
- Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic
Segmentation [82.47872784972861] (arXiv, 2023-08-05)
Cross-modal domain adaptation has been studied on the paired 2D image and 3D LiDAR data to ease the labeling costs for 3D LiDAR semantic segmentation (3DLSS) in the target domain.
This paper studies a new 3DLSS setting where a 2D dataset with semantic annotations (source) and paired but unannotated 2D image and 3D LiDAR data (target) are available.
To achieve 3DLSS in this scenario, we propose Cross-Modal and Cross-Domain Learning (CoMoDaL).
- Self-supervised learning via inter-modal reconstruction and feature
projection networks for label-efficient 3D-to-2D segmentation
[4.5206601127476445] (arXiv, 2023-07-06)
We propose a novel convolutional neural network (CNN) and self-supervised learning (SSL) method for label-efficient 3D-to-2D segmentation.
Results on different datasets demonstrate that the proposed CNN significantly improves the state of the art in scenarios with limited labeled data by up to 8% in Dice score.
- Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training
[65.75399500494343] (arXiv, 2023-02-27)
Masked Autoencoders (MAE) have shown promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Joint-MAE, a 2D-3D joint MAE framework for self-supervised 3D point cloud pre-training.
- Super Images -- A New 2D Perspective on 3D Medical Imaging Analysis [0.0] (arXiv, 2022-05-05)
We present a simple yet effective 2D method to handle 3D data while efficiently embedding 3D knowledge during training.
Our method generates a "super image" by stitching the slices of the 3D volume side by side.
While attaining results equal, if not superior, to those of 3D networks using only 2D counterparts, the method reduces model complexity by around threefold.
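The stitching step is simple enough to sketch directly (grid size illustrative):
```python
# Hedged sketch: tiling the S slices of a volume into one 2D "super image"
# so a plain 2D network can process the whole volume at once.
import numpy as np

def to_super_image(volume, cols=4):
    """volume: (S, H, W) -> (rows*H, cols*W) 2D mosaic of slices."""
    s, h, w = volume.shape
    rows = int(np.ceil(s / cols))
    canvas = np.zeros((rows * h, cols * w), dtype=volume.dtype)
    for i in range(s):
        r, c = divmod(i, cols)
        canvas[r * h:(r + 1) * h, c * w:(c + 1) * w] = volume[i]
    return canvas

vol = np.random.rand(16, 64, 64)
print(to_super_image(vol).shape)   # (256, 256)
```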
- 3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656] (arXiv, 2021-04-06)
We present a new approach that leverages 3D features extracted from a large-scale 3D data repository to enhance 2D features extracted from RGB images.
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during training.
Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration.
Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
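The first step, feature distillation from a frozen 3D teacher to a 2D student, can be sketched as follows; the plain normalization here is a simple stand-in for the paper's two-stage dimension normalization scheme, and all names are hypothetical.
```python
# Hedged sketch of 3D-to-2D distillation: a 2D student mimics features
# produced by a frozen, pretrained 3D teacher.
import torch
import torch.nn.functional as F

def distill_3d_to_2d(teacher3d, student2d, volume, images):
    with torch.no_grad():
        f3d = teacher3d(volume)          # target "simulated 3D" features
    f2d = student2d(images)              # 2D features to be aligned
    # Normalize both sides before matching (stand-in for the paper's
    # two-stage dimension normalization)
    f3d = F.normalize(f3d.flatten(1), dim=1)
    f2d = F.normalize(f2d.flatten(1), dim=1)
    return F.mse_loss(f2d, f3d)          # feature-mimicking loss

teacher3d = torch.nn.Linear(32, 16)      # stand-ins for real networks
student2d = torch.nn.Linear(24, 16)
loss = distill_3d_to_2d(teacher3d, student2d,
                        torch.randn(8, 32), torch.randn(8, 24))
print(loss.item())
```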