Voxel-level Siamese Representation Learning for Abdominal Multi-Organ
Segmentation
- URL: http://arxiv.org/abs/2105.07672v1
- Date: Mon, 17 May 2021 08:42:19 GMT
- Title: Voxel-level Siamese Representation Learning for Abdominal Multi-Organ
Segmentation
- Authors: Chae Eun Lee, Minyoung Chung, Yeong-Gil Shin
- Abstract summary: We propose a novel voxel-level Siamese representation learning method for abdominal multi-organ segmentation.
The proposed method enforces voxel-wise feature relations in the representation space for leveraging limited datasets more comprehensively.
Our experiments on the multi-organ dataset outperformed the existing approaches by 2% in Dice score coefficient.
- Score: 4.341575452368516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works in medical image segmentation have actively explored various
deep learning architectures or objective functions to encode high-level
features from volumetric data owing to limited image annotations. However, most
existing approaches tend to ignore cross-volume global context and define
context relations in the decision space. In this work, we propose a novel
voxel-level Siamese representation learning method for abdominal multi-organ
segmentation to improve representation space. The proposed method enforces
voxel-wise feature relations in the representation space for leveraging limited
datasets more comprehensively to achieve better performance. Inspired by recent
progress in contrastive learning, we suppressed voxel-wise relations from the
same class to be projected to the same point without using negative samples.
Moreover, we introduce a multi-resolution context aggregation method that
aggregates features from multiple hidden layers, which encodes both the global
and local contexts for segmentation. Our experiments on the multi-organ dataset
outperformed the existing approaches by 2% in Dice score coefficient. The
qualitative visualizations of the representation spaces demonstrate that the
improvements were gained primarily by a disentangled feature space.
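The core objective described in the abstract, pulling voxel-wise features of the same class toward a common point in representation space without negative samples, can be sketched as a negative cosine similarity between each voxel embedding and its class prototype. This is a minimal NumPy sketch under that reading, not the authors' implementation: the projection/prediction heads, stop-gradient, and multi-resolution aggregation from the paper are omitted, and all function names are illustrative.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Unit-normalize vectors along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def voxel_siamese_loss(features, labels):
    """Negative cosine similarity between each voxel embedding and its
    class prototype (the mean embedding of that class). Voxels of the
    same class are pulled toward one point; no negative pairs are used.

    features: (N, D) array of voxel embeddings
    labels:   (N,)   array of integer class labels per voxel
    """
    feats = l2_normalize(features)
    loss = 0.0
    classes = np.unique(labels)
    for c in classes:
        cls = feats[labels == c]                  # voxels of class c
        proto = l2_normalize(cls.mean(axis=0))    # class "target point"
        loss += -(cls @ proto).mean()             # pull voxels toward it
    return loss / len(classes)
```

When every voxel of a class already shares one direction, the loss reaches its minimum of -1; dispersed same-class voxels give a higher value, which gradient descent on the encoder would reduce.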
Related papers
- Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning [10.630297877530614]
We propose a novel Multi-Grained Contrast method (MGC) for unsupervised representation learning.
Specifically, we construct delicate multi-grained correspondences between positive views and then conduct multi-grained contrast by the correspondences to learn more general unsupervised representations.
Our method significantly outperforms the existing state-of-the-art methods on extensive downstream tasks, including object detection, instance segmentation, scene parsing, semantic segmentation and keypoint detection.
arXiv Detail & Related papers (2024-07-02T07:35:21Z) - Generalizable Entity Grounding via Assistance of Large Language Model [77.07759442298666]
We propose a novel approach to densely ground visual entities from a long caption.
We leverage a large multimodal model to extract semantic nouns, a class-agnostic segmentation model to generate entity-level segmentation, and a multi-modal feature fusion module to associate each semantic noun with its corresponding segmentation mask.
arXiv Detail & Related papers (2024-02-04T16:06:05Z) - Semantics-Aware Dynamic Localization and Refinement for Referring Image
Segmentation [102.25240608024063]
Referring image segmentation segments an image region described by a natural language expression.
We develop an algorithm that shifts the emphasis from localization-centric to segmentation-centric processing.
Compared to its counterparts, our method is more versatile yet effective.
arXiv Detail & Related papers (2023-03-11T08:42:40Z) - Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on fine-grained object recognition.
arXiv Detail & Related papers (2022-12-28T03:45:56Z) - Voxel-wise Adversarial Semi-supervised Learning for Medical Image
Segmentation [4.489713477369384]
We introduce a novel adversarial learning-based semi-supervised segmentation method for medical image segmentation.
Our method embeds both local and global features from multiple hidden layers and learns context relations between multiple classes.
Our method outperforms the current best-performing state-of-the-art semi-supervised learning approaches on image segmentation of the left atrium (single-class) and multi-organ (multi-class) datasets.
arXiv Detail & Related papers (2022-05-14T06:57:19Z) - Multi-scale and Cross-scale Contrastive Learning for Semantic
Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
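The supervised local-global constraint described above can be illustrated with a cross-scale contrastive loss: features from two scales are projected into a common space, and each feature at one scale is attracted to same-class features at the other scale and repelled from the rest. This is a hypothetical NumPy sketch of that idea, not the paper's actual loss; it assumes both scales are already projected to aligned per-position embeddings with shared labels.

```python
import numpy as np

def normalize(z, eps=1e-8):
    """Unit-normalize embeddings row-wise."""
    return z / (np.linalg.norm(z, axis=-1, keepdims=True) + eps)

def cross_scale_contrastive(z_lo, z_hi, labels, tau=0.1):
    """Supervised InfoNCE between two scales: each low-scale embedding
    is pulled toward high-scale embeddings of the same class and pushed
    away from the others.

    z_lo, z_hi: (N, D) embeddings from two scales, aligned positions
    labels:     (N,)   shared class labels
    """
    z_lo, z_hi = normalize(z_lo), normalize(z_hi)
    sim = (z_lo @ z_hi.T) / tau                    # (N, N) similarities
    sim = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = labels[:, None] == labels[None, :]       # same-class mask
    return -(log_p * pos).sum() / pos.sum()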
arXiv Detail & Related papers (2022-03-25T01:24:24Z) - SATS: Self-Attention Transfer for Continual Semantic Segmentation [50.51525791240729]
Continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning.
This study proposes to transfer a new type of information relevant to knowledge, i.e. the relationships between elements within each image.
The relationship information can be effectively obtained from the self-attention maps in a Transformer-style segmentation model.
arXiv Detail & Related papers (2022-03-15T06:09:28Z) - Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z) - Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z) - Momentum Contrastive Voxel-wise Representation Learning for
Semi-supervised Volumetric Medical Image Segmentation [2.3322477552758234]
We present a novel Contrastive Voxel-wise Representation (CVRL) method with geometric constraints to learn global-local visual representations for medical image segmentation.
Our framework can effectively learn global and local features by capturing 3D spatial context and rich anatomical information.
arXiv Detail & Related papers (2021-05-14T20:27:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.