Attention-Guided Supervised Contrastive Learning for Semantic
Segmentation
- URL: http://arxiv.org/abs/2106.01596v1
- Date: Thu, 3 Jun 2021 05:01:11 GMT
- Title: Attention-Guided Supervised Contrastive Learning for Semantic
Segmentation
- Authors: Ho Hin Lee, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Bennett A.
Landman, Yuankai Huo
- Abstract summary: In a per-pixel prediction task such as segmentation, more than one label can exist in a single image.
We propose an attention-guided supervised contrastive learning approach that highlights a single semantic object at a time as the target.
- Score: 16.729068267453897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has shown superior performance in embedding global and
spatially invariant features in computer vision (e.g., image classification).
However, its success in embedding local and spatially variant features is
still limited, especially for semantic segmentation. In a per-pixel prediction
task, more than one label can exist in a single image (e.g., an image may
contain a cat, a dog, and grass), so it is difficult to define 'positive' or
'negative' pairs in the canonical contrastive learning setting. In this paper,
we propose an attention-guided supervised contrastive learning approach that
highlights a single semantic object at a time as the target. With our design,
the same image can be embedded into different semantic clusters using semantic
attention (i.e., coarse semantic masks) as an additional input channel. To
achieve such attention, a novel two-stage training strategy is presented. We
evaluate the proposed method on multi-organ medical image segmentation, as our
major task, with both an in-house dataset and the BTCV 2015 dataset. Compared
with supervised and semi-supervised state-of-the-art methods on a ResNet-50
backbone, our pipeline yields substantial Dice score improvements of 5.53% and
6.09% on the two medical image segmentation cohorts, respectively. Performance
on natural images is assessed on the PASCAL VOC 2012 dataset, where the method
achieves a 2.75% improvement.
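No code accompanies this listing, but the core idea in the abstract lends itself to a short sketch: concatenate a coarse per-class attention mask to the image as an extra input channel, embed the result, and apply a supervised contrastive loss (in the style of Khosla et al., 2020) in which samples embedded for the same semantic class are positives. A minimal PyTorch sketch, with the encoder, shapes, and function names assumed for illustration:

```python
import torch
import torch.nn.functional as F

def attention_guided_embeddings(encoder, images, masks):
    # images: (B, 3, H, W); masks: (B, 1, H, W) coarse attention for ONE
    # semantic target per sample. The mask rides along as a 4th input
    # channel, so the same image yields a different embedding per target.
    x = torch.cat([images, masks], dim=1)            # (B, 4, H, W)
    z = encoder(x)                                   # (B, D) global embedding
    return F.normalize(z, dim=1)

def supervised_contrastive_loss(z, labels, temperature=0.07):
    # Supervised contrastive loss: all other samples embedded for the
    # same semantic class are positives; everything else is a negative.
    sim = z @ z.t() / temperature                    # (B, B) cosine logits
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))        # drop self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(eye, 0.0)        # avoid -inf * 0 = nan
    pos = (labels[:, None] == labels[None, :]) & ~eye
    has_pos = pos.any(dim=1)                         # samples with a positive
    per_sample = -(log_prob * pos.float()).sum(1) / pos.sum(1).clamp_min(1)
    return per_sample[has_pos].mean()
```

Because the mask enters as an input channel rather than a label, one image contributes one embedding per semantic target, which is what lets the same image join several semantic clusters.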
Related papers
- Exploring Open-Vocabulary Semantic Segmentation without Human Labels [76.15862573035565]
We present ZeroSeg, a novel method that leverages an existing pretrained vision-language (VL) model to train semantic segmentation models without human labels.
ZeroSeg distills the visual concepts learned by the VL model into a set of segment tokens, each summarizing a localized region of the target image.
Our approach achieves state-of-the-art performance when compared to other zero-shot segmentation methods under the same training data.
arXiv Detail & Related papers (2023-06-01T08:47:06Z)
- Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Segmentation [46.678279106837294]
We propose a cross-level contrastive learning scheme to enhance representation capacity for local features in semi-supervised medical image segmentation.
With the help of cross-level contrastive learning and a consistency constraint, unlabelled data can be effectively exploited to improve segmentation performance.
arXiv Detail & Related papers (2022-02-08T15:12:11Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
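The sentence-embedding idea above reduces to a small change at the classifier: each class label is replaced by a fixed text embedding of a short class description, and per-pixel features are scored by similarity against those embeddings, so a new label set only requires new descriptions. A hedged sketch, with the shapes, the sentence encoder, and the temperature all assumed for illustration:

```python
import torch
import torch.nn.functional as F

def pixel_logits_from_text(pixel_feats, class_text_embs, temperature=0.05):
    # pixel_feats: (B, D, H, W) from any segmentation backbone.
    # class_text_embs: (C, D), one embedding per class *description*
    # (e.g., from a sentence encoder). Swapping this matrix is all that
    # is needed to segment a dataset with a different label set.
    f = F.normalize(pixel_feats, dim=1)
    t = F.normalize(class_text_embs, dim=1)
    # Cosine similarity between every pixel feature and every class text.
    logits = torch.einsum('bdhw,cd->bchw', f, t) / temperature
    # Train with ordinary cross-entropy against (B, H, W) integer labels;
    # at zero-shot time only class_text_embs changes.
    return logits
```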
- Exploring Feature Representation Learning for Semi-supervised Medical Image Segmentation [30.608293915653558]
We present a two-stage framework for semi-supervised medical image segmentation.
The key insight is to explore feature representation learning with both labeled and unlabeled (i.e., pseudo-labeled) images.
A stage-adaptive contrastive learning method is proposed, containing a boundary-aware contrastive loss.
We present an aleatoric uncertainty-aware method, namely AUA, to generate higher-quality pseudo labels.
arXiv Detail & Related papers (2021-11-22T05:06:12Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Unsupervised Image Segmentation by Mutual Information Maximization and Adversarial Regularization [7.165364364478119]
We propose a novel fully unsupervised semantic segmentation method, called Information Maximization and Adversarial Regularization (InMARS).
Inspired by human perception, which parses a scene into perceptual groups, our approach first partitions an input image into meaningful regions (also known as superpixels).
Next, it utilizes Mutual-Information-Maximization followed by an adversarial training strategy to cluster these regions into semantically meaningful classes.
Our experiments demonstrate that our method achieves state-of-the-art performance on two commonly used unsupervised semantic segmentation datasets.
arXiv Detail & Related papers (2021-07-01T18:36:27Z)
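The InMARS summary names two concrete steps: pool features over superpixels, then cluster the regions by mutual-information maximization (the adversarial term is omitted here). A rough PyTorch sketch of those two steps in the IIC style, with all names and shapes assumed:

```python
import torch

def superpixel_pool(feats, seg, n_sp):
    # feats: (D, H, W) pixel features; seg: (H, W) superpixel ids in
    # [0, n_sp), e.g. from SLIC. Returns one mean feature per region.
    D = feats.size(0)
    flat = feats.reshape(D, -1).t()                          # (H*W, D)
    ids = seg.reshape(-1)                                    # (H*W,)
    sums = torch.zeros(n_sp, D, device=feats.device).index_add_(0, ids, flat)
    counts = torch.bincount(ids, minlength=n_sp).clamp_min(1)
    return sums / counts.unsqueeze(1).float()                # (n_sp, D)

def mutual_information_loss(p1, p2, eps=1e-8):
    # p1, p2: (N, K) soft cluster assignments for two augmented views of
    # the same N regions. Maximizing I(z1; z2) rewards assignments that
    # are consistent across views yet balanced across clusters.
    joint = p1.t() @ p2 / p1.size(0)                         # (K, K) joint
    joint = ((joint + joint.t()) / 2).clamp_min(eps)         # symmetrize
    m1 = joint.sum(dim=1, keepdim=True)                      # (K, 1) marginal
    m2 = joint.sum(dim=0, keepdim=True)                      # (1, K) marginal
    return -(joint * (joint.log() - m1.log() - m2.log())).sum()
```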
- Contrastive Semi-Supervised Learning for 2D Medical Image Segmentation [16.517086214275654]
We present a novel semi-supervised 2D medical segmentation solution that applies Contrastive Learning (CL) on image patches, instead of full images.
These patches are meaningfully constructed using the semantic information of different classes obtained via pseudo labeling.
We also propose a novel consistency regularization scheme, which works in synergy with contrastive learning.
arXiv Detail & Related papers (2021-06-12T15:43:24Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Dense Contrastive Learning for Self-Supervised Visual Pre-Training [102.15325936477362]
We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only 1% slower).
arXiv Detail & Related papers (2020-11-18T08:42:32Z)
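Dense contrastive learning, as summarized above, moves the InfoNCE loss from global vectors to a per-pixel grid: corresponding pixels across two augmented views are positives, and every other pixel is a negative. A rough sketch under the simplifying assumption that the two views are spatially aligned (DenseCL itself matches pixels across views by feature similarity); all names are illustrative:

```python
import torch
import torch.nn.functional as F

def dense_contrastive_loss(f1, f2, temperature=0.2):
    # f1, f2: (B, D, H, W) dense projections of two views of the same
    # images. Simplification: views are assumed spatially aligned, so
    # pixel i in f1 matches pixel i in f2.
    B, D, H, W = f1.shape
    q = F.normalize(f1, dim=1).reshape(B, D, H * W).transpose(1, 2)  # (B,N,D)
    k = F.normalize(f2, dim=1).reshape(B, D, H * W).transpose(1, 2)  # (B,N,D)
    logits = torch.bmm(q, k.transpose(1, 2)) / temperature           # (B,N,N)
    target = torch.arange(H * W, device=f1.device).expand(B, H * W)
    # Pixel-level InfoNCE: the matching pixel is the positive class and
    # every other pixel in the view acts as a negative.
    return F.cross_entropy(logits.reshape(-1, H * W), target.reshape(-1))
```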
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets new state-of-the-art results in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)