Exploring Cross-Image Pixel Contrast for Semantic Segmentation
- URL: http://arxiv.org/abs/2101.11939v2
- Date: Sat, 30 Jan 2021 23:41:45 GMT
- Title: Exploring Cross-Image Pixel Contrast for Semantic Segmentation
- Authors: Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu,
Luc Van Gool
- Abstract summary: We propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.
The core idea is to enforce pixel embeddings belonging to the same semantic class to be more similar than embeddings from different classes.
Our method can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.
- Score: 130.22216825377618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current semantic segmentation methods focus only on mining "local" context,
i.e., dependencies between pixels within individual images, by
context-aggregation modules (e.g., dilated convolution, neural attention) or
structure-aware optimization criteria (e.g., IoU-like loss). However, they
ignore "global" context of the training data, i.e., rich semantic relations
between pixels across different images. Inspired by the recent advance in
unsupervised contrastive representation learning, we propose a pixel-wise
contrastive framework for semantic segmentation in the fully supervised
setting. The core idea is to enforce pixel embeddings belonging to the same
semantic class to be more similar than embeddings from different classes. This
establishes a pixel-wise metric learning paradigm for semantic segmentation by
explicitly exploiting the structure of labeled pixels, which has long been
ignored in the field. Our method can be effortlessly incorporated into existing
segmentation frameworks without extra overhead during testing. We
experimentally show that, with well-known segmentation models (i.e., DeepLabV3,
HRNet, OCR) and backbones (i.e., ResNet, HRNet), our method brings consistent
performance improvements across diverse datasets (i.e., Cityscapes,
PASCAL-Context, COCO-Stuff). We expect this work will encourage our community
to rethink the current de facto training paradigm in fully supervised semantic
segmentation.
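To make the core idea concrete, below is a minimal sketch of a supervised InfoNCE-style loss over pixel embeddings. The function name, the uniform pixel subsampling, the temperature, and the sample budget are illustrative assumptions; the paper additionally explores dedicated anchor/negative sampling and a pixel memory, which are omitted here.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(embeddings, labels, temperature=0.1, max_samples=1024):
    """Supervised InfoNCE-style loss over pixel embeddings (sketch).

    embeddings: (N, D) pixel embeddings sampled from a batch.
    labels:     (N,) semantic class id per pixel.
    Same-class pixels are pulled together; different classes are pushed apart.
    """
    # Subsample pixels so the N x N similarity matrix stays tractable.
    if embeddings.size(0) > max_samples:
        idx = torch.randperm(embeddings.size(0))[:max_samples]
        embeddings, labels = embeddings[idx], labels[idx]

    z = F.normalize(embeddings, dim=1)               # unit-length embeddings
    sim = z @ z.t() / temperature                    # (N, N) scaled cosine sims
    self_mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # Softmax over all non-self pairs, averaged over each anchor's positives.
    sim = sim.masked_fill(self_mask, float('-inf'))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    n_pos = pos_mask.sum(1)
    valid = n_pos > 0                                # anchors with >= 1 positive
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(1)
    return (loss[valid] / n_pos[valid]).mean()
```

In a setup like this, the contrastive term is added on top of the standard pixel-wise cross-entropy loss, and the extra embedding head is discarded at test time, which is why inference cost is unchanged.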
Related papers
- Learning Semantic Segmentation with Query Points Supervision on Aerial Images [57.09251327650334]
We present a weakly supervised learning algorithm that trains semantic segmentation models from query-point annotations.
Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation.
arXiv Detail & Related papers (2023-09-11T14:32:04Z)
- Hierarchical Open-vocabulary Universal Image Segmentation [48.008887320870244]
Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions.
We propose a decoupled text-image fusion mechanism and representation learning modules for both "things" and "stuff".
Our resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework.
arXiv Detail & Related papers (2023-07-03T06:02:15Z)
- SePiCo: Semantic-Guided Pixel Contrast for Domain Adaptive Semantic Segmentation [52.62441404064957]
Domain adaptive semantic segmentation attempts to make satisfactory dense predictions on an unlabeled target domain by utilizing the model trained on a labeled source domain.
Many methods try to alleviate noisy pseudo labels; however, they ignore the intrinsic connections among cross-domain pixels that share similar semantic concepts.
We propose Semantic-Guided Pixel Contrast (SePiCo), a novel one-stage adaptation framework that highlights the semantic concepts of individual pixels (a prototype-style sketch follows this entry).
arXiv Detail & Related papers (2022-04-19T11:16:29Z)
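As a rough illustration of the prototype-guided pixel contrast that SePiCo builds on, the sketch below pulls each pixel toward its own class prototype and pushes it away from the rest. The EMA update rule, temperature, and all names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def prototype_contrast_loss(pixel_emb, pixel_labels, prototypes, temperature=0.1):
    """Cross-entropy over pixel-to-prototype similarities (sketch).

    pixel_emb:    (N, D) embeddings of labeled (or pseudo-labeled) pixels.
    pixel_labels: (N,) class index per pixel.
    prototypes:   (C, D) one running centroid per semantic class.
    """
    z = F.normalize(pixel_emb, dim=1)
    p = F.normalize(prototypes, dim=1)
    logits = z @ p.t() / temperature   # (N, C) similarity to every prototype
    # Each pixel's positive is its own class prototype; all others are negatives.
    return F.cross_entropy(logits, pixel_labels)

@torch.no_grad()
def update_prototypes(prototypes, pixel_emb, pixel_labels, momentum=0.99):
    """EMA update of class centroids from the current batch (an assumed scheme)."""
    for c in pixel_labels.unique():
        mean_c = pixel_emb[pixel_labels == c].mean(0)
        prototypes[c] = momentum * prototypes[c] + (1 - momentum) * mean_c
    return prototypes
```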
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning to achieve text-to-pixel alignment (sketched after this entry).
Our proposed framework significantly outperforms prior state-of-the-art methods without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
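A hedged sketch of what text-to-pixel contrastive alignment can look like; the projection heads, loss choice, and names here are illustrative assumptions, not CRIS's exact design.

```python
import torch
import torch.nn.functional as F

def text_to_pixel_alignment_loss(pixel_feats, text_feat, gt_mask, temperature=0.07):
    """Align a sentence embedding with the pixels it refers to (sketch).

    pixel_feats: (H*W, D) projected visual features for one image.
    text_feat:   (D,) projected embedding of the referring expression.
    gt_mask:     (H*W,) binary mask, 1 where the expression refers.
    """
    z = F.normalize(pixel_feats, dim=1)
    t = F.normalize(text_feat, dim=0)
    logits = (z @ t) / temperature   # (H*W,) text-pixel similarity scores
    # Referred pixels should score high, all other pixels low.
    return F.binary_cross_entropy_with_logits(logits, gt_mask.float())
```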
- SPCL: A New Framework for Domain Adaptive Semantic Segmentation via Semantic Prototype-based Contrastive Learning [6.705297811617307]
Domain adaptation transfers knowledge from a labeled source domain to an unlabeled target domain.
We propose a novel semantic prototype-based contrastive learning framework for fine-grained class alignment.
Our method is easy to implement and attains superior results compared to state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-24T09:26:07Z)
- Mining Contextual Information Beyond Image for Semantic Segmentation [37.783233906684444]
The paper studies the context aggregation problem in semantic image segmentation.
It proposes to mine the contextual information beyond individual images to further augment the pixel representations.
The proposed method could be effortlessly incorporated into existing segmentation frameworks.
arXiv Detail & Related papers (2021-08-26T14:34:23Z)
- Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning [28.498782661888775]
We formulate weakly supervised segmentation as a semi-supervised metric learning problem.
We propose four types of contrastive relationships between pixels and segments in the feature space (one is sketched after this entry).
We deliver a universal weakly supervised segmenter with significant gains on Pascal VOC and DensePose.
arXiv Detail & Related papers (2021-05-03T15:49:01Z)
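For flavor only, a minimal sketch of one pixel-to-segment relationship: pixels are attracted to the mean embedding of segments carrying their weak label and repelled from the rest. The segment construction and the other three relationships are omitted, and all names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pixel_segment_contrast(pixel_emb, pixel_labels, seg_emb, seg_labels,
                           temperature=0.3):
    """Contrast each pixel against segment embeddings (sketch).

    pixel_emb:    (N, D) pixel embeddings.
    pixel_labels: (N,) weak label assigned to each pixel.
    seg_emb:      (S, D) mean embedding per segment.
    seg_labels:   (S,) weak label per segment.
    """
    z = F.normalize(pixel_emb, dim=1)
    s = F.normalize(seg_emb, dim=1)
    logits = z @ s.t() / temperature                 # (N, S) similarities
    pos = (pixel_labels.unsqueeze(1) == seg_labels.unsqueeze(0)).float()
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average log-likelihood of same-label segments per pixel; the clamp
    # lets pixels with no matching segment contribute zero loss.
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```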
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes (a toy message-passing step is sketched after this entry).
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
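To make the graph-over-images idea concrete, here is a toy message-passing step where each node holds one image's feature vector. The aggregation rule, layer layout, and dimensions are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GroupGNNLayer(nn.Module):
    """One round of message passing over a group of images (sketch).

    Each node holds a (D,)-dim image feature; adj is a row-normalized
    (G, G) affinity between the G images in the group.
    """
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)      # transform neighbor messages
        self.upd = nn.Linear(2 * dim, dim)  # combine self state with messages

    def forward(self, node_feats, adj):
        messages = adj @ self.msg(node_feats)          # aggregate neighbors
        combined = torch.cat([node_feats, messages], dim=-1)
        return torch.relu(self.upd(combined))          # updated node features

# Usage: a group of 4 images with 256-d features, fully connected graph.
feats = torch.randn(4, 256)
adj = torch.full((4, 4), 0.25)
out = GroupGNNLayer(256)(feats, adj)                   # (4, 256)
```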
- From Pixel to Patch: Synthesize Context-aware Features for Zero-shot Semantic Segmentation [22.88452754438478]
We focus on zero-shot semantic segmentation, which aims to segment unseen objects with only category-level semantic representations.
We propose a novel Context-aware feature Generation Network (CaGNet), which can synthesize context-aware pixel-wise visual features for unseen categories (a toy generator is sketched after this entry).
Experimental results on Pascal-VOC, Pascal-Context, and COCO-Stuff show that our method significantly outperforms existing zero-shot semantic segmentation methods.
arXiv Detail & Related papers (2020-09-25T13:26:30Z)
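As a loose illustration of semantic-to-visual feature generation, the sketch below maps a class word embedding plus noise to fake pixel features. The conditioning scheme, noise, and sizes are assumptions; CaGNet's actual generator is context-aware and more involved.

```python
import torch
import torch.nn as nn

class SemanticFeatureGenerator(nn.Module):
    """Synthesize pixel-wise visual features from a class word embedding (sketch).

    Trained on seen classes to mimic real backbone features, then used to
    produce training features for unseen classes.
    """
    def __init__(self, word_dim=300, noise_dim=32, feat_dim=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(word_dim + noise_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, feat_dim),
        )

    def forward(self, word_emb, n_pixels):
        # One fake pixel feature per noise sample, all conditioned on the class.
        noise = torch.randn(n_pixels, self.noise_dim)
        cond = word_emb.expand(n_pixels, -1)
        return self.net(torch.cat([cond, noise], dim=1))  # (n_pixels, feat_dim)

# Usage: generate 100 fake pixel features for one unseen class embedding.
fake_feats = SemanticFeatureGenerator()(torch.randn(300), n_pixels=100)
```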
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.