Contrastive Learning for Label-Efficient Semantic Segmentation
- URL: http://arxiv.org/abs/2012.06985v3
- Date: Tue, 6 Apr 2021 00:33:07 GMT
- Title: Contrastive Learning for Label-Efficient Semantic Segmentation
- Authors: Xiangyun Zhao, Raviteja Vemulapalli, Philip Mansfield, Boqing Gong,
Bradley Green, Lior Shapira, Ying Wu
- Abstract summary: Convolutional Neural Network (CNN) based semantic segmentation approaches have achieved impressive results by using large amounts of labeled data.
Deep CNNs trained with the de facto cross-entropy loss can easily overfit to small amounts of labeled data.
We propose a simple and effective contrastive learning-based training strategy in which we first pretrain the network using a pixel-wise, label-based contrastive loss, and then fine-tune it using the cross-entropy loss.
- Score: 44.10416030868873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collecting labeled data for the task of semantic segmentation is expensive
and time-consuming, as it requires dense pixel-level annotations. While recent
Convolutional Neural Network (CNN) based semantic segmentation approaches have
achieved impressive results by using large amounts of labeled training data,
their performance drops significantly as the amount of labeled data decreases.
This happens because deep CNNs trained with the de facto cross-entropy loss can
easily overfit to small amounts of labeled data. To address this issue, we
propose a simple and effective contrastive learning-based training strategy in
which we first pretrain the network using a pixel-wise, label-based contrastive
loss, and then fine-tune it using the cross-entropy loss. This approach
increases intra-class compactness and inter-class separability, thereby
resulting in a better pixel classifier. We demonstrate the effectiveness of the
proposed training strategy using the Cityscapes and PASCAL VOC 2012
segmentation datasets. Our results show that pretraining with the proposed
contrastive loss results in large performance gains (more than 20% absolute
improvement in some settings) when the amount of labeled data is limited. In
many settings, the proposed contrastive pretraining strategy, which does not
use any additional data, is able to match or outperform the widely-used
ImageNet pretraining strategy that uses more than a million additional labeled
images.
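The core of the proposed pretraining stage is a supervised, pixel-wise contrastive loss: pixels that share a class label are pulled together in embedding space while pixels from different classes are pushed apart, after which the network is fine-tuned with cross-entropy. Below is a minimal PyTorch-style sketch of such a loss, assuming per-pixel embeddings and a label map of matching spatial resolution; the function name, temperature, and pixel-sampling budget are illustrative assumptions, not the authors' released code.
```python
import torch
import torch.nn.functional as F

def pixel_supcon_loss(features, labels, temperature=0.1,
                      ignore_index=255, max_pixels=1024):
    """Supervised pixel-wise contrastive loss (illustrative sketch only).

    features: (B, C, H, W) per-pixel embeddings from the segmentation network.
    labels:   (B, H, W) integer class map at the same spatial resolution.
    Pixels with the same label act as positives for each other; all other
    sampled pixels act as negatives.
    """
    C = features.shape[1]
    feats = features.permute(0, 2, 3, 1).reshape(-1, C)   # (B*H*W, C)
    labs = labels.reshape(-1)                              # (B*H*W,)

    # Drop unlabeled / ignored pixels.
    keep = labs != ignore_index
    feats, labs = feats[keep], labs[keep]

    # Subsample so the pairwise similarity matrix stays tractable.
    if feats.shape[0] > max_pixels:
        idx = torch.randperm(feats.shape[0], device=feats.device)[:max_pixels]
        feats, labs = feats[idx], labs[idx]

    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature                   # (N, N) similarities

    n = feats.shape[0]
    diag = torch.eye(n, dtype=torch.bool, device=feats.device)
    pos_mask = (labs[:, None] == labs[None, :]) & ~diag     # same-class, non-self pairs

    # Softmax over all non-self pixels; average log-likelihood of the positives.
    sim = sim.masked_fill(diag, float('-inf'))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_per_anchor = pos_mask.sum(dim=1).clamp(min=1)
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_per_anchor

    # Only anchors that have at least one positive contribute to the loss.
    return loss[pos_mask.any(dim=1)].mean()
```
After this pretraining stage, the same network (with its classification head) would be fine-tuned on the labeled images with the standard per-pixel cross-entropy loss.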
Related papers
- Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images.
We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation [74.8791451327354]
We propose a simple yet effective semi-supervised learning framework for semantic segmentation.
A set of simple design and training techniques can collectively improve the performance of semi-supervised semantic segmentation significantly.
Our method achieves state-of-the-art results in the semi-supervised settings on the Cityscapes and Pascal VOC datasets.
arXiv Detail & Related papers (2021-04-15T06:01:39Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
- Tackling the Problem of Limited Data and Annotations in Semantic Segmentation [1.0152838128195467]
To tackle the problem of limited data annotations in image segmentation, different pre-trained models and CRF-based methods are applied.
To this end, RotNet, DeeperCluster, and Semi&Weakly Supervised Learning (SWSL) pre-trained models are transferred and fine-tuned in a DeepLab-v2 baseline.
The results of this study show that, on this small dataset, using a pre-trained ResNet50 SWSL model gives results that are 7.4% better than applying an ImageNet pre-trained model.
arXiv Detail & Related papers (2020-07-14T21:11:11Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data (a generic sketch of this pseudo-labeling step appears after this list).
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performance on the Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
- Reinforced active learning for image segmentation [34.096237671643145]
We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL).
An agent learns a policy to select a subset of small informative image regions -- as opposed to entire images -- to be labeled from a pool of unlabeled data.
Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems.
arXiv Detail & Related papers (2020-02-16T14:03:06Z)
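For the self-training entry above ("Improving Semantic Segmentation via Self-Training"), the described workflow is: train a teacher on the labeled set, use it to pseudo-label unlabeled images, then train on both jointly. The snippet below is a generic sketch of the pseudo-labeling step only; the confidence threshold and helper name are illustrative assumptions, not that paper's exact recipe.
```python
import torch

@torch.no_grad()
def generate_pseudo_labels(teacher, unlabeled_loader, threshold=0.9, ignore_index=255):
    """Label unlabeled images with a trained teacher model (illustrative sketch).

    Low-confidence pixels are mapped to ignore_index so that the subsequent
    joint training step skips them in the cross-entropy loss.
    """
    teacher.eval()
    pseudo_labeled = []
    for images in unlabeled_loader:
        logits = teacher(images)                       # (B, num_classes, H, W)
        probs = torch.softmax(logits, dim=1)
        confidence, labels = probs.max(dim=1)          # per-pixel confidence and class
        labels[confidence < threshold] = ignore_index  # drop uncertain pixels
        pseudo_labeled.append((images.cpu(), labels.cpu()))
    return pseudo_labeled
```
The student model would then be trained on a mixture of human-annotated and pseudo-labeled batches with the same per-pixel cross-entropy loss.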
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.