GraSS: Contrastive Learning with Gradient Guided Sampling Strategy for
Remote Sensing Image Semantic Segmentation
- URL: http://arxiv.org/abs/2306.15868v3
- Date: Tue, 28 Nov 2023 04:28:48 GMT
- Title: GraSS: Contrastive Learning with Gradient Guided Sampling Strategy for
Remote Sensing Image Semantic Segmentation
- Authors: Zhaoyang Zhang, Zhen Ren, Chao Tao, Yunsheng Zhang, Chengli Peng,
Haifeng Li
- Abstract summary: We propose contrastive learning with Gradient guided Sampling Strategy (GraSS) for RSI semantic segmentation.
GraSS consists of two stages: Instance Discrimination warm-up and Gradient guided Sampling contrastive training.
GraSS effectively enhances the performance of SSCL in high-resolution RSI semantic segmentation.
- Score: 14.750062497258147
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised contrastive learning (SSCL) has achieved significant
milestones in remote sensing image (RSI) understanding. Its essence lies in
designing an unsupervised instance discrimination pretext task to extract image
features from a large number of unlabeled images that are beneficial for
downstream tasks. However, existing instance discrimination-based SSCL suffers
from two limitations when applied to the RSI semantic segmentation task: 1) the
positive sample confounding issue; 2) feature adaptation bias. The latter arises
because instance-level features are not fully adapted to semantic segmentation
tasks that require pixel-level or object-level features. In this study, we
observed that discrimination information can be mapped to specific regions in an
RSI through the gradient of the unsupervised contrastive loss, and these regions
tend to contain singular ground objects. Based on this, we propose contrastive learning
with Gradient guided Sampling Strategy (GraSS) for RSI semantic segmentation.
GraSS consists of two stages: Instance Discrimination warm-up (ID warm-up) and
Gradient guided Sampling contrastive training (GS training). The ID warm-up
aims to provide initial discrimination information to the contrastive loss
gradients. The GS training stage aims to utilize the discrimination information
contained in the contrastive loss gradients and adaptively select regions in
RSI patches that contain more singular ground objects, in order to construct
new positive and negative samples. Experimental results on three open datasets
demonstrate that GraSS effectively enhances the performance of SSCL in
high-resolution RSI semantic segmentation. Compared to seven baseline methods
from five different types of SSCL, GraSS achieves an average improvement of
1.57% and a maximum improvement of 3.58% in terms of mean intersection over
union (mIoU). The source code is available at https://github.com/GeoX-Lab/GraSS.
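To make the two-stage idea above concrete, here is a minimal, hypothetical sketch of gradient-guided region selection in PyTorch (tensor names, region size, and the InfoNCE form are assumptions for illustration; the actual implementation is in the linked repository):
```python
# Hedged sketch of gradient-guided region sampling (not the official GraSS code).
# Assumptions: `encoder` maps an image batch to a spatial feature map [B, C, H, W];
# instance embeddings come from global average pooling; H and W are divisible by `region`.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """Standard InfoNCE between two batches of L2-normalised embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                      # [B, B] similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def gradient_guided_regions(encoder, view1, view2, region=4, top_k=2):
    encoder.zero_grad(set_to_none=True)
    feat1 = encoder(view1)                          # [B, C, H, W]
    feat1.retain_grad()
    z1 = feat1.mean(dim=(2, 3))                     # instance embedding, view 1
    z2 = encoder(view2).mean(dim=(2, 3))            # instance embedding, view 2
    loss = info_nce(z1, z2)
    loss.backward()                                 # discrimination info -> feat1.grad

    # Aggregate gradient magnitude over non-overlapping regions of the feature map.
    saliency = feat1.grad.abs().sum(dim=1, keepdim=True)          # [B, 1, H, W]
    region_score = F.avg_pool2d(saliency, region, stride=region)  # [B, 1, H/r, W/r]
    scores = region_score.flatten(1)                              # [B, (H/r)*(W/r)]
    top_idx = scores.topk(top_k, dim=1).indices                   # per-image region ids
    return top_idx  # regions used to build new positive and negative samples
```
In GS training, regions selected this way would then be treated as new positive and negative samples for further contrastive learning; the ID warm-up and the exact region construction and scheduling follow the paper and official code, not this sketch.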
Related papers
- When Segmentation Meets Hyperspectral Image: New Paradigm for Hyperspectral Image Classification [4.179738334055251]
Hyperspectral image (HSI) classification is a cornerstone of remote sensing, enabling precise material and land-cover identification through rich spectral information.
While deep learning has driven significant progress in this task, small patch-based classifiers, which account for over 90% of the progress, face limitations.
We propose a novel paradigm and baseline, HSIseg, for HSI classification that leverages segmentation techniques combined with a novel Dynamic Shifted Regional Transformer (DSRT) to overcome these challenges.
arXiv Detail & Related papers (2025-02-18T05:04:29Z)
- Decoupled Contrastive Learning for Long-Tailed Recognition [58.255966442426484]
Supervised Contrastive Loss (SCL) is popular in visual representation learning.
In the scenario of long-tailed recognition, where the number of samples in each class is imbalanced, treating the two types of positive samples equally leads to biased optimization of intra-category distance.
We propose a patch-based self distillation to transfer knowledge from head to tail classes to relieve the under-representation of tail classes.
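For reference, a compact version of the supervised contrastive loss mentioned above (a generic Khosla-style formulation; the decoupling and patch-based self-distillation specific to this paper are not reproduced here):
```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, tau=0.1):
    """Supervised contrastive loss in the style of Khosla et al. (2020).
    z: [N, D] embeddings, labels: [N] integer class ids."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()   # numerical stability
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    exp_sim = sim.exp().masked_fill(self_mask, 0.0)            # anchor excluded from denominator
    log_prob = sim - exp_sim.sum(dim=1, keepdim=True).log()
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(dim=1).clamp(min=1)               # guard against singleton classes
    return (-(log_prob * pos_mask.float()).sum(dim=1) / pos_count).mean()
```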
arXiv Detail & Related papers (2024-03-10T09:46:28Z)
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised
Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
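As an illustration of pixel-level affinities computed from feature maps, a minimal sketch (cosine similarity between spatial positions; a generic construction, not AuxSegNet+'s exact cross-task formulation):
```python
import torch
import torch.nn.functional as F

def pixel_affinity(feat):
    """Pairwise cosine affinity between all spatial positions of a feature map.
    feat: [B, C, H, W] -> affinity: [B, H*W, H*W]"""
    x = F.normalize(feat.flatten(2), dim=1)      # [B, C, H*W], unit norm per position
    return torch.bmm(x.transpose(1, 2), x)       # cosine similarity between positions

# Illustrative use (not the paper's exact scheme): propagate per-pixel class scores
# with the affinity matrix, e.g. refined = torch.bmm(affinity.softmax(dim=-1), scores)
# where scores is [B, H*W, num_classes].
```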
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose Ambiguity-Resistant Semi-supervised Learning (ARSL) for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
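A hedged sketch of how joint classification/localization confidence could gate pseudo labels (the combination rule and names below are assumptions for illustration, not the exact JCE defined in the paper):
```python
import torch

def filter_pseudo_labels(cls_scores, iou_scores, boxes, thresh=0.6):
    """Keep pseudo boxes whose joint confidence (classification score times an
    estimated localization quality) exceeds a threshold.
    cls_scores, iou_scores: [N]; boxes: [N, 4]."""
    joint_conf = cls_scores * iou_scores          # assumed combination rule
    keep = joint_conf >= thresh
    return boxes[keep], joint_conf[keep]
```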
arXiv Detail & Related papers (2023-03-27T07:46:58Z)
- Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and
Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several attentive LR representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods with a variety of different settings.
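For background, one generic way to obtain a low-rank (LR) representation of a feature map is a truncated SVD, sketched below; SLRNet's attentive LR modules are learned and differ from this plain factorization:
```python
import torch

def low_rank_feature(feat, rank=8):
    """Truncated low-rank reconstruction of a spatial feature map.
    feat: [B, C, H, W] -> same shape, rank-limited along the C x (H*W) matrix."""
    b, c, h, w = feat.shape
    x = feat.flatten(2)                                    # [B, C, H*W]
    u, s, v = torch.svd_lowrank(x, q=rank)                 # batched truncated SVD
    x_lr = u @ torch.diag_embed(s) @ v.transpose(-2, -1)   # rank-`rank` reconstruction
    return x_lr.view(b, c, h, w)
```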
arXiv Detail & Related papers (2022-03-19T09:19:55Z)
- MisMatch: Calibrated Segmentation via Consistency on Differential
Morphological Feature Perturbations with Limited Labels [5.500466607182699]
Semi-supervised learning is a promising paradigm to address the issue of label scarcity in medical imaging.
MisMatch is a semi-supervised segmentation framework based on the consistency between paired predictions.
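A minimal sketch of a consistency objective between paired predictions (the differential morphological perturbations themselves are MisMatch-specific and omitted; the two logit maps below are placeholders for two perturbed branches):
```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    """Mean-squared error between the probability maps of two perturbed predictions.
    logits_a, logits_b: [B, num_classes, H, W] from two differently perturbed branches."""
    return F.mse_loss(logits_a.softmax(dim=1), logits_b.softmax(dim=1))
```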
arXiv Detail & Related papers (2021-10-23T09:22:41Z)
- Semi-Supervised Semantic Segmentation of Vessel Images using Leaking
Perturbations [1.5791732557395552]
Leaking GAN is a GAN-based semi-supervised architecture for retina vessel semantic segmentation.
Our key idea is to pollute the discriminator by leaking information from the generator.
This leads to more moderate generations that benefit the training of the GAN.
arXiv Detail & Related papers (2021-10-22T18:25:08Z)
- Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
- Remote Sensing Images Semantic Segmentation with General Remote Sensing
Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive module is used to learn an image-level representation better.
The local features matching contrastive module is designed to learn representations of local regions, which is beneficial for semantic segmentation.
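In the spirit of that local matching module, a hedged InfoNCE-style sketch over matched local features from two views (the region correspondence and projection heads are simplified assumptions, not GLCNet's exact design):
```python
import torch
import torch.nn.functional as F

def local_matching_loss(feat1, feat2, coords, tau=0.2):
    """InfoNCE over matched local positions of two views.
    feat1, feat2: [B, C, H, W] feature maps of two augmentations of the same image.
    coords: [B, K, 2] integer (y, x) locations that correspond across the views."""
    b, c, h, w = feat1.shape
    idx = coords[..., 0] * w + coords[..., 1]                # [B, K] flat indices
    f1 = feat1.flatten(2).transpose(1, 2)                    # [B, H*W, C]
    f2 = feat2.flatten(2).transpose(1, 2)
    gather_idx = idx.unsqueeze(-1).expand(-1, -1, c)
    z1 = F.normalize(torch.gather(f1, 1, gather_idx), dim=-1).reshape(-1, c)
    z2 = F.normalize(torch.gather(f2, 1, gather_idx), dim=-1).reshape(-1, c)
    logits = z1 @ z2.t() / tau                               # matched pairs on the diagonal
    labels = torch.arange(len(z1), device=z1.device)
    return F.cross_entropy(logits, labels)
```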
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
- Hyperspectral Image Super-Resolution with Spectral Mixup and
Heterogeneous Datasets [99.92564298432387]
This work studies hyperspectral image (HSI) super-resolution (SR).
HSI SR is characterized by high-dimensional data and a limited amount of training examples.
This exacerbates the undesirable behaviors of neural networks such as memorization and sensitivity to out-of-distribution samples.
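Mixup itself is a standard augmentation; a plausible per-band "spectral mixup" variant for HSI cubes might look like the sketch below (shapes and the per-band mixing rule are assumptions; the paper's exact formulation may differ):
```python
import torch

def spectral_mixup(x1, x2, alpha=0.2):
    """Blend two hyperspectral cubes with a per-band mixing coefficient.
    x1, x2: [B, bands, H, W]; returns the mixed cube and the coefficients used."""
    lam = torch.distributions.Beta(alpha, alpha).sample((x1.size(1),))  # one lambda per band
    lam = lam.view(1, -1, 1, 1).to(x1.device)
    return lam * x1 + (1.0 - lam) * x2, lam
```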
arXiv Detail & Related papers (2021-01-19T12:19:53Z)