Enhancing Self-Supervised Learning for Remote Sensing with Elevation
Data: A Case Study with Scarce And High Level Semantic Labels
- URL: http://arxiv.org/abs/2304.06857v3
- Date: Tue, 20 Feb 2024 02:29:37 GMT
- Title: Enhancing Self-Supervised Learning for Remote Sensing with Elevation
Data: A Case Study with Scarce And High Level Semantic Labels
- Authors: Omar A. Castaño-Idarraga, Raul Ramos-Pollán, Freddie Kalaitzis
- Abstract summary: This work proposes a hybrid unsupervised and supervised learning method to pre-train models applied in Earth observation downstream tasks.
We combine a contrastive approach to pre-train models with a pixel-wise regression pre-text task to predict coarse elevation maps.
- Score: 1.534667887016089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work proposes a hybrid unsupervised and supervised learning method to
pre-train models applied in Earth observation downstream tasks when only a
handful of labels denoting very general semantic concepts are available. We
combine a contrastive approach to pre-train models with a pixel-wise regression
pre-text task to predict coarse elevation maps, which are commonly available
worldwide. We hypothesize that this will allow the model to pre-learn useful
representations, as there is generally some correlation between elevation maps
and targets in many remote sensing tasks. We assess the performance of our
approach on a binary semantic segmentation task and a binary image
classification task, both derived from a dataset created for the northwest of
Colombia. In both cases, we pre-train our models with 39k unlabeled images,
fine-tune them on the downstream tasks with only 80 labeled images, and
evaluate them with 2944 labeled images. Our experiments show that our methods,
GLCNet+Elevation for segmentation, and SimCLR+Elevation for classification,
outperform their counterparts without the pixel-wise regression pre-text task,
namely SimCLR and GLCNet, in terms of macro-average F1 Score and Mean
Intersection over Union (MIoU). Our study not only encourages the development
of pre-training methods that leverage readily available geographical
information, such as elevation data, to enhance the performance of
self-supervised methods when applied to Earth observation tasks, but also
promotes the use of datasets with high-level semantic labels, which are more
likely to be updated frequently. Project code is available at
https://github.com/omarcastano/Elevation-Aware-SSL.
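The hybrid objective described in the abstract combines a contrastive pre-training loss with a pixel-wise regression pretext task on coarse elevation maps. A minimal NumPy sketch of that combination is below; this is an illustration, not the authors' implementation — the NT-Xent formulation, the function names, and the weighting term `lam` are assumptions.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent contrastive loss over two batches of embeddings.

    z1[i] and z2[i] are embeddings of two augmented views of the same image.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    # The positive for sample i is at index i + n (and vice versa).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - sim[np.arange(2 * n), pos]))

def elevation_regression_loss(pred, target):
    """Pixel-wise MSE between a predicted and a coarse reference elevation map."""
    return float(np.mean((pred - target) ** 2))

def hybrid_pretraining_loss(z1, z2, elev_pred, elev_target, lam=1.0):
    """Contrastive term plus the elevation pretext term, weighted by lam."""
    return nt_xent_loss(z1, z2) + lam * elevation_regression_loss(elev_pred, elev_target)
```

In this sketch the encoder is trained so that views of the same image agree (contrastive term) while a decoder head also reproduces the coarse elevation map (regression term); `lam` trades off the two objectives.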
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- A Semi-Paired Approach For Label-to-Image Translation [6.888253564585197]
We introduce the first semi-supervised (semi-paired) framework for label-to-image translation.
In the semi-paired setting, the model has access to a small set of paired data and a larger set of unpaired images and labels.
We propose a training algorithm for this shared network, and we present a rare classes sampling algorithm to focus on under-represented classes.
arXiv Detail & Related papers (2023-06-23T16:13:43Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts the model performance with 10-34% relative improvement with various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
- Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
- Remote Sensing Images Semantic Segmentation with General Remote Sensing Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive module is used to learn a better image-level representation.
The local features matching contrastive module is designed to learn representations of local regions, which benefits semantic segmentation.
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- RGB-based Semantic Segmentation Using Self-Supervised Depth Pre-Training [77.62171090230986]
We propose an easily scalable and self-supervised technique that can be used to pre-train any semantic RGB segmentation method.
In particular, our pre-training approach makes use of automatically generated labels that can be obtained using depth sensors.
We show how our proposed self-supervised pre-training with HN-labels can be used to replace ImageNet pre-training.
arXiv Detail & Related papers (2020-02-06T11:16:24Z)
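Several of the related papers above pretrain encoders with spatial pretext tasks; the location-aware approach, for example, predicts the relative location of image parts. A minimal sketch of how such discretized relative-location targets can be built from a patch grid is shown below; this is an assumption-laden illustration (the grid size, the offset discretization, and the function name are not from any of the papers, and details such as feature masking are omitted).

```python
import numpy as np

def relative_location_targets(grid_size=3):
    """Build (query, reference, offset-class) triples for a patch grid.

    Each ordered pair of distinct patches (q, r) is labeled with the
    discretized 2-D offset of q relative to r, one of (2*grid_size - 1)**2
    classes -- a classification target for pretraining an encoder.
    """
    coords = [(i, j) for i in range(grid_size) for j in range(grid_size)]
    side = 2 * grid_size - 1  # offsets range over [-(grid_size-1), grid_size-1]
    triples = []
    for qi, (qy, qx) in enumerate(coords):
        for ri, (ry, rx) in enumerate(coords):
            if qi == ri:
                continue  # a patch is never paired with itself
            # Shift offsets to be non-negative, then flatten to a class index.
            dy = qy - ry + grid_size - 1
            dx = qx - rx + grid_size - 1
            triples.append((qi, ri, dy * side + dx))
    return triples
```

A classifier head trained on these triples forces the encoder to represent where a patch sits relative to its neighbors, which is the kind of spatial signal these pretraining methods exploit.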
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.