Geography-Aware Self-Supervised Learning
- URL: http://arxiv.org/abs/2011.09980v7
- Date: Tue, 8 Mar 2022 05:44:33 GMT
- Title: Geography-Aware Self-Supervised Learning
- Authors: Kumar Ayush, Burak Uzkent, Chenlin Meng, Kumar Tanmay, Marshall Burke, David Lobell, Stefano Ermon
- Abstract summary: We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
- Score: 79.4009241781968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning methods have significantly narrowed the gap between
supervised and unsupervised learning on computer vision tasks. In this paper,
we explore their application to geo-located datasets, e.g. remote sensing,
where unlabeled data is often abundant but labeled data is scarce. We first
show that due to their different characteristics, a non-trivial gap persists
between contrastive and supervised learning on standard benchmarks. To close
the gap, we propose novel training methods that exploit the spatio-temporal
structure of remote sensing data. We leverage spatially aligned images over
time to construct temporal positive pairs in contrastive learning and
geo-location to design pre-text tasks. Our experiments show that our proposed
method closes the gap between contrastive and supervised learning on image
classification, object detection and semantic segmentation for remote sensing.
Moreover, we demonstrate that the proposed method can also be applied to
geo-tagged ImageNet images, improving downstream performance on various tasks.
The project webpage can be found at geography-aware-ssl.github.io.
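The two ideas in the abstract can be illustrated with a minimal sketch: an InfoNCE-style contrastive loss whose positive pair is the same location imaged at two different times, plus a cross-entropy pretext loss that predicts which geographic cluster an image comes from. This is a simplified NumPy illustration, not the paper's actual MoCo-v2-based implementation; the function names, the linear pretext head `W`, and the cluster granularity are assumptions for the sketch.

```python
import numpy as np

def info_nce_temporal(z_t, z_tprime, temperature=0.1):
    """InfoNCE where the positive for image i at time t is the spatially
    aligned image i at another time t'; all other batch images serve as
    negatives (a simplification of the paper's momentum-contrast setup)."""
    # L2-normalize both sets of embeddings
    z_t = z_t / np.linalg.norm(z_t, axis=1, keepdims=True)
    z_tp = z_tprime / np.linalg.norm(z_tprime, axis=1, keepdims=True)
    logits = z_t @ z_tp.T / temperature           # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives sit on the diagonal: image i at t vs. image i at t'
    return -np.mean(np.diag(log_prob))

def geoloc_pretext_loss(features, W, cluster_ids):
    """Cross-entropy for a geo-location pretext task: predict which
    geographic cluster each image was taken in, using a linear head W
    (the clustering of geo-coordinates is an assumption of this sketch)."""
    logits = features @ W                          # (B, n_clusters)
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(log_prob[np.arange(len(cluster_ids)), cluster_ids])
```

In training, the two losses would be summed; the temporal positives keep the representation invariant to seasonal and illumination change at a fixed location, while the pretext head injects geographic information.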
Related papers
- Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction from LiDAR Data with Limited Annotations [1.3243401820948064]
Building footprint maps offer the promise of precise footprint extraction without extensive post-processing, but deep learning methods face challenges in generalization and label efficiency.
We propose terrain-aware self-supervised learning tailored to remote sensing.
arXiv Detail & Related papers (2023-11-02T12:34:23Z) - CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, with 10-34% relative improvements across various labeled-training-data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z) - Scalable Self-Supervised Representation Learning from Spatiotemporal Motion Trajectories for Multimodal Computer Vision [0.0]
We propose a self-supervised, unlabeled method for learning representations of geographic locations from GPS trajectories.
We show that reachability embeddings are semantically meaningful representations and yield a 4-23% performance gain as measured by the area under the precision-recall curve (AUPRC).
arXiv Detail & Related papers (2022-10-07T02:41:02Z) - Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal, multi-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z) - Reachability Embeddings: Scalable Self-Supervised Representation Learning from Markovian Trajectories for Geospatial Computer Vision [0.0]
We propose a self-supervised method for learning representations of geographic locations from unlabeled GPS trajectories.
A scalable and distributed algorithm is presented to compute image-like representations, called reachability summaries.
We show that reachability embeddings are semantically meaningful representations and yield a 4-23% performance gain.
arXiv Detail & Related papers (2021-10-24T20:10:22Z) - Geographical Knowledge-driven Representation Learning for Remote Sensing Images [18.79154074365997]
We propose a Geographical Knowledge-driven Representation learning method for remote sensing images (GeoKR).
The global land cover products and geographical location associated with each remote sensing image are regarded as geographical knowledge.
A large scale pre-training dataset Levir-KR is proposed to support network pre-training.
arXiv Detail & Related papers (2021-07-12T09:23:15Z) - Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z) - Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z) - Learning Invariant Representations for Reinforcement Learning without Reconstruction [98.33235415273562]
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
Bisimulation metrics quantify behavioral similarity between states in continuous MDPs.
We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks.
arXiv Detail & Related papers (2020-06-18T17:59:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.