Cross-sensor self-supervised training and alignment for remote sensing
- URL: http://arxiv.org/abs/2405.09922v1
- Date: Thu, 16 May 2024 09:25:45 GMT
- Title: Cross-sensor self-supervised training and alignment for remote sensing
- Authors: Valerio Marsocci, Nicolas Audebert
- Abstract summary: We introduce cross-sensor self-supervised training and alignment for remote sensing (X-STARS).
X-STARS can be applied to train models from scratch, or to adapt large models pretrained on, e.g., low-resolution data to new high-resolution sensors.
We demonstrate that X-STARS outperforms the state-of-the-art by a significant margin with less data across various conditions of data availability and resolutions.
- Score: 2.1178416840822027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale "foundation models" have gained traction as a way to leverage the vast amounts of unlabeled remote sensing data collected every day. However, due to the multiplicity of Earth Observation satellites, these models should learn "sensor agnostic" representations that generalize across sensor characteristics with minimal fine-tuning. This is complicated by data availability: low-resolution imagery, such as Sentinel-2 and Landsat-8 data, is available in large amounts, while very high-resolution aerial or satellite data is less common. To tackle these challenges, we introduce cross-sensor self-supervised training and alignment for remote sensing (X-STARS). We design a self-supervised training loss, the Multi-Sensor Alignment Dense loss (MSAD), to align representations across sensors, even with vastly different resolutions. X-STARS can be applied to train models from scratch, or to adapt large models pretrained on, e.g., low-resolution EO data to new high-resolution sensors, in a continual pretraining framework. We collect and release MSC-France, a new multi-sensor dataset, on which we train our X-STARS models; the models are then evaluated on seven downstream classification and segmentation tasks. We demonstrate that X-STARS outperforms the state-of-the-art by a significant margin with less data across various conditions of data availability and resolutions.
Related papers
- Adaptive Domain Learning for Cross-domain Image Denoising [57.4030317607274]
We present a novel adaptive domain learning scheme for cross-domain image denoising.
We use existing data from different sensors (source domain) plus a small amount of data from the new sensor (target domain).
The ADL training scheme automatically removes the data in the source domain that are harmful to fine-tuning a model for the target domain.
Also, we introduce a modulation module to adopt sensor-specific information (sensor type and ISO) to understand input data for image denoising.
arXiv Detail & Related papers (2024-11-03T08:08:26Z)
- Physical-Layer Semantic-Aware Network for Zero-Shot Wireless Sensing [74.12670841657038]
Device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications.
Data heterogeneity in wireless signals and data privacy regulation of distributed sensing have been considered as the major challenges that hinder the wide applications of wireless sensing in large area networking systems.
We propose a novel zero-shot wireless sensing solution that allows models constructed in one or a limited number of locations to be directly transferred to other locations without any labeled data.
arXiv Detail & Related papers (2023-12-08T13:50:30Z)
- USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery [5.671254904219855]
We develop a new encoder architecture called USat that can input multi-spectral data from multiple sensors for self-supervised pre-training.
We integrate USat into a Masked Autoencoder (MAE) self-supervised pre-training procedure and find that a pre-trained USat outperforms state-of-the-art MAE models trained on remote sensing data.
arXiv Detail & Related papers (2023-12-02T19:17:04Z)
- Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing [31.409817016287704]
Super-Resolution for remote sensing has the potential for huge impact on planet monitoring.
Despite a lot of attention, several inconsistencies and challenges have prevented it from being deployed in practice.
This work presents a new metric for super-resolution, CLIPScore, that corresponds far better with human judgments than previous metrics.
arXiv Detail & Related papers (2023-11-29T21:06:45Z)
- Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
- SEnSeI: A Deep Learning Module for Creating Sensor Independent Cloud Masks [0.7340845393655052]
We introduce a novel neural network architecture -- Spectral ENcoder for SEnsor Independence (SEnSeI).
We focus on the problem of cloud masking, using several pre-existing datasets, and a new, freely available dataset for Sentinel-2.
Our model is shown to achieve state-of-the-art performance on the satellites it was trained on (Sentinel-2 and Landsat 8), and is able to extrapolate to sensors it has not seen during training such as Landsat 7, PerúSat-1, and Sentinel-3 SLSTR.
arXiv Detail & Related papers (2021-11-16T10:47:10Z)
- SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a large-scale object detection benchmark for autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, the images are collected at one frame every ten seconds across 32 different cities under different weather conditions, periods and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z)
- Learning to Detect Fortified Areas [0.0]
We consider the problem of classifying which areas of a given surface are fortified by, for instance, roads, sidewalks, parking spaces, paved driveways and terraces.
We propose an algorithmic solution by designing a neural net embedding architecture that transforms data from all the different sensor systems into a new common representation.
arXiv Detail & Related papers (2021-05-26T08:03:42Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z)
- High-Precision Digital Traffic Recording with Multi-LiDAR Infrastructure Sensor Setups [0.0]
We investigate the impact of fused LiDAR point clouds compared to single LiDAR point clouds.
The evaluation of the extracted trajectories shows that a fused infrastructure approach significantly improves the tracking results and reaches accuracies within a few centimeters.
arXiv Detail & Related papers (2020-06-22T10:57:52Z)
- Deep Soft Procrustes for Markerless Volumetric Sensor Alignment [81.13055566952221]
In this work, we improve markerless data-driven correspondence estimation to achieve more robust multi-sensor spatial alignment.
We incorporate geometric constraints in an end-to-end manner into a typical segmentation based model and bridge the intermediate dense classification task with the targeted pose estimation one.
Our model is experimentally shown to achieve results similar to marker-based methods and to outperform the markerless ones, while also being robust to the pose variations of the calibration structure.
arXiv Detail & Related papers (2020-03-23T10:51:32Z)