SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
- URL: http://arxiv.org/abs/2306.09424v2
- Date: Sun, 22 Oct 2023 13:59:26 GMT
- Title: SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
- Authors: Adam J. Stewart, Nils Lehmann, Isaac A. Corley, Yi Wang, Yi-Chia
Chang, Nassim Ait Ali Braham, Shradha Sehgal, Caleb Robinson, Arindam
Banerjee
- Abstract summary: The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites.
Despite the increasing popularity of deep learning and remote sensing, the majority of researchers still use decision trees and random forests for Landsat image analysis.
This paper introduces SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for Earth Observation for the Landsat family of satellites.
- Score: 8.34029977985994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Landsat program is the longest-running Earth observation program in
history, with 50+ years of data acquisition by 8 satellites. The multispectral
imagery captured by sensors onboard these satellites is critical for a wide
range of scientific fields. Despite the increasing popularity of deep learning
and remote sensing, the majority of researchers still use decision trees and
random forests for Landsat image analysis due to the prevalence of small
labeled datasets and lack of foundation models. In this paper, we introduce
SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for
Earth Observation for the Landsat family of satellites (including 3 sensors and
2 product levels) and the largest Landsat dataset in history (5M image
patches). Additionally, we modernize and re-release the L7 Irish and L8 Biome
cloud detection datasets, and introduce the first ML benchmark datasets for
Landsats 4-5 TM and Landsat 7 ETM+ SR. Finally, we pre-train the first
foundation models for Landsat imagery using SSL4EO-L and evaluate their
performance on multiple semantic segmentation tasks. All datasets and model
weights are available via the TorchGeo (https://github.com/microsoft/torchgeo)
library, making reproducibility and experimentation easy, and enabling
scientific advancements in the burgeoning field of remote sensing for a
multitude of downstream applications.
Related papers
- Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community [50.16478515591924]
We propose and train the novel LAE-DINO Model, the first open-vocabulary foundation object detector for the LAE task.
We conduct experiments on the established remote sensing benchmarks DIOR and DOTAv2.0, as well as our newly introduced 80-class LAE-80C benchmark.
Results demonstrate the advantages of the LAE-1M dataset and the effectiveness of the LAE-DINO method.
arXiv Detail & Related papers (2024-08-17T06:24:43Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- DiffusionSat: A Generative Foundation Model for Satellite Imagery [63.2807119794691]
We present DiffusionSat, to date the largest generative foundation model trained on a collection of publicly available large, high-resolution remote sensing datasets.
Our method produces realistic samples and can be used to solve multiple generative tasks including temporal generation, superresolution given multi-spectral inputs and in-painting.
arXiv Detail & Related papers (2023-12-06T16:53:17Z)
- A Self-Supervised Approach to Land Cover Segmentation [1.0878040851638]
Land use/land cover change (LULC) maps are integral resources in earth science and agricultural research.
Due to the nature of such maps, the creation of LULC maps is often constrained by the time and human resources necessary to accurately annotate satellite imagery and remote sensing data.
Here, we demonstrate a self-supervised method of land cover segmentation that has no need for high-quality ground truth labels.
arXiv Detail & Related papers (2023-10-27T16:37:36Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- An Open Hyperspectral Dataset with Sea-Land-Cloud Ground-Truth from the HYPSO-1 Satellite [0.0]
The HYPSO-1 Sea-Land-Cloud-Labeled dataset is an open dataset with 200 diverse hyperspectral images from the HYPSO-1 mission.
38 of these images from different countries include ground-truth labels at pixel-level totaling about 25 million spectral signatures labeled for sea/land/cloud categories.
arXiv Detail & Related papers (2023-08-25T21:35:22Z)
- SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation [20.94411133447731]
Self-supervised pre-training bears potential to generate expressive representations without human annotation.
We share an unlabeled RS dataset SSL4EO-S12 to assemble a global, multimodal, and multi-seasonal corpus of satellite imagery.
arXiv Detail & Related papers (2022-11-13T23:38:27Z)
- EarthNets: Empowering AI in Earth Observation [24.160463837610074]
Earth observation (EO) aims at monitoring the state of planet Earth using remote sensing data.
This paper presents a comprehensive review of more than 500 publicly published datasets.
We propose to measure, rank, and select datasets to build a new benchmark for model evaluation.
arXiv Detail & Related papers (2022-10-10T18:09:35Z)
- Satellite Image Time Series Analysis for Big Earth Observation Data [50.591267188664666]
This paper describes sits, an open-source R package for satellite image time series analysis using machine learning.
We show that this approach produces high accuracy for land use and land cover maps through a case study in the Cerrado biome.
arXiv Detail & Related papers (2022-04-24T15:23:25Z)
- Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification [61.44538721707377]
We present Embedding Earth, a self-supervised contrastive pre-training method for leveraging the large availability of satellite imagery.
We observe significant improvements of up to 25% absolute mIoU when pre-trained with our proposed method.
We find that learnt features can generalize between disparate regions, opening up the possibility of using the proposed pre-training scheme.
arXiv Detail & Related papers (2022-03-11T16:14:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.