SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
- URL: http://arxiv.org/abs/2306.09424v2
- Date: Sun, 22 Oct 2023 13:59:26 GMT
- Title: SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
- Authors: Adam J. Stewart, Nils Lehmann, Isaac A. Corley, Yi Wang, Yi-Chia Chang, Nassim Ait Ali Braham, Shradha Sehgal, Caleb Robinson, Arindam Banerjee
- Abstract summary: The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites.
Despite the increasing popularity of deep learning and remote sensing, the majority of researchers still use decision trees and random forests for Landsat image analysis.
This paper introduces SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for Earth Observation for the Landsat family of satellites.
- Score: 8.34029977985994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Landsat program is the longest-running Earth observation program in
history, with 50+ years of data acquisition by 8 satellites. The multispectral
imagery captured by sensors onboard these satellites is critical for a wide
range of scientific fields. Despite the increasing popularity of deep learning
and remote sensing, the majority of researchers still use decision trees and
random forests for Landsat image analysis due to the prevalence of small
labeled datasets and lack of foundation models. In this paper, we introduce
SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for
Earth Observation for the Landsat family of satellites (including 3 sensors and
2 product levels) and the largest Landsat dataset in history (5M image
patches). Additionally, we modernize and re-release the L7 Irish and L8 Biome
cloud detection datasets, and introduce the first ML benchmark datasets for
Landsats 4-5 TM and Landsat 7 ETM+ SR. Finally, we pre-train the first
foundation models for Landsat imagery using SSL4EO-L and evaluate their
performance on multiple semantic segmentation tasks. All datasets and model
weights are available via the TorchGeo (https://github.com/microsoft/torchgeo)
library, making reproducibility and experimentation easy, and enabling
scientific advancements in the burgeoning field of remote sensing for a
multitude of downstream applications.
Related papers
- EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision [72.84868704100595]
This paper presents a dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks.
The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic.
Accompanying the dataset is EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data.
(2025-01-14T13:42:22Z)
- Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community [58.417475846791234]
We propose and train the novel LAE-DINO Model, the first open-vocabulary foundation object detector for the LAE task.
We conduct experiments on the established remote sensing benchmarks DIOR and DOTAv2.0, as well as our newly introduced 80-class LAE-80C benchmark.
Results demonstrate the advantages of the LAE-1M dataset and the effectiveness of the LAE-DINO method.
(2024-08-17T06:24:43Z)
- DiffusionSat: A Generative Foundation Model for Satellite Imagery [63.2807119794691]
We present DiffusionSat, to date the largest generative foundation model trained on a collection of publicly available large, high-resolution remote sensing datasets.
Our method produces realistic samples and can be used to solve multiple generative tasks including temporal generation, superresolution given multi-spectral inputs and in-painting.
(2023-12-06T16:53:17Z)
- A Self-Supervised Approach to Land Cover Segmentation [1.0878040851638]
Land use/land cover change (LULC) maps are integral resources in earth science and agricultural research.
Due to the nature of such maps, the creation of LULC maps is often constrained by the time and human resources necessary to accurately annotate satellite imagery and remote sensing data.
Here, we demonstrate a self-supervised method of land cover segmentation that has no need for high-quality ground truth labels.
(2023-10-27T16:37:36Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
(2023-10-10T00:03:23Z)
- An Open Hyperspectral Dataset with Sea-Land-Cloud Ground-Truth from the HYPSO-1 Satellite [0.0]
The HYPSO-1 Sea-Land-Cloud-Labeled dataset is an open dataset with 200 diverse hyperspectral images from the HYPSO-1 mission.
38 of these images from different countries include ground-truth labels at pixel-level totaling about 25 million spectral signatures labeled for sea/land/cloud categories.
(2023-08-25T21:35:22Z)
- EarthNets: Empowering AI in Earth Observation [24.160463837610074]
Earth observation (EO) aims at monitoring the state of planet Earth using remote sensing data.
This paper presents a comprehensive review of more than 500 publicly published datasets.
We propose to measure, rank, and select datasets to build a new benchmark for model evaluation.
(2022-10-10T18:09:35Z)
- Satellite Image Time Series Analysis for Big Earth Observation Data [50.591267188664666]
This paper describes sits, an open-source R package for satellite image time series analysis using machine learning.
We show that this approach produces high accuracy for land use and land cover maps through a case study in the Cerrado biome.
(2022-04-24T15:23:25Z)
- Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification [61.44538721707377]
We present Embedding Earth, a self-supervised contrastive pre-training method for leveraging the large availability of satellite imagery.
We observe significant improvements up to 25% absolute mIoU when pre-trained with our proposed method.
We find that learnt features can generalize between disparate regions, opening up the possibility of using the proposed pre-training scheme.
(2022-03-11T16:14:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.