Foundation Models for Generalist Geospatial Artificial Intelligence
- URL: http://arxiv.org/abs/2310.18660v2
- Date: Wed, 8 Nov 2023 18:25:24 GMT
- Title: Foundation Models for Generalist Geospatial Artificial Intelligence
- Authors: Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys
Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy,
Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik
Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira,
Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung,
Sam Khallaghi, Hanxi (Steve) Li, Michael Cecil, Maryam Ahmadi, Fatemeh Kordi,
Hamed Alemohammad, Manil Maskey, Raghu Ganti, Kommy Weldemariam, Rahul
Ramachandran
- Abstract summary: This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive data.
We have utilized this framework to create Prithvi, a transformer-based foundational model pre-trained on more than 1TB of multispectral satellite imagery.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Progress in the development of highly adaptable and reusable
Artificial Intelligence (AI) models is expected to have a significant impact on
Earth science and remote sensing. Foundation models are pre-trained on large
Earth science and remote sensing. Foundation models are pre-trained on large
unlabeled datasets through self-supervision, and then fine-tuned for various
downstream tasks with small labeled datasets. This paper introduces a
first-of-a-kind framework for the efficient pre-training and fine-tuning of
foundational models on extensive geospatial data. We have utilized this
framework to create Prithvi, a transformer-based geospatial foundational model
pre-trained on more than 1TB of multispectral satellite imagery from the
Harmonized Landsat and Sentinel-2 (HLS) dataset. Our study demonstrates the
efficacy of our framework in fine-tuning Prithvi for a range of Earth
observation tasks not tackled by previous work on foundation models:
multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation,
and multi-temporal crop segmentation. Our
experiments show that the pre-trained model accelerates the fine-tuning process
compared to leveraging randomly initialized weights. In addition, the
pre-trained Prithvi compares well against the state of the art, e.g.,
outperforming a conditional GAN model in multi-temporal cloud imputation by up
to 5 percentage points (a 5.7% relative gain) in the structural similarity
index (SSIM). Finally, due to the limited availability of
labeled data in the field of Earth observation, we gradually reduce the
quantity of labeled data available for fine-tuning to evaluate data efficiency,
and demonstrate that it can be decreased significantly without affecting the
model's accuracy. The pre-trained 100-million-parameter model and
corresponding fine-tuning workflows have been released publicly as open source
contributions to the global Earth sciences community through Hugging Face.
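
To make the self-supervision paradigm above concrete, here is a minimal, editor-provided sketch of an MAE-style masked-reconstruction pre-training step on multispectral patches. Prithvi's actual architecture, patch layout, and mask ratio are not specified in this abstract, so every module name and number below is an illustrative assumption, not the released code.

```python
# Minimal sketch of the self-supervised pre-training step described in the
# abstract. Model, shapes, and mask ratio are illustrative assumptions.
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    """Toy masked autoencoder over flattened multispectral patches."""
    def __init__(self, patch_dim=6 * 16 * 16, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(patch_dim, latent_dim), nn.GELU())
        self.decoder = nn.Linear(latent_dim, patch_dim)

    def forward(self, patches, mask):
        # Zero out masked patches, encode the visible ones, reconstruct all.
        latent = self.encoder(patches * (~mask).unsqueeze(-1))
        return self.decoder(latent)

model = TinyMAE()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One self-supervised step: score reconstruction only on the masked patches.
patches = torch.randn(8, 196, 6 * 16 * 16)   # (batch, patches, bands*h*w)
mask = torch.rand(8, 196) < 0.75             # 75% masking, an assumption
recon = model(patches, mask)
loss = ((recon - patches) ** 2)[mask].mean()
loss.backward()
opt.step()
```

Fine-tuning, as described above, would discard the decoder, attach a task head (e.g., a segmentation head for flood mapping), and train on a small labeled set; the abstract reports that this converges faster than starting from randomly initialized weights.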
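
Since the abstract notes that the 100-million-parameter model is published on Hugging Face, the checkpoint can be fetched with huggingface_hub. The repo_id and filename below are assumptions recalled from the public Hub listing (commonly cited as ibm-nasa-geospatial/Prithvi-100M); verify them on the Hub before use.

```python
# Hedged example: download the released checkpoint from Hugging Face.
# The repo_id and filename are assumptions -- check the Hub listing.
from huggingface_hub import hf_hub_download
import torch

ckpt_path = hf_hub_download(
    repo_id="ibm-nasa-geospatial/Prithvi-100M",  # assumed repository ID
    filename="Prithvi_100M.pt",                  # assumed checkpoint name
)
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))  # inspect the payload before wiring it into a model
```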
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting (arXiv, 2024-11-24)
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
- Self-Supervised Radio Pre-training: Toward Foundational Models for Spectrogram Learning (arXiv, 2024-11-14)
Foundational deep learning (DL) models are general models trained on diverse, unlabelled datasets.
We introduce Masked Spectrogram Modeling, a novel self-supervised learning approach for pretraining foundational DL models on radio signals.
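
The summary names the technique but not its details; the following editor's sketch shows the generic masked-reconstruction recipe applied to a spectrogram, with all shapes, the mask ratio, and the toy model chosen purely for illustration.

```python
# Illustrative masked-spectrogram step: hide random time-frequency patches
# and score reconstruction only on the hidden ones. Shapes are assumptions.
import torch

spec = torch.randn(4, 128, 256)                      # (batch, freq_bins, time_steps)
patches = spec.unfold(1, 16, 16).unfold(2, 16, 16)   # cut into 16x16 patches
patches = patches.reshape(4, -1, 16 * 16)            # (batch, n_patches, 256)

mask = torch.rand(patches.shape[:2]) < 0.6           # 60% masked, assumed
visible = patches * (~mask).unsqueeze(-1)

# A real encoder-decoder would go here; a linear layer keeps the sketch runnable.
model = torch.nn.Linear(16 * 16, 16 * 16)
recon = model(visible)
loss = ((recon - patches) ** 2)[mask].mean()
```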
- HyperspectralViTs: General Hyperspectral Models for On-board Remote Sensing (arXiv, 2024-10-22)
On-board processing of hyperspectral data with machine learning models would enable an unprecedented amount of autonomy for a wide range of tasks.
This can enable early warning systems and could allow new capabilities such as automated scheduling across constellations of satellites.
We propose fast and accurate machine learning architectures which support end-to-end training with data of high spectral dimension.
- Learning with Noisy Foundation Models (arXiv, 2024-03-11)
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
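
NMTune's exact formulation is not given in this summary; as a rough illustration, an affine transformation of a frozen feature space can be as small as a learnable per-dimension scale and shift ahead of the task head (all names below are hypothetical).

```python
# Generic learnable affine transform over frozen pre-trained features.
# This illustrates the idea named in the summary, not NMTune itself.
import torch
import torch.nn as nn

class AffineFeatureTune(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(feat_dim))   # learnable gain
        self.shift = nn.Parameter(torch.zeros(feat_dim))  # learnable bias
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, frozen_feats):
        return self.head(frozen_feats * self.scale + self.shift)

tuner = AffineFeatureTune(feat_dim=768, num_classes=10)
logits = tuner(torch.randn(32, 768))  # features from a frozen backbone
```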
- Simulation-Enhanced Data Augmentation for Machine Learning Pathloss Prediction (arXiv, 2024-02-03)
This paper introduces a novel simulation-enhanced data augmentation method for machine learning pathloss prediction.
Our method integrates synthetic data generated from a cellular coverage simulator and independently collected real-world datasets.
The integration of synthetic data significantly improves the generalizability of the model in different environments.
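
The integration step can be as simple as pooling the two sources into one training set; this editor's sketch uses PyTorch's ConcatDataset, with feature dimensions and dataset sizes invented for illustration.

```python
# Combine simulator-generated and measured samples into one training set.
# The TensorDataset contents are placeholders for real pathloss data.
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

synthetic = TensorDataset(torch.randn(5000, 8), torch.randn(5000, 1))  # simulator
real = TensorDataset(torch.randn(800, 8), torch.randn(800, 1))         # measured

train_loader = DataLoader(ConcatDataset([synthetic, real]),
                          batch_size=64, shuffle=True)
```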
- Federated Learning with Projected Trajectory Regularization (arXiv, 2023-12-22)
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the heterogeneous data issue.
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain (arXiv, 2023-10-08)
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that our pre-trained method is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
- Exploring the Effectiveness of Dataset Synthesis: An Application of Apple Detection in Orchards (arXiv, 2023-06-20)
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models (arXiv, 2023-04-05)
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
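
The paper's specific reprogramming method is not described in this summary. In the literature, model reprogramming usually keeps the foundation model frozen and learns only an input "program" plus an output label mapping; below is a generic sketch of that pattern, with a stand-in linear model as the frozen backbone.

```python
# Generic model-reprogramming pattern: freeze the foundation model, learn
# an additive input "program" and a linear output mapping. Illustrative only.
import torch
import torch.nn as nn

foundation = nn.Linear(1024, 1000)          # stand-in for a frozen vision model
for p in foundation.parameters():
    p.requires_grad = False

program = nn.Parameter(torch.zeros(1024))   # learnable input perturbation
out_map = nn.Linear(1000, 5)                # map source outputs to 5 target classes
opt = torch.optim.Adam([program, *out_map.parameters()], lr=1e-3)

x, y = torch.randn(16, 1024), torch.randint(0, 5, (16,))
logits = out_map(foundation(x + program))   # gradients flow only to program/out_map
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
opt.step()
```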
- Ensemble Machine Learning Model Trained on a New Synthesized Dataset Generalizes Well for Stress Prediction Using Wearable Devices (arXiv, 2022-09-30)
We investigate the generalization ability of models built on datasets containing a small number of subjects, recorded in single study protocols.
We propose and evaluate the use of ensemble techniques by combining gradient boosting with an artificial neural network to measure predictive power on new, unseen data.
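
The described combination of gradient boosting with an artificial neural network maps naturally onto scikit-learn's voting ensembles; the sketch below is an editor's illustration with placeholder data, not the paper's configuration.

```python
# Hedged sketch: average a gradient-boosting model and a small neural network,
# as in the ensemble the summary describes. The data is synthetic filler.
from sklearn.ensemble import GradientBoostingRegressor, VotingRegressor
from sklearn.neural_network import MLPRegressor
import numpy as np

X, y = np.random.rand(200, 10), np.random.rand(200)  # placeholder features/labels

ensemble = VotingRegressor([
    ("gb", GradientBoostingRegressor(random_state=0)),
    ("nn", MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)),
])
ensemble.fit(X, y)
print(ensemble.predict(X[:3]))
```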