Towards Geospatial Foundation Models via Continual Pretraining
- URL: http://arxiv.org/abs/2302.04476v3
- Date: Thu, 31 Aug 2023 20:52:06 GMT
- Title: Towards Geospatial Foundation Models via Continual Pretraining
- Authors: Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen
- Abstract summary: We propose a novel paradigm for building highly effective foundation models with minimal resource cost and carbon impact.
We first construct a compact yet diverse dataset from multiple sources to promote feature diversity, which we term GeoPile.
Then, we investigate the potential of continual pretraining from large-scale ImageNet-22k models and propose a multi-objective continual pretraining paradigm.
- Score: 22.825065739563296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geospatial technologies are becoming increasingly essential in our world for
a wide range of applications, including agriculture, urban planning, and
disaster response. To help improve the applicability and performance of deep
learning models on these geospatial tasks, various works have begun
investigating foundation models for this domain. Researchers have explored two
prominent approaches for introducing such models in geospatial applications,
but both have drawbacks in terms of limited performance benefit or prohibitive
training cost. Therefore, in this work, we propose a novel paradigm for
building highly effective geospatial foundation models with minimal resource
cost and carbon impact. We first construct a compact yet diverse dataset from
multiple sources to promote feature diversity, which we term GeoPile. Then, we
investigate the potential of continual pretraining from large-scale
ImageNet-22k models and propose a multi-objective continual pretraining
paradigm, which leverages the strong representations of ImageNet while
simultaneously providing the freedom to learn valuable in-domain features. Our
approach outperforms previous state-of-the-art geospatial pretraining methods
in an extensive evaluation on seven downstream datasets covering various tasks
such as change detection, classification, multi-label classification, semantic
segmentation, and super-resolution.
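The abstract describes a multi-objective continual pretraining paradigm that retains ImageNet representations while learning in-domain geospatial features. The paper does not spell out the loss in this listing, so the following is a minimal hedged sketch under two assumptions: the in-domain objective is a masked-reconstruction term, and the ImageNet knowledge is preserved via feature distillation from a frozen teacher. All class and parameter names (`MultiObjectivePretrainer`, `alpha`) are illustrative, not from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiObjectivePretrainer(nn.Module):
    """Sketch of multi-objective continual pretraining: an in-domain
    masked-reconstruction loss plus a feature-distillation loss against
    a frozen ImageNet-pretrained teacher (assumed formulation)."""

    def __init__(self, student, teacher, decoder, alpha=0.5):
        super().__init__()
        self.student = student
        self.decoder = decoder
        self.alpha = alpha
        # Freeze the teacher: it only supplies target representations.
        self.teacher = teacher.eval()
        for p in self.teacher.parameters():
            p.requires_grad = False

    def forward(self, images, masked_images):
        # Objective 1: reconstruct original pixels from the masked view,
        # pushing the student toward in-domain (geospatial) features.
        recon = self.decoder(self.student(masked_images))
        loss_recon = F.mse_loss(recon, images)

        # Objective 2: keep student features close to the teacher's,
        # preserving the strong ImageNet representations.
        with torch.no_grad():
            t_feat = self.teacher(images).flatten(1)
        s_feat = self.student(images).flatten(1)
        loss_distill = 1.0 - F.cosine_similarity(s_feat, t_feat, dim=1).mean()

        return self.alpha * loss_recon + (1.0 - self.alpha) * loss_distill

# Toy demo with stand-in conv layers (a real setup would use ViT backbones).
if __name__ == "__main__":
    make_encoder = lambda: nn.Conv2d(3, 8, 3, padding=1)
    student, teacher = make_encoder(), make_encoder()
    decoder = nn.Conv2d(8, 3, 3, padding=1)
    model = MultiObjectivePretrainer(student, teacher, decoder, alpha=0.5)

    x = torch.randn(2, 3, 32, 32)
    mask = (torch.rand(2, 1, 32, 32) > 0.6).float()  # keep ~40% of pixels
    loss = model(x, x * mask)
    loss.backward()
    print(f"combined loss = {loss.item():.4f}")
```

The weighting `alpha` trades off plasticity (learning in-domain features via reconstruction) against stability (staying near the ImageNet teacher), which is the tension the abstract's paradigm is designed to balance.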
Related papers
- On the workflow, opportunities and challenges of developing foundation model in geophysics [9.358947092397052]
This paper systematically explores the entire process of developing foundation models in conjunction with geophysical data.
Considering the diversity, complexity, and physical consistency constraints of geophysical data, we discuss targeted solutions.
We discuss how to leverage the transfer learning capabilities of foundation models to reduce reliance on labeled data, enhance computational efficiency, and incorporate physical constraints into model training.
arXiv Detail & Related papers (2025-04-24T09:08:24Z)
- GeoJEPA: Towards Eliminating Augmentation- and Sampling Bias in Multimodal Geospatial Learning [0.0]
We present GeoJEPA, a versatile multimodal fusion model for geospatial data built on the self-supervised Joint-Embedding Predictive Architecture.
We aim to eliminate the widely accepted augmentation- and sampling biases found in self-supervised geospatial representation learning.
The results are multimodal semantic representations of urban regions and map entities that we evaluate both quantitatively and qualitatively.
arXiv Detail & Related papers (2025-02-25T22:03:28Z)
- Self-Supervised Representation Learning for Geospatial Objects: A Survey [21.504978593542354]
Self-supervised learning (SSL) has garnered increasing attention for its ability to learn effective and generalizable representations directly from data without extensive labeled supervision.
This paper presents a survey of SSL techniques specifically applied to or developed for geospatial objects in three primary geometric vector types: Point, Polyline, and Polygon.
We examine the emerging trends in SSL for geospatial objects, particularly the gradual advancements towards geospatial foundation models.
arXiv Detail & Related papers (2024-08-22T05:28:22Z)
- Towards Vision-Language Geo-Foundation Model: A Survey [65.70547895998541]
Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks.
This paper thoroughly reviews VLGFMs, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2024-06-13T17:57:30Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping [19.307294875969827]
This paper introduces AI foundation models and their defining characteristics.
We evaluate the performance of large AI vision models, especially Meta's Segment Anything Model (SAM).
The results show that although promising, SAM still has room for improvement to support AI-augmented terrain mapping.
arXiv Detail & Related papers (2024-01-16T19:10:09Z)
- Few-shot Image Generation via Information Transfer from the Built Geodesic Surface [2.617962830559083]
We propose a method called Information Transfer from the Built Geodesic Surface (ITBGS).
With the FAGS module, a pseudo-source domain is created by projecting image features from the training dataset into the Pre-Shape Space.
We demonstrate that the proposed method consistently achieves optimal or comparable results across a diverse range of semantically distinct datasets.
arXiv Detail & Related papers (2024-01-03T13:57:09Z)
- Assessment of a new GeoAI foundation model for flood inundation mapping [4.312965283062856]
This paper evaluates the performance of the first-of-its-kind geospatial foundation model, IBM-NASA's Prithvi, to support a crucial geospatial analysis task: flood inundation mapping.
A benchmark dataset, Sen1Floods11, is used in the experiments, and the models' predictability, generalizability, and transferability are evaluated.
Results show the good transferability of the Prithvi model, highlighting its performance advantages in segmenting flooded areas in previously unseen regions.
arXiv Detail & Related papers (2023-09-25T19:50:47Z)
- GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks.
This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z)
- On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence [39.86997089245117]
Foundation models (FMs) can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or zero-shot learning.
We propose that one of the major challenges of developing a FM for GeoAI is to address the multimodality nature of geospatial tasks.
arXiv Detail & Related papers (2023-04-13T19:50:17Z)
- A General Purpose Neural Architecture for Geospatial Systems [142.43454584836812]
We present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias.
We envision how such a model may facilitate cooperation between members of the community.
arXiv Detail & Related papers (2022-11-04T09:58:57Z)
- Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark [95.19070157520633]
Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks.
Such models, recently coined as foundation models, have been transformational to the field of natural language processing.
We propose to develop a new benchmark comprised of a variety of downstream tasks related to climate change.
arXiv Detail & Related papers (2021-12-01T15:38:19Z)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation that leverage adversarial learning to unify the source and target video representations are not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.