Towards Geospatial Foundation Models via Continual Pretraining
- URL: http://arxiv.org/abs/2302.04476v3
- Date: Thu, 31 Aug 2023 20:52:06 GMT
- Title: Towards Geospatial Foundation Models via Continual Pretraining
- Authors: Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen
- Abstract summary: We propose a novel paradigm for building highly effective foundation models with minimal resource cost and carbon impact.
We first construct a compact yet diverse dataset from multiple sources to promote feature diversity, which we term GeoPile.
Then, we investigate the potential of continual pretraining from large-scale ImageNet-22k models and propose a multi-objective continual pretraining paradigm.
- Score: 22.825065739563296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geospatial technologies are becoming increasingly essential in our world for
a wide range of applications, including agriculture, urban planning, and
disaster response. To help improve the applicability and performance of deep
learning models on these geospatial tasks, various works have begun
investigating foundation models for this domain. Researchers have explored two
prominent approaches for introducing such models in geospatial applications,
but both have drawbacks in terms of limited performance benefit or prohibitive
training cost. Therefore, in this work, we propose a novel paradigm for
building highly effective geospatial foundation models with minimal resource
cost and carbon impact. We first construct a compact yet diverse dataset from
multiple sources to promote feature diversity, which we term GeoPile. Then, we
investigate the potential of continual pretraining from large-scale
ImageNet-22k models and propose a multi-objective continual pretraining
paradigm, which leverages the strong representations of ImageNet while
simultaneously providing the freedom to learn valuable in-domain features. Our
approach outperforms previous state-of-the-art geospatial pretraining methods
in an extensive evaluation on seven downstream datasets covering various tasks
such as change detection, classification, multi-label classification, semantic
segmentation, and super-resolution.
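A multi-objective continual pretraining loss of the kind the abstract describes can be sketched as follows. The abstract does not name the individual objectives, so everything here is an assumption for illustration: a masked-reconstruction term stands in for learning in-domain features, a cosine feature-distillation term against a frozen ImageNet-pretrained teacher stands in for retaining ImageNet representations, and the function names and the `distill_weight` parameter are hypothetical.

```python
import math


def cosine_distill_loss(student, teacher):
    """1 - cosine similarity between a student feature vector and a
    frozen-teacher feature vector (assumed distillation term)."""
    dot = sum(s * t for s, t in zip(student, teacher))
    norm_s = math.sqrt(sum(s * s for s in student))
    norm_t = math.sqrt(sum(t * t for t in teacher))
    return 1.0 - dot / (norm_s * norm_t + 1e-12)


def masked_l1_loss(pred, target, mask):
    """Mean absolute error over masked positions only (mask[i] == 1).
    Assumed in-domain objective, in the style of masked image modeling."""
    n_masked = sum(mask)
    if n_masked == 0:
        return 0.0
    total = sum(m * abs(p - t) for p, t, m in zip(pred, target, mask))
    return total / n_masked


def multi_objective_loss(student_feat, teacher_feat, pred, target, mask,
                         distill_weight=1.0):
    """Weighted sum of the two assumed objectives.
    distill_weight is a hypothetical balancing hyperparameter."""
    in_domain = masked_l1_loss(pred, target, mask)
    distill = cosine_distill_loss(student_feat, teacher_feat)
    return in_domain + distill_weight * distill


# Toy values: orthogonal features (distillation term 1.0) and two masked pixels.
loss = multi_objective_loss(
    student_feat=[1.0, 0.0], teacher_feat=[0.0, 1.0],
    pred=[1.0, 2.0, 3.0], target=[1.0, 0.0, 0.0], mask=[0, 1, 1],
    distill_weight=0.5,
)
print(loss)
```

In a real training loop the teacher would stay frozen and only the student would receive gradients from both terms. The sketch uses plain Python lists to keep the arithmetic visible.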
Related papers
- Towards Vision-Language Geo-Foundation Model: A Survey [65.70547895998541]
Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks.
This paper thoroughly reviews VLGFMs, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2024-06-13T17:57:30Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping [19.307294875969827]
This paper introduces AI foundation models and their defining characteristics.
We evaluate the performance of large AI vision models, especially Meta's Segment Anything Model (SAM).
The results show that although promising, SAM still has room for improvement to support AI-augmented terrain mapping.
arXiv Detail & Related papers (2024-01-16T19:10:09Z)
- Few-shot Image Generation via Information Transfer from the Built Geodesic Surface [2.617962830559083]
We propose a method called Information Transfer from the Built Geodesic Surface (ITBGS).
With the FAGS module, a pseudo-source domain is created by projecting image features from the training dataset into the Pre-Shape Space.
We demonstrate that the proposed method consistently achieves optimal or comparable results across a diverse range of semantically distinct datasets.
arXiv Detail & Related papers (2024-01-03T13:57:09Z)
- Assessment of a new GeoAI foundation model for flood inundation mapping [4.312965283062856]
This paper evaluates the performance of the first-of-its-kind geospatial foundation model, IBM-NASA's Prithvi, to support a crucial geospatial analysis task: flood inundation mapping.
A benchmark dataset, Sen1Floods11, is used in the experiments, and the models' predictability, generalizability, and transferability are evaluated.
Results show the good transferability of the Prithvi model, highlighting its performance advantages in segmenting flooded areas in previously unseen regions.
arXiv Detail & Related papers (2023-09-25T19:50:47Z)
- GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks.
This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z)
- On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence [39.86997089245117]
Foundation models (FMs) can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or zero-shot learning.
We propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks.
arXiv Detail & Related papers (2023-04-13T19:50:17Z)
- A General Purpose Neural Architecture for Geospatial Systems [142.43454584836812]
We present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias.
We envision how such a model may facilitate cooperation between members of the community.
arXiv Detail & Related papers (2022-11-04T09:58:57Z)
- Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark [95.19070157520633]
Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks.
Such models, recently coined as foundation models, have been transformational to the field of natural language processing.
We propose to develop a new benchmark comprised of a variety of downstream tasks related to climate change.
arXiv Detail & Related papers (2021-12-01T15:38:19Z)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation that leverage adversarial learning to unify the source and target video representations are not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.