Geospatial foundation models for image analysis: evaluating and enhancing NASA-IBM Prithvi's domain adaptability
- URL: http://arxiv.org/abs/2409.00489v1
- Date: Sat, 31 Aug 2024 15:51:23 GMT
- Title: Geospatial foundation models for image analysis: evaluating and enhancing NASA-IBM Prithvi's domain adaptability
- Authors: Chia-Yu Hsu, Wenwen Li, Sizhe Wang
- Abstract summary: This paper evaluates the recently released NASA-IBM GFM Prithvi for its predictive performance on high-level image analysis tasks.
Prithvi is one of the first open-source GFMs trained on time-series of high-resolution remote sensing imagery.
- Score: 3.7899026023232136
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Research on geospatial foundation models (GFMs) has become a trending topic in geospatial artificial intelligence (AI) research due to their potential for achieving high generalizability and domain adaptability, reducing model training costs for individual researchers. Unlike building large language models such as ChatGPT, constructing visual foundation models for image analysis, particularly in remote sensing, has encountered significant challenges, such as formulating diverse vision tasks into a general problem framework. This paper evaluates the recently released NASA-IBM GFM Prithvi for its predictive performance on high-level image analysis tasks across multiple benchmark datasets. Prithvi was selected because it is one of the first open-source GFMs trained on time-series of high-resolution remote sensing imagery. A series of experiments were designed to assess Prithvi's performance as compared to other pre-trained task-specific AI models in geospatial image analysis. New strategies, including band adaptation, multi-scale feature generation, and fine-tuning techniques, are introduced and integrated into an image analysis pipeline to enhance Prithvi's domain adaptation capability and improve model performance. In-depth analyses reveal Prithvi's strengths and weaknesses, offering insights for both improving Prithvi and developing future visual foundation models for geospatial tasks.
Related papers
- Advances in Diffusion Models for Image Data Augmentation: A Review of Methods, Models, Evaluation Metrics and Future Research Directions [6.2719115566879236]
Diffusion Models (DMs) have emerged as a powerful tool for image data augmentation.
DMs generate realistic and diverse images by learning the underlying data distribution.
Current challenges and future research directions in the field are discussed.
arXiv Detail & Related papers (2024-07-04T18:06:48Z)
- Towards Vision-Language Geo-Foundation Model: A Survey [65.70547895998541]
Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks.
This paper thoroughly reviews vision-language geo-foundation models (VLGFMs), summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2024-06-13T17:57:30Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- Towards Graph Foundation Models: A Survey and Beyond [66.37994863159861]
Foundation models have emerged as critical components in a variety of artificial intelligence applications.
The capabilities of foundation models to generalize and adapt motivate graph machine learning researchers to discuss the potential of developing a new graph learning paradigm.
This article introduces the concept of Graph Foundation Models (GFMs), and offers an exhaustive explanation of their key characteristics and underlying technologies.
arXiv Detail & Related papers (2023-10-18T09:31:21Z)
- Assessment of a new GeoAI foundation model for flood inundation mapping [4.312965283062856]
This paper evaluates the performance of the first-of-its-kind geospatial foundation model, IBM-NASA's Prithvi, to support a crucial geospatial analysis task: flood inundation mapping.
A benchmark dataset, Sen1Floods11, is used in the experiments, and the models' predictability, generalizability, and transferability are evaluated.
Results show the good transferability of the Prithvi model, highlighting its performance advantages in segmenting flooded areas in previously unseen regions.
arXiv Detail & Related papers (2023-09-25T19:50:47Z)
- On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence [39.86997089245117]
Foundation models (FMs) can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or zero-shot learning.
We propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks.
arXiv Detail & Related papers (2023-04-13T19:50:17Z)
- Towards Geospatial Foundation Models via Continual Pretraining [22.825065739563296]
We propose a novel paradigm for building highly effective foundation models with minimal resource cost and carbon impact.
We first construct a compact yet diverse dataset from multiple sources to promote feature diversity, which we term GeoPile.
Then, we investigate the potential of continual pretraining from large-scale ImageNet-22k models and propose a multi-objective continual pretraining paradigm.
arXiv Detail & Related papers (2023-02-09T07:39:02Z)
- Generalized Visual Quality Assessment of GAN-Generated Face Images [79.47386781978531]
We study the subjective and objective quality towards generalized quality assessment of GAN-generated face images (GFIs).
We develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms.
arXiv Detail & Related papers (2022-01-28T07:54:49Z)
- A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
- Fine-Grained Image Analysis with Deep Learning: A Survey [146.22351342315233]
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition.
This paper attempts to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained image recognition and fine-grained image retrieval.
arXiv Detail & Related papers (2021-11-11T09:43:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.