Foundation Models for Remote Sensing and Earth Observation: A Survey
- URL: http://arxiv.org/abs/2410.16602v2
- Date: Sat, 26 Oct 2024 03:06:48 GMT
- Title: Foundation Models for Remote Sensing and Earth Observation: A Survey
- Authors: Aoran Xiao, Weihao Xuan, Junjue Wang, Jiaxing Huang, Dacheng Tao, Shijian Lu, Naoto Yokoya,
- Abstract summary: This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs)
It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
- Score: 101.77425018347557
- License:
- Abstract: Remote Sensing (RS) is a crucial technology for observing, monitoring, and interpreting our planet, with broad applications across geoscience, economics, humanitarian fields, etc. While artificial intelligence (AI), particularly deep learning, has achieved significant advances in RS, unique challenges persist in developing more intelligent RS systems, including the complexity of Earth's environments, diverse sensor modalities, distinctive feature patterns, varying spatial and spectral resolutions, and temporal dynamics. Meanwhile, recent breakthroughs in large Foundation Models (FMs) have expanded AI's potential across many domains due to their exceptional generalizability and zero-shot transfer capabilities. However, their success has largely been confined to natural data like images and video, with degraded performance and even failures for RS data of various non-optical modalities. This has inspired growing interest in developing Remote Sensing Foundation Models (RSFMs) to address the complex demands of Earth Observation (EO) tasks, spanning the surface, atmosphere, and oceans. This survey systematically reviews the emerging field of RSFMs. It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts. It then categorizes and reviews existing RSFM studies including their datasets and technical contributions across Visual Foundation Models (VFMs), Visual-Language Models (VLMs), Large Language Models (LLMs), and beyond. In addition, we benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions in this rapidly evolving field. A project associated with this survey has been built at https://github.com/xiaoaoran/awesome-RSFMs .
Related papers
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z) - AI Foundation Models in Remote Sensing: A Survey [6.036426846159163]
This paper provides a comprehensive survey of foundation models in the remote sensing domain.
We categorize these models based on their applications in computer vision and domain-specific tasks.
We highlight emerging trends and the significant advancements achieved by these foundation models.
arXiv Detail & Related papers (2024-08-06T22:39:34Z) - Towards Vision-Language Geo-Foundation Model: A Survey [65.70547895998541]
Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks.
This paper thoroughly reviews VLGFMs, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2024-06-13T17:57:30Z) - Research on the Spatial Data Intelligent Foundation Model [70.47828328840912]
This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models.
It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models.
The report systematically elucidates the key technologies of spatial data intelligent large models and their applications in urban environments, aerospace remote sensing, geography, transportation, and other scenarios.
arXiv Detail & Related papers (2024-05-30T06:21:34Z) - Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond [101.15395503285804]
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI)
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
arXiv Detail & Related papers (2024-05-06T14:37:07Z) - Segment Anything Model Can Not Segment Anything: Assessing AI Foundation
Model's Generalizability in Permafrost Mapping [19.307294875969827]
This paper introduces AI foundation models and their defining characteristics.
We evaluate the performance of large AI vision models, especially Meta's Segment Anything Model (SAM)
The results show that although promising, SAM still has room for improvement to support AI-augmented terrain mapping.
arXiv Detail & Related papers (2024-01-16T19:10:09Z) - Forging Vision Foundation Models for Autonomous Driving: Challenges,
Methodologies, and Opportunities [59.02391344178202]
Vision foundation models (VFMs) serve as potent building blocks for a wide range of AI applications.
The scarcity of comprehensive training data, the need for multi-sensor integration, and the diverse task-specific architectures pose significant obstacles to the development of VFMs.
This paper delves into the critical challenge of forging VFMs tailored specifically for autonomous driving, while also outlining future directions.
arXiv Detail & Related papers (2024-01-16T01:57:24Z) - SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery [35.550999964460466]
We present SkySense, a generic billion-scale model, pre-trained on a curated multi-modal Remote Sensing dataset with 21.5 million temporal sequences.
To our best knowledge, SkySense is the largest Multi-Modal to date, whose modules can be flexibly combined or used individually to accommodate various tasks.
arXiv Detail & Related papers (2023-12-15T09:57:21Z) - When Geoscience Meets Foundation Models: Towards General Geoscience Artificial Intelligence System [6.445323648941926]
Geoscience foundation models (GFMs) are a paradigm-shifting solution, integrating extensive cross-disciplinary data to enhance the simulation and understanding of Earth system dynamics.
The unique strengths of GFMs include flexible task specification, diverse input-output capabilities, and multi-modal knowledge representation.
This review offers a comprehensive overview of emerging geoscientific research paradigms, emphasizing the untapped opportunities at the intersection of advanced AI techniques and geoscience.
arXiv Detail & Related papers (2023-09-13T08:44:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.