A Survey of Foundation Models for Environmental Science
- URL: http://arxiv.org/abs/2503.03142v1
- Date: Wed, 05 Mar 2025 03:33:31 GMT
- Title: A Survey of Foundation Models for Environmental Science
- Authors: Runlong Yu, Shengyu Chen, Yiqun Xie, Xiaowei Jia
- Abstract summary: Foundation models offer transformative opportunities by integrating diverse data sources. We aim to foster interdisciplinary collaboration and advance the integration of cutting-edge machine learning for sustainable solutions in environmental science.
- Score: 16.426772639157704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling environmental ecosystems is essential for effective resource management, sustainable development, and understanding complex ecological processes. However, traditional methods frequently struggle with the inherent complexity, interconnectedness, and limited data of such systems. Foundation models, with their large-scale pre-training and universal representations, offer transformative opportunities by integrating diverse data sources, capturing spatiotemporal dependencies, and adapting to a broad range of tasks. This survey presents a comprehensive overview of foundation model applications in environmental science, highlighting advancements in forward prediction, data generation, data assimilation, downscaling, model ensembling, and decision-making across domains. We also detail the development process of these models, covering data collection, architecture design, training, tuning, and evaluation. By showcasing these emerging methods, we aim to foster interdisciplinary collaboration and advance the integration of cutting-edge machine learning for sustainable solutions in environmental science.
Related papers
- On the workflow, opportunities and challenges of developing foundation model in geophysics [9.358947092397052]
This paper systematically explores the entire process of developing foundation models in conjunction with geophysical data.
Considering the diversity, complexity, and physical consistency constraints of geophysical data, we discuss targeted solutions.
We discuss how to leverage the transfer learning capabilities of foundation models to reduce reliance on labeled data, enhance computational efficiency, and incorporate physical constraints into model training.
arXiv Detail & Related papers (2025-04-24T09:08:24Z)
- A Comprehensive Survey of Synthetic Tabular Data Generation [27.112327373017457]
Tabular data is one of the most prevalent and critical data formats across diverse real-world applications.
It is often constrained by challenges such as data scarcity, privacy concerns, and class imbalance.
Synthetic data generation has emerged as a promising solution, leveraging generative models to learn the distribution of real datasets.
arXiv Detail & Related papers (2025-04-23T08:33:34Z)
- Foundation Models for Environmental Science: A Survey of Emerging Frontiers [27.773985216421394]
This survey presents a comprehensive overview of foundation model applications in environmental science.
It highlights advancements in common environmental use cases including forward prediction, data generation, data assimilation, downscaling, inverse modeling, model ensembling, and decision-making across domains.
We aim to promote interdisciplinary collaboration that accelerates advancements in machine learning for driving discovery in addressing critical environmental challenges.
arXiv Detail & Related papers (2025-04-05T20:56:38Z)
- Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models [104.17057231661371]
Time series analysis is crucial for understanding dynamics of complex systems.
Recent advances in foundation models have led to task-agnostic Time Series Foundation Models (TSFMs) and Large Language Model-based Time Series Models (TSLLMs).
Their success depends on large, diverse, and high-quality datasets, which are challenging to build due to regulatory, diversity, quality, and quantity constraints.
This survey provides a comprehensive review of synthetic data for TSFMs and TSLLMs, analyzing data generation strategies, their role in model pretraining, fine-tuning, and evaluation, and identifying future research directions.
arXiv Detail & Related papers (2025-03-14T13:53:46Z)
- A Survey of Model Architectures in Information Retrieval [64.75808744228067]
We focus on two key aspects: backbone models for feature extraction and end-to-end system architectures for relevance estimation.
We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs).
We conclude by discussing emerging challenges and future directions, including architectural optimizations for performance and scalability, handling of multimodal, multilingual data, and adaptation to novel application domains beyond traditional search paradigms.
arXiv Detail & Related papers (2025-02-20T18:42:58Z)
- Trajectory World Models for Heterogeneous Environments [67.27233466954814]
Heterogeneity in sensors and actuators across environments poses a significant challenge to building large-scale pre-trained world models.
We introduce UniTraj, a unified dataset comprising over one million trajectories from 80 environments, designed to scale data while preserving critical diversity.
We propose TrajWorld, a novel architecture capable of flexibly handling varying sensor and actuator information and capturing environment dynamics in-context.
arXiv Detail & Related papers (2025-02-03T13:59:08Z)
- A Survey of World Models for Autonomous Driving [63.33363128964687]
Recent breakthroughs in autonomous driving have been propelled by advances in robust world modeling.
This paper systematically reviews recent advances in world models for autonomous driving.
arXiv Detail & Related papers (2025-01-20T04:00:02Z)
- Predictive Pattern Recognition Techniques Towards Spatiotemporal Representation of Plant Growth in Simulated and Controlled Environments: A Comprehensive Review [0.0]
This review explores state-of-the-art predictive pattern recognition techniques.
We focus on the probabilistic modeling of plant traits and the integration of dynamic environmental interactions.
Key topics include regressions and neural network-based representation models for the task of forecasting.
arXiv Detail & Related papers (2024-12-13T20:22:35Z)
- Foundation Models for Remote Sensing and Earth Observation: A Survey [101.77425018347557]
This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs).
It begins with an outline of their motivation and background, followed by an introduction of their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
arXiv Detail & Related papers (2024-10-22T01:08:21Z)
- Research on the Spatial Data Intelligent Foundation Model [70.47828328840912]
This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models.
It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models.
The report systematically elucidates the key technologies of spatial data intelligent large models and their applications in urban environments, aerospace remote sensing, geography, transportation, and other scenarios.
arXiv Detail & Related papers (2024-05-30T06:21:34Z)
- Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology using Large Language Models -- A Case in Optimizing Intermodal Freight Transportation [1.6230958216521798]
This study investigates the potential of leveraging the pre-trained Large Language Models (LLMs).
By adopting ChatGPT API as the reasoning core, we outline an integrated workflow that encompasses natural language processing, methontology-based prompt tuning, and transformers.
The outcomes of our methodology are knowledge graphs in widely adopted ontology languages (e.g., OWL, RDF, SPARQL).
arXiv Detail & Related papers (2024-05-29T16:40:31Z)
- Using satellite imagery to understand and promote sustainable development [87.72561825617062]
We synthesize the growing literature that uses satellite imagery to understand sustainable development outcomes.
We quantify the paucity of ground data on key human-related outcomes and the growing abundance and resolution of satellite imagery.
We review recent machine learning approaches to model-building in the context of scarce and noisy training data.
arXiv Detail & Related papers (2020-09-23T05:20:00Z)