ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability
- URL: http://arxiv.org/abs/2404.14712v5
- Date: Mon, 19 Aug 2024 15:20:09 GMT
- Title: ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability
- Authors: Xiao Wang, Siyan Liu, Aristeidis Tsaris, Jong-Youl Choi, Ashwin Aji, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash
- Abstract summary: We introduce the Oak Ridge Base Foundation Model for Earth System Predictability (ORBIT).
ORBIT is the largest model of its kind and surpasses the current climate AI foundation model size by a thousandfold.
Performance scaling tests on the Frontier supercomputer have demonstrated that ORBIT achieves 684 petaFLOPS to 1.6 exaFLOPS sustained throughput.
- Score: 10.88886669820126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitations, we introduce the Oak Ridge Base Foundation Model for Earth System Predictability (ORBIT), an advanced vision transformer model that scales up to 113 billion parameters using a novel hybrid tensor-data orthogonal parallelism technique. As the largest model of its kind, ORBIT surpasses the current climate AI foundation model size by a thousandfold. Performance scaling tests conducted on the Frontier supercomputer have demonstrated that ORBIT achieves 684 petaFLOPS to 1.6 exaFLOPS sustained throughput, with scaling efficiency maintained at 41% to 85% across 49,152 AMD GPUs. These breakthroughs establish new advances in AI-driven climate modeling and show promise for significantly improving Earth system predictability.
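The abstract credits ORBIT's 113-billion-parameter scale to a hybrid tensor-data orthogonal parallelism technique but does not spell out the scheme. As a minimal sketch of the general idea, the snippet below factors a flat set of GPU ranks into orthogonal tensor-parallel and data-parallel groups; the world size, group sizes, and helper names are illustrative assumptions, not ORBIT's actual configuration.

```python
# Sketch: factor a flat list of ranks into an orthogonal
# (tensor-parallel x data-parallel) grid. All sizes are
# illustrative assumptions, not ORBIT's actual configuration.

WORLD_SIZE = 16                   # total GPUs (hypothetical)
TP_SIZE = 4                       # ranks jointly sharding each weight tensor
DP_SIZE = WORLD_SIZE // TP_SIZE   # replica groups for data parallelism

def tensor_parallel_group(rank: int) -> list[int]:
    """Ranks that hold shards of the same layer (same data-parallel replica)."""
    start = (rank // TP_SIZE) * TP_SIZE
    return list(range(start, start + TP_SIZE))

def data_parallel_group(rank: int) -> list[int]:
    """Ranks that hold the same shard across different replicas."""
    return list(range(rank % TP_SIZE, WORLD_SIZE, TP_SIZE))

if __name__ == "__main__":
    for rank in range(WORLD_SIZE):
        tp, dp = tensor_parallel_group(rank), data_parallel_group(rank)
        assert set(tp) & set(dp) == {rank}  # the two groups are orthogonal
        print(f"rank {rank:2d}: TP={tp} DP={dp}")
```

In a real training run these index lists would back communicator groups (e.g., torch.distributed process groups): weight shards and activations move within a tensor-parallel group, while gradients are averaged across the orthogonal data-parallel group.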
Related papers
- MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling [68.69647625472464]
Downscaling, a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions.
Previous downscaling methods lacked tailored designs for meteorology and encountered structural limitations.
We propose a novel model called MambaDS, which enhances the utilization of multivariable correlations and topography information.
arXiv Detail & Related papers (2024-08-20T13:45:49Z)
- A Scalable Real-Time Data Assimilation Framework for Predicting Turbulent Atmosphere Dynamics [8.012940782999975]
We introduce a generic real-time data assimilation framework and demonstrate its end-to-end performance on the Frontier supercomputer.
This framework comprises two primary modules: an ensemble score filter (EnSF) and a vision transformer-based surrogate.
We demonstrate both the strong and weak scaling of our framework up to 1024 GPUs on the Exascale supercomputer, Frontier.
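The EnSF itself is score-based and its update is not detailed in the summary; as a hedged stand-in, the sketch below implements a classical stochastic ensemble Kalman analysis step, the family of ensemble updates such filters build on. The state dimension, observation operator, and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(X: np.ndarray, y: np.ndarray, H: np.ndarray,
                  R: np.ndarray) -> np.ndarray:
    """Stochastic ensemble Kalman analysis step (stand-in for score-based EnSF).

    X: (n_state, n_ens) forecast ensemble
    y: (n_obs,) observation vector
    H: (n_obs, n_state) linear observation operator
    R: (n_obs, n_obs) observation-error covariance
    """
    n_ens = X.shape[1]
    Xm = X - X.mean(axis=1, keepdims=True)        # ensemble anomalies
    P = Xm @ Xm.T / (n_ens - 1)                   # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
    # Perturb observations so the analysis ensemble keeps the correct spread.
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n_ens).T
    return X + K @ (Y - H @ X)

# Hypothetical toy problem: 8-dim state, 3 observations, 64 members.
X = rng.standard_normal((8, 64))
H = np.eye(3, 8)
y = rng.standard_normal(3)
Xa = enkf_analysis(X, y, H, 0.1 * np.eye(3))
print(Xa.shape)  # (8, 64)
```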
arXiv Detail & Related papers (2024-07-16T20:44:09Z)
- Aurora: A Foundation Model of the Atmosphere [56.97266186291677]
We introduce Aurora, a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data.
In under a minute, Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts.
arXiv Detail & Related papers (2024-05-20T14:45:18Z)
- Foundation Models for Generalist Geospatial Artificial Intelligence [3.7002058945990415]
This paper introduces a first-of-its-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive data.
We have utilized this framework to create Prithvi, a transformer-based foundational model pre-trained on more than 1TB of multispectral satellite imagery.
arXiv Detail & Related papers (2023-10-28T10:19:55Z)
- STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning [82.03481509373037]
Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments.
We introduce the Stochastic Transformer-based wORld Model (STORM), an efficient world model architecture that combines strong modeling and generation capabilities.
STORM achieves a mean human performance of 126.7% on the Atari 100k benchmark, setting a new record among state-of-the-art methods.
arXiv Detail & Related papers (2023-10-14T16:42:02Z)
- Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
The state of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser-resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
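The summary does not give the model details, but a common way to realize residual corrective diffusion is to upsample the coarse input, take the residual against the high-resolution target, and train a conditional diffusion model to generate that residual. Below is a minimal DDPM-style training step under that reading; the one-layer "denoiser", grid sizes, and noise schedule are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000  # number of diffusion steps (illustrative)
betas = torch.linspace(1e-4, 0.02, T)
abar = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha-bar schedule

# Tiny stand-in for the denoiser: predicts the noise from the noisy
# residual concatenated with the upsampled coarse conditioning field.
eps_model = nn.Conv2d(2, 1, kernel_size=3, padding=1)

def training_step(x_lr: torch.Tensor, y_hr: torch.Tensor) -> torch.Tensor:
    """One denoising training step on the coarse-to-fine residual."""
    cond = F.interpolate(x_lr, size=y_hr.shape[-2:], mode="bilinear",
                         align_corners=False)
    residual = y_hr - cond                  # what the diffusion model generates
    t = torch.randint(0, T, (y_hr.shape[0],))
    a = abar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(residual)
    r_t = a.sqrt() * residual + (1 - a).sqrt() * eps   # forward noising
    pred = eps_model(torch.cat([r_t, cond], dim=1))    # predict the noise
    return F.mse_loss(pred, eps)

# Hypothetical shapes: coarse 16x16 input downscaled 8x toward a fine grid.
loss = training_step(torch.randn(4, 1, 16, 16), torch.randn(4, 1, 128, 128))
print(loss.item())
```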
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
- EarthPT: a time series foundation model for Earth Observation [0.0]
We introduce EarthPT -- an Earth Observation (EO) pretrained transformer.
We demonstrate that EarthPT is an effective forecaster that can accurately predict future pixel-level surface reflectances.
We also demonstrate that embeddings learnt by EarthPT hold semantically meaningful information.
arXiv Detail & Related papers (2023-09-13T18:00:00Z)
- A machine learning and feature engineering approach for the prediction of the uncontrolled re-entry of space objects [1.0205541448656992]
We present the development of a deep learning model for the re-entry prediction of uncontrolled objects in Low Earth Orbit (LEO).
The model is based on a modified version of the Sequence-to-Sequence architecture and is trained on the average altitude profile as derived from a set of Two-Line Element (TLE) data of over 400 bodies.
The novelty of the work consists in introducing in the deep learning model, alongside the average altitude, three new input features: a drag-like coefficient (B*), the average solar index, and the area-to-mass ratio of the object.
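The summary specifies the input features (average altitude, B*, average solar index, and area-to-mass ratio) but not the exact layout of the modified Sequence-to-Sequence model. The sketch below is a hedged toy version: a GRU encoder-decoder that consumes the four-feature history and autoregressively decodes a future altitude profile; hidden sizes, the decoder seeding, and the forecast horizon are assumptions.

```python
import torch
import torch.nn as nn

class ReentrySeq2Seq(nn.Module):
    """Toy encoder-decoder over [altitude, B*, solar index, A/m] sequences."""

    def __init__(self, n_features: int = 4, hidden: int = 64):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRU(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # predicted average altitude

    def forward(self, history: torch.Tensor, horizon: int) -> torch.Tensor:
        _, h = self.encoder(history)      # summarize the observed history
        step = history[:, -1:, :1]        # seed decoder with last altitude
        outs = []
        for _ in range(horizon):          # autoregressive decoding
            out, h = self.decoder(step, h)
            step = self.head(out)
            outs.append(step)
        return torch.cat(outs, dim=1)     # (batch, horizon, 1)

# Hypothetical batch: 30-epoch history of the four features, 7-step forecast.
model = ReentrySeq2Seq()
altitude_forecast = model(torch.randn(8, 30, 4), horizon=7)
print(altitude_forecast.shape)  # torch.Size([8, 7, 1])
```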
arXiv Detail & Related papers (2023-03-17T13:53:59Z)
- Predictive World Models from Real-World Partial Observations [66.80340484148931]
We present a framework for learning a probabilistic predictive world model for real-world road environments.
While prior methods require complete states as ground truth for learning, we present a novel sequential training method that allows hierarchical variational autoencoders (HVAEs) to learn to predict complete states from partially observed states only.
arXiv Detail & Related papers (2023-01-12T02:07:26Z)
- Earthformer: Exploring Space-Time Transformers for Earth System Forecasting [27.60569643222878]
We propose Earthformer, a space-time Transformer for Earth system forecasting.
The Transformer is based on a generic, flexible and efficient space-time attention block, named Cuboid Attention.
Experiments on two real-world benchmarks, precipitation nowcasting and El Niño/Southern Oscillation (ENSO) forecasting, show Earthformer achieves state-of-the-art performance.
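Cuboid Attention, as described, replaces full space-time attention with attention inside local blocks. The sketch below shows only that core decomposition: partition a (T, H, W) feature grid into non-overlapping cuboids and run standard self-attention within each. Cuboid sizes and the attention module are illustrative, and the paper's block additionally supports shifted and global variants not shown here.

```python
import torch
import torch.nn as nn

def cuboid_self_attention(x: torch.Tensor, attn: nn.MultiheadAttention,
                          cuboid=(2, 4, 4)) -> torch.Tensor:
    """Attend within non-overlapping (t, h, w) cuboids of a space-time grid.

    x: (B, T, H, W, C) with T, H, W divisible by the cuboid dimensions.
    """
    B, T, H, W, C = x.shape
    ct, ch, cw = cuboid
    # Split each spatio-temporal axis into (num_cuboids, cuboid_size) ...
    x = x.view(B, T // ct, ct, H // ch, ch, W // cw, cw, C)
    # ... then fold the cuboids into the batch dim, tokens inside each cuboid.
    x = x.permute(0, 1, 3, 5, 2, 4, 6, 7).reshape(-1, ct * ch * cw, C)
    out, _ = attn(x, x, x, need_weights=False)  # full attention per cuboid
    # Invert the reshapes to restore the original grid layout.
    out = out.view(B, T // ct, H // ch, W // cw, ct, ch, cw, C)
    out = out.permute(0, 1, 4, 2, 5, 3, 6, 7).reshape(B, T, H, W, C)
    return out

# Hypothetical field: batch 2, 8 frames of a 16x16 grid, 32 channels.
attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
y = cuboid_self_attention(torch.randn(2, 8, 16, 16, 32), attn)
print(y.shape)  # torch.Size([2, 8, 16, 16, 32])
```

Restricting attention to cuboids reduces the quadratic token count per attention call from T·H·W to ct·ch·cw, which is what makes space-time attention tractable on dense Earth-system grids.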
arXiv Detail & Related papers (2022-07-12T20:52:26Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.