HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecasting
- URL: http://arxiv.org/abs/2403.17016v1
- Date: Wed, 14 Feb 2024 22:10:52 GMT
- Title: HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecasting
- Authors: Vivek Ramavajjala
- Abstract summary: We present HEAL-ViT, a novel architecture that uses ViT models on a spherical mesh.
HEAL-ViT produces weather forecasts that outperform the ECMWF IFS on key metrics.
- Score: 0.14504054468850663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, a variety of ML architectures and techniques have seen success in producing skillful medium range weather forecasts. In particular, Vision Transformer (ViT)-based models (e.g. Pangu-Weather, FuXi) have shown strong performance, working nearly "out-of-the-box" by treating weather data as a multi-channel image on a rectilinear grid. While a rectilinear grid is appropriate for 2D images, weather data is inherently spherical and thus heavily distorted at the poles on a rectilinear grid, leading to disproportionate compute being used to model data near the poles. Graph-based methods (e.g. GraphCast) do not suffer from this problem, as they map the longitude-latitude grid to a spherical mesh, but are generally more memory intensive and tend to need more compute resources for training and inference. While spatially homogeneous, the spherical mesh does not lend itself readily to be modeled by ViT-based models that implicitly rely on the rectilinear grid structure. We present HEAL-ViT, a novel architecture that uses ViT models on a spherical mesh, thus benefiting from both the spatial homogeneity enjoyed by graph-based models and efficient attention-based mechanisms exploited by transformers. HEAL-ViT produces weather forecasts that outperform the ECMWF IFS on key metrics, and demonstrate better bias accumulation and blurring than other ML weather prediction models. Further, the lowered compute footprint of HEAL-ViT makes it attractive for operational use as well, where other models in addition to a 6-hourly prediction model may be needed to produce the full set of operational forecasts required.
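The abstract's claim about polar distortion can be made concrete with a small numerical sketch (not code from the paper): on an equiangular latitude-longitude grid, every latitude band gets the same number of cells, but the cells shrink toward the poles, so a grid-based model spends disproportionate capacity on a tiny fraction of the sphere. A few lines of NumPy quantify this:

```python
import numpy as np

def latlon_cell_areas(n_lat: int, n_lon: int) -> np.ndarray:
    """Per-band cell areas (steradians) of an equiangular lat-lon grid on the unit sphere.

    Every cell within a latitude band has the same area:
    (2*pi / n_lon) * (sin(lat_top) - sin(lat_bottom)).
    Returns one area value per latitude band.
    """
    lat_edges = np.linspace(-np.pi / 2, np.pi / 2, n_lat + 1)
    band_height = np.sin(lat_edges[1:]) - np.sin(lat_edges[:-1])
    return (2 * np.pi / n_lon) * band_height

areas = latlon_cell_areas(180, 360)  # a 1-degree grid
# Sanity check: all cells together tile the whole sphere (area 4*pi).
print(areas.sum() * 360 / (4 * np.pi))
# Equator cells are over a hundred times larger than the polemost cells.
print(areas.max() / areas.min())
```

HEALPix, the pixelization HEAL-ViT's spherical mesh is named after, instead tiles the sphere with equal-area pixels, which is what lets attention-based models spend compute uniformly over the globe.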
Related papers
- CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer [47.65152457550307]
We propose the geometry-inspired Circular Transformer (CirT) to model the cyclic characteristics of the graticule.
Experiments on the ERA5 reanalysis dataset demonstrate that our model yields a significant improvement over advanced data-driven models.
arXiv Detail & Related papers (2025-02-27T04:26:23Z)
- A Geometry-Aware Message Passing Neural Network for Modeling Aerodynamics over Airfoils [61.60175086194333]
Modeling aerodynamics is a key problem in aerospace engineering, often involving flows interacting with solid objects such as airfoils.
Here, we consider modeling of incompressible flows over solid objects, wherein geometric structures are a key factor in determining aerodynamics.
To effectively incorporate geometries, we propose a message passing scheme that efficiently and expressively integrates the airfoil shape with the mesh representation.
These design choices lead to a purely data-driven machine learning framework known as GeoMPNN, which won the Best Student Submission award at the NeurIPS 2024 ML4CFD Competition, placing 4th overall.
arXiv Detail & Related papers (2024-12-12T16:05:39Z)
- Exploring the Use of Machine Learning Weather Models in Data Assimilation [0.0]
GraphCast and NeuralGCM are two promising ML-based weather models, but their suitability for data assimilation remains under-explored.
We compare the TL/AD results of GraphCast and NeuralGCM with those of the Model for Prediction Across Scales - Atmosphere (MPAS-A), a well-established numerical weather prediction (NWP) model.
While the adjoint results of both GraphCast and NeuralGCM show some similarity to those of MPAS-A, they also exhibit unphysical noise at various vertical levels, raising concerns about their robustness for operational DA systems.
arXiv Detail & Related papers (2024-11-22T02:18:28Z)
- Gridded Transformer Neural Processes for Large Unstructured Spatio-Temporal Data [47.14384085714576]
We introduce gridded pseudo-tokens to handle unstructured observations, together with a processor that leverages efficient attention mechanisms over these gridded pseudo-tokens.
Our method consistently outperforms a range of strong baselines on various synthetic and real-world regression tasks involving large-scale data.
The real-life experiments are performed on weather data, demonstrating the potential of our approach to bring performance and computational benefits when applied at scale in a weather modelling pipeline.
arXiv Detail & Related papers (2024-10-09T10:00:56Z)
- Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling [55.13352174687475]
This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which Generalizes weather forecasts to Finer-grained Temporal scales.
Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale.
We introduce a lead time-aware training framework to promote the generalization of the model at different lead times.
arXiv Detail & Related papers (2024-05-22T16:21:02Z)
- FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective [48.00240550685946]
Current state-of-the-art graph neural network (GNN)-based forecasting methods usually require both graph networks (e.g., GCN) and temporal networks (e.g., LSTM) to capture inter-series (spatial) dynamics and intra-series (temporal) dependencies, respectively.
We propose a novel Fourier Graph Neural Network (FourierGNN) by stacking our proposed Fourier Graph Operator (FGO) to perform matrix multiplications in Fourier space.
Our experiments on seven datasets have demonstrated superior performance with higher efficiency and fewer parameters compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-11-10T17:13:26Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- A physics and data co-driven surrogate modeling approach for temperature field prediction on irregular geometric domain [12.264200001067797]
We propose a novel physics and data co-driven surrogate modeling method for temperature field prediction.
Numerical results demonstrate that our method can significantly improve prediction accuracy on a smaller dataset.
arXiv Detail & Related papers (2022-03-15T08:43:24Z)
- GTrans: Spatiotemporal Autoregressive Transformer with Graph Embeddings for Nowcasting Extreme Events [5.672898304129217]
This paper proposes a spatiotemporal model, namely GTrans, that transforms data features into graph embeddings and predicts temporal dynamics with a transformer model.
Our experiments demonstrate that GTrans can model spatial and temporal dynamics and nowcast extreme events on the evaluated datasets.
arXiv Detail & Related papers (2022-01-18T03:26:24Z)
- Hyperbolic Variational Graph Neural Network for Modeling Dynamic Graphs [77.33781731432163]
We learn dynamic graph representations in hyperbolic space, for the first time, with the aim of inferring node representations.
We present a novel Hyperbolic Variational Graph Network, referred to as HVGNN.
In particular, to model the dynamics, we introduce a Temporal GNN (TGNN) based on a theoretically grounded time encoding approach.
arXiv Detail & Related papers (2021-04-06T01:44:15Z)
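The hyperbolic-space idea in the last entry can be made concrete with the Poincaré-ball metric, the standard distance such models embed nodes into (a generic sketch, not HVGNN's actual implementation):

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray, eps: float = 1e-12) -> float:
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    d(u, v) = arccosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    num = 2.0 * np.sum((u - v) ** 2)
    den = max((1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2)), eps)
    return float(np.arccosh(1.0 + num / den))

origin = np.zeros(2)
# Distance from the origin reduces to 2*artanh(r), so it diverges as r -> 1:
print(poincare_distance(origin, np.array([0.5, 0.0])))   # 2*artanh(0.5) ~ 1.0986
print(poincare_distance(origin, np.array([0.99, 0.0])))  # much larger, near the boundary
```

The exponential growth of distances near the boundary is what makes hyperbolic embeddings well suited to hierarchical, tree-like graph structure.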
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.