EWMoE: An effective model for global weather forecasting with mixture-of-experts
- URL: http://arxiv.org/abs/2405.06004v2
- Date: Fri, 23 Aug 2024 05:30:55 GMT
- Title: EWMoE: An effective model for global weather forecasting with mixture-of-experts
- Authors: Lihao Gan, Xin Man, Chenghong Zhang, Jie Shao
- Abstract summary: We propose EWMoE, an effective model for accurate global weather forecasting, which requires significantly less training data and computational resources.
Our model incorporates three key components to enhance prediction accuracy: 3D absolute position embedding, a core Mixture-of-Experts layer, and two specific loss functions.
- Score: 6.695845790670147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weather forecasting is a crucial task for meteorological research, with direct social and economic impacts. Recently, data-driven weather forecasting models based on deep learning have shown great potential, achieving superior performance compared with traditional numerical weather prediction methods. However, these models often require massive training data and computational resources. In this paper, we propose EWMoE, an effective model for accurate global weather forecasting, which requires significantly less training data and computational resources. Our model incorporates three key components to enhance prediction accuracy: 3D absolute position embedding, a core Mixture-of-Experts (MoE) layer, and two specific loss functions. We conduct our evaluation on the ERA5 dataset using only two years of training data. Extensive experiments demonstrate that EWMoE outperforms current models such as FourCastNet and ClimaX at all forecast times, and achieves competitive performance compared with the state-of-the-art models Pangu-Weather and GraphCast on evaluation metrics such as Anomaly Correlation Coefficient (ACC) and Root Mean Square Error (RMSE). Additionally, ablation studies indicate that applying the MoE architecture to weather forecasting offers significant advantages in improving accuracy and resource efficiency. Code is available at https://github.com/Tomoyi/EWMoE.
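The abstract names ACC and RMSE as evaluation metrics but does not define them. As a rough illustration only, the sketch below computes both on a single latitude-longitude field using cosine-latitude weighting, a convention common in ERA5-based benchmarks. The weighting scheme, the function names, and the synthetic data are all assumptions made for this example, not details taken from the paper or its repository.

```python
import numpy as np

def latitude_weights(lats_deg):
    """Cosine-latitude weights normalized to mean 1 (assumed convention)."""
    w = np.cos(np.deg2rad(lats_deg))
    return w / w.mean()

def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE for fields of shape (lat, lon)."""
    w = latitude_weights(lats_deg)[:, None]  # broadcast over longitude
    return float(np.sqrt(np.mean(w * (forecast - truth) ** 2)))

def weighted_acc(forecast, truth, climatology, lats_deg):
    """Latitude-weighted ACC: correlation of anomalies about a climatology."""
    w = latitude_weights(lats_deg)[:, None]
    f_anom = forecast - climatology
    t_anom = truth - climatology
    num = np.sum(w * f_anom * t_anom)
    den = np.sqrt(np.sum(w * f_anom ** 2) * np.sum(w * t_anom ** 2))
    return float(num / den)

# Synthetic 5-degree grid; values are random placeholders, not ERA5 data.
lats = np.linspace(-87.5, 87.5, 36)
rng = np.random.default_rng(0)
climatology = rng.normal(size=(36, 72))
truth = climatology + rng.normal(size=(36, 72))
forecast = truth + 0.5 * rng.normal(size=(36, 72))  # hypothetical noisy forecast

rmse = weighted_rmse(forecast, truth, lats)
acc = weighted_acc(forecast, truth, climatology, lats)
```

Lower RMSE and higher ACC indicate a better forecast; a perfect forecast gives an RMSE of 0 and an ACC of 1 under these definitions.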
Related papers
- Deep Learning for Weather Forecasting: A CNN-LSTM Hybrid Model for Predicting Historical Temperature Data [7.559331742876793]
This study introduces a hybrid model combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to predict historical temperature data.
CNNs are utilized for spatial feature extraction, while LSTMs handle temporal dependencies, resulting in significantly improved prediction accuracy and stability.
arXiv Detail & Related papers (2024-10-19T03:38:53Z)
- Weather Prediction Using CNN-LSTM for Time Series Analysis: A Case Study on Delhi Temperature Data [0.0]
This study explores a hybrid CNN-LSTM model to enhance temperature forecasting accuracy for the Delhi region.
We employed both direct and indirect methods, including comprehensive data preprocessing and exploratory analysis, to construct and train our model.
Experimental results indicate that the CNN-LSTM model significantly outperforms traditional forecasting methods in terms of both accuracy and stability.
arXiv Detail & Related papers (2024-09-14T11:06:07Z)
- WeatherReal: A Benchmark Based on In-Situ Observations for Evaluating Weather Models [11.016845506758841]
We introduce WeatherReal, a novel benchmark dataset for weather forecasting derived from global near-surface in-situ observations.
This paper details the sources and processing methodologies underlying the dataset, and illustrates the advantage of in-situ observations in capturing hyper-local and extreme weather.
Our work aims to advance AI-based weather forecasting research toward a more application-focused and operation-ready approach.
arXiv Detail & Related papers (2024-09-14T08:53:46Z)
- Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region [62.09891513612252]
We focus on limited-area modeling and train our model specifically for localized region-level downstream tasks.
We consider the MENA region due to its unique climatic challenges, where accurate localized weather forecasting is crucial for managing water resources, agriculture and mitigating the impacts of extreme weather events.
Our study aims to validate the effectiveness of integrating parameter-efficient fine-tuning (PEFT) methodologies, specifically Low-Rank Adaptation (LoRA) and its variants, to enhance forecast accuracy, as well as training speed, computational resource utilization, and memory efficiency in weather and climate modeling for specific regions.
arXiv Detail & Related papers (2024-09-11T19:31:56Z)
- ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecasts.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z)
- FengWu-4DVar: Coupling the Data-driven Weather Forecasting Model with 4D Variational Assimilation [67.20588721130623]
We develop an AI-based cyclic weather forecasting system, FengWu-4DVar.
FengWu-4DVar can incorporate observational data into the data-driven weather forecasting model.
Experiments on the simulated observational dataset demonstrate that FengWu-4DVar is capable of generating reasonable analysis fields.
arXiv Detail & Related papers (2023-12-16T02:07:56Z)
- Deep Learning for Day Forecasts from Sparse Observations [60.041805328514876]
Deep neural networks offer an alternative paradigm for modeling weather conditions.
MetNet-3 learns from both dense and sparse data sensors and makes predictions up to 24 hours ahead for precipitation, wind, temperature and dew point.
MetNet-3 has high temporal and spatial resolution, up to 2 minutes and 1 km respectively, as well as low operational latency.
arXiv Detail & Related papers (2023-06-06T07:07:54Z)
- W-MAE: Pre-trained weather model with masked autoencoder for multi-variable weather forecasting [7.610811907813171]
We propose a Weather model with Masked AutoEncoder pre-training for weather forecasting.
W-MAE is pre-trained in a self-supervised manner to reconstruct spatial correlations within meteorological variables.
On the temporal scale, we fine-tune the pre-trained W-MAE to predict the future states of meteorological variables.
arXiv Detail & Related papers (2023-04-18T06:25:11Z)
- GraphCast: Learning skillful medium-range global weather forecasting [107.40054095223779]
We introduce a machine learning-based method called "GraphCast", which can be trained directly from reanalysis data.
It predicts hundreds of weather variables, over 10 days at 0.25 degree resolution globally, in under one minute.
We show that GraphCast significantly outperforms the most accurate operational deterministic systems on 90% of 1380 verification targets.
arXiv Detail & Related papers (2022-12-24T18:15:39Z)
- Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast [91.9372563527801]
We present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
For the first time, an AI-based method outperforms state-of-the-art numerical weather prediction (NWP) methods in terms of accuracy.
Pangu-Weather supports a wide range of downstream forecast scenarios, including extreme weather forecasts and large-member ensemble forecasts in real time.
arXiv Detail & Related papers (2022-11-03T17:19:43Z)