CA-MoE: Channel-Adapted MoE for Incremental Weather Forecasting
- URL: http://arxiv.org/abs/2412.02503v1
- Date: Tue, 03 Dec 2024 15:30:52 GMT
- Title: CA-MoE: Channel-Adapted MoE for Incremental Weather Forecasting
- Authors: Hao Chen, Han Tao, Guo Song, Jie Zhang, Yunlong Yu, Yonghan Dong, Chuang Yang, Lei Bai
- Abstract summary: We introduce incremental learning to weather forecasting and propose a novel structure that allows for the flexible expansion of variables within the model.
Specifically, our method presents a Channel-Adapted MoE (CA-MoE) that employs a divide-and-conquer strategy.
Experiments conducted on the widely utilized ERA5 dataset reveal that our method, utilizing only approximately 15% of trainable parameters during the incremental stage, attains performance that is on par with state-of-the-art competitors.
- Score: 20.84335120477223
- Abstract: Atmospheric science is intricately connected with other fields, e.g., geography and aerospace. Most existing approaches involve training a joint atmospheric and geographic model from scratch, which incurs significant computational costs and overlooks the potential for incremental learning of weather variables across different domains. In this paper, we introduce incremental learning to weather forecasting and propose a novel structure that allows for the flexible expansion of variables within the model. Specifically, our method presents a Channel-Adapted MoE (CA-MoE) that employs a divide-and-conquer strategy. This strategy assigns variable training tasks to different experts by index embedding and reduces computational complexity through a channel-wise Top-K strategy. Experiments conducted on the widely utilized ERA5 dataset reveal that our method, utilizing only approximately 15% of trainable parameters during the incremental stage, attains performance that is on par with state-of-the-art competitors. Notably, in the context of variable incremental experiments, our method demonstrates negligible issues with catastrophic forgetting.
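The abstract only outlines the routing idea, so the following is a minimal illustrative sketch of channel-wise Top-K gating, assuming each variable (channel) has an index embedding that is scored against learned expert embeddings; the names and shapes are assumptions, not the paper's implementation.

```python
import numpy as np

def channel_topk_gate(channel_emb, expert_emb, k=2):
    """Route each channel (variable) to its Top-K experts.

    channel_emb: (C, D) index embeddings, one per variable
    expert_emb:  (E, D) one embedding per expert
    Returns (C, E) sparse routing weights (rows sum to 1).
    """
    scores = channel_emb @ expert_emb.T                 # (C, E) affinity
    topk = np.argsort(scores, axis=1)[:, -k:]           # indices of k best experts
    mask = np.zeros_like(scores)
    np.put_along_axis(mask, topk, 1.0, axis=1)
    # softmax restricted to the selected experts
    masked = np.where(mask > 0, scores, -np.inf)
    w = np.exp(masked - masked.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
C, E, D = 8, 4, 16   # channels (variables), experts, embedding dim
w = channel_topk_gate(rng.normal(size=(C, D)), rng.normal(size=(E, D)), k=2)
```

Because each channel activates only k of E experts, the per-channel compute scales with k rather than E, which is the source of the reduced computational complexity the abstract mentions.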
Related papers
- Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region [62.09891513612252]
We focus on limited-area modeling and train our model specifically for localized region-level downstream tasks.
We consider the MENA region due to its unique climatic challenges, where accurate localized weather forecasting is crucial for managing water resources, agriculture and mitigating the impacts of extreme weather events.
Our study aims to validate the effectiveness of integrating parameter-efficient fine-tuning (PEFT) methodologies, specifically Low-Rank Adaptation (LoRA) and its variants, to enhance forecast accuracy, as well as training speed, computational resource utilization, and memory efficiency in weather and climate modeling for specific regions.
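The cited study applies LoRA rather than defining it, so here is a generic sketch of the LoRA idea it builds on: a frozen weight plus a trainable low-rank update, with illustrative (assumed) dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # zero-init up-projection, so the delta starts at 0

def lora_forward(x):
    # frozen path plus low-rank update, scaled by alpha / r
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

x = rng.normal(size=(3, d_in))
y0 = lora_forward(x)
# trainable fraction of parameters relative to full fine-tuning
frac = r * (d_in + d_out) / (d_in * d_out)
```

Only A and B are trained, so the trainable fraction is r(d_in + d_out) / (d_in · d_out), which is where the memory and compute savings for region-specific adaptation come from.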
arXiv Detail & Related papers (2024-09-11T19:31:56Z) - Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts [75.85448576746373]
We propose a method of grouping and pruning similar experts to improve the model's parameter efficiency.
We validate the effectiveness of our method by pruning three state-of-the-art MoE architectures.
The evaluation shows that our method outperforms other model pruning methods on a range of natural language tasks.
arXiv Detail & Related papers (2024-07-12T17:25:02Z) - VarteX: Enhancing Weather Forecast through Distributed Variable Representation [5.2980803808373516]
Recent data-driven models utilizing deep learning have outperformed numerical weather prediction in forecasting performance.
This study proposes a new variable aggregation scheme and an efficient learning framework for that challenge.
arXiv Detail & Related papers (2024-06-28T02:42:30Z) - Analyzing and Exploring Training Recipes for Large-Scale Transformer-Based Weather Prediction [1.3194391758295114]
We show that it is possible to attain high forecast skill even with relatively off-the-shelf architectures, simple training procedures, and moderate compute budgets.
Specifically, we train a minimally modified SwinV2 transformer on ERA5 data, and find that it attains superior forecast skill when compared against IFS.
arXiv Detail & Related papers (2024-04-30T15:30:14Z) - MetaSD: A Unified Framework for Scalable Downscaling of Meteorological Variables in Diverse Situations [8.71735078449217]
This paper proposes a unified downscaling approach leveraging meta-learning.
The trained variables consisted of temperature, wind, surface pressure, and total precipitation from ERA5 and GFS.
The proposed method can be extended to downscale convective precipitation, potential energy, height, and humidity from CFS, S2S, and CMIP6 at different temporal scales.
arXiv Detail & Related papers (2024-04-26T06:31:44Z) - ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z) - Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics [96.9177297872723]
We present a novel method for guaranteeing linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
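The antisymmetry trick in this summary can be illustrated without the full convolutional architecture: if pairwise interactions come from an odd kernel, forces cancel in pairs and total linear momentum is exactly conserved. The kernel below is a toy stand-in, not the paper's learned layer.

```python
import numpy as np

def pairwise_forces(pos, kernel):
    """Sum pairwise interactions; an odd kernel gives f_ij = -f_ji."""
    n = len(pos)
    F = np.zeros_like(pos)
    for i in range(n):
        for j in range(n):
            if i != j:
                F[i] += kernel(pos[j] - pos[i])
    return F

# odd in d: odd(-d) = -odd(d), the hard constraint behind momentum conservation
odd = lambda d: d * np.exp(-np.dot(d, d))

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 2))
F = pairwise_forces(pos, odd)
# net force sums to zero, so total linear momentum cannot drift
```

Summing over all ordered pairs, kernel(d) + kernel(-d) = 0 for an odd kernel, so the net force vanishes by construction rather than by training.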
arXiv Detail & Related papers (2022-10-12T09:12:59Z) - Data-Driven Evaluation of Training Action Space for Reinforcement Learning [1.370633147306388]
This paper proposes a Shapley-inspired methodology for training action space categorization and ranking.
To avoid exponential-time exact Shapley computation, the methodology includes a Monte Carlo simulation.
The proposed data-driven methodology is applicable across different domains, use cases, and reinforcement learning algorithms.
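The Monte Carlo approximation mentioned above is standard: average each player's marginal contribution over random permutations instead of enumerating all 2^n coalitions. This is a generic sketch with a toy value function, not the paper's ranking pipeline.

```python
import random

def shapley_mc(players, value_fn, n_samples=2000, seed=0):
    """Monte Carlo Shapley estimate: mean marginal contribution
    of each player over randomly ordered coalitions."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        perm = players[:]
        rng.shuffle(perm)
        coalition, prev = set(), value_fn(frozenset())
        for p in perm:
            coalition.add(p)
            cur = value_fn(frozenset(coalition))
            phi[p] += cur - prev   # marginal contribution of p
            prev = cur
    return {p: v / n_samples for p, v in phi.items()}

# toy additive game: each action contributes a fixed amount,
# so the Shapley value of each action equals its own weight
weights = {"a": 3.0, "b": 1.0, "c": 0.0}
phi = shapley_mc(list(weights), lambda s: sum(weights[p] for p in s))
```

Each sampled permutation costs n value-function calls, so the estimate runs in O(n · n_samples) instead of exponential time.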
arXiv Detail & Related papers (2022-04-08T04:53:43Z) - Reservoir Computing as a Tool for Climate Predictability Studies [0.0]
We show that Reservoir Computing provides an alternative nonlinear approach that improves on the predictive skill of the Linear-Inverse-Modeling approach.
The improved predictive skill of the RC approach over a wide range of conditions suggests that this machine-learning technique may have a use in climate predictability studies.
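Reservoir Computing as described here keeps the recurrent network fixed and trains only a linear readout. The echo state network below is a generic minimal sketch (reservoir size, spectral radius, and ridge penalty are illustrative choices), fitted in-sample to one-step-ahead prediction of a sine wave.

```python
import numpy as np

def esn_states(u, n_res=50, rho=0.9, seed=0):
    """Drive a fixed random reservoir with input series u; return states."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_res, n_res))
    W *= rho / np.abs(np.linalg.eigvals(W)).max()   # rescale spectral radius
    w_in = rng.normal(size=n_res)
    x = np.zeros(n_res)
    states = []
    for ut in u:
        x = np.tanh(W @ x + w_in * ut)              # only this recurrence runs; W stays fixed
        states.append(x.copy())
    return np.array(states)

# train a ridge-regression readout to predict the next input value
u = np.sin(np.linspace(0, 20, 200))
X, y = esn_states(u[:-1]), u[1:]
w_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(X.shape[1]), X.T @ y)
pred = X @ w_out
mse = np.mean((pred - y) ** 2)
```

Because only w_out is fitted (a single linear solve), training is far cheaper than backpropagation through time, which is part of the technique's appeal for long climate simulations.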
arXiv Detail & Related papers (2021-02-24T22:22:59Z) - Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures a certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z) - Unsupervised Dense Shape Correspondence using Heat Kernels [50.682560435495034]
We propose an unsupervised method for learning dense correspondences between shapes using a recent deep functional map framework.
Instead of depending on ground-truth correspondences or the computationally expensive geodesic distances, we use heat kernels.
We present the results of our method on different benchmarks which have various challenges like partiality, topological noise and different connectivity.
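The heat kernels this summary refers to are cheap spectral quantities of the shape's Laplacian. As a small illustration (using a graph Laplacian on a path graph as a stand-in for a mesh Laplacian), the kernel K_t = U exp(-tΛ) Uᵀ can be computed from one eigendecomposition:

```python
import numpy as np

def heat_kernel(L, t):
    """Spectral heat kernel K_t = U exp(-t Lambda) U^T for symmetric Laplacian L."""
    lam, U = np.linalg.eigh(L)
    return (U * np.exp(-t * lam)) @ U.T

# path graph on 4 nodes: Laplacian L = D - A
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
K = heat_kernel(L, t=0.5)
```

Unlike geodesic distances, this requires no shortest-path computation, and the kernel stays well defined under the partiality and topological noise mentioned above, which is why it makes a robust correspondence descriptor.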
arXiv Detail & Related papers (2020-10-23T21:54:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.