WeatherBench: A benchmark dataset for data-driven weather forecasting
- URL: http://arxiv.org/abs/2002.00469v3
- Date: Thu, 11 Jun 2020 19:13:22 GMT
- Title: WeatherBench: A benchmark dataset for data-driven weather forecasting
- Authors: Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn,
Soukayna Mouatadid, Nils Thuerey
- Abstract summary: We present a benchmark dataset for data-driven medium-range weather forecasting.
We provide data derived from the ERA5 archive, processed to facilitate its use in machine learning models.
We provide baseline scores from simple linear regression techniques, deep learning models, as well as purely physical forecasting models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven approaches, most prominently deep learning, have become powerful
tools for prediction in many domains. A natural question to ask is whether
data-driven methods could also be used to predict global weather patterns days
in advance. First studies show promise but the lack of a common dataset and
evaluation metrics makes inter-comparison between studies difficult. Here we
present a benchmark dataset for data-driven medium-range weather forecasting, a
topic of high scientific interest for atmospheric and computer scientists
alike. We provide data derived from the ERA5 archive that has been processed to
facilitate the use in machine learning models. We propose simple and clear
evaluation metrics which will enable a direct comparison between different
methods. Further, we provide baseline scores from simple linear regression
techniques, deep learning models, as well as purely physical forecasting
models. The dataset is publicly available at
https://github.com/pangeo-data/WeatherBench and the companion code is
reproducible with tutorials for getting started. We hope that this dataset will
accelerate research in data-driven weather forecasting.
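The abstract refers to "simple and clear evaluation metrics" without defining them here. For gridded global fields, one widely used choice (and, to our understanding, the headline metric of the WeatherBench paper) is a latitude-weighted RMSE, which down-weights the small grid cells near the poles. The sketch below illustrates that idea only; it is not the companion repository's code. The function name `lat_weighted_rmse`, the `(time, lat, lon)` array layout, and the toy 32 x 64 grid are assumptions made for the example.

```python
import numpy as np


def lat_weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE, averaged over forecast times.

    forecast, truth: arrays of shape (time, lat, lon) (assumed layout).
    lats_deg: 1-D array of grid latitudes in degrees, length = lat dimension.
    """
    # Weight each latitude row by cos(latitude), normalised to mean 1,
    # so that rows near the poles contribute less than rows near the equator.
    weights = np.cos(np.deg2rad(lats_deg))
    weights = weights / weights.mean()

    squared_error = (forecast - truth) ** 2
    # Weighted spatial mean of the squared error for every time step.
    weighted_mse = (squared_error * weights[None, :, None]).mean(axis=(1, 2))
    # RMSE per time step, then the mean over all time steps.
    return float(np.sqrt(weighted_mse).mean())


# Toy example on a hypothetical 32 x 64 grid with synthetic data.
lats = np.linspace(-87.1875, 87.1875, 32)
truth = np.zeros((3, 32, 64))
forecast = truth + 2.0  # constant error of 2 everywhere
score = lat_weighted_rmse(forecast, truth, lats)
print(score)  # approximately 2.0, since the weights average to 1
```

Because the weights are normalised to a mean of one, a spatially uniform error yields the same value as an unweighted RMSE; the weighting only changes the score when errors differ between latitudes.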
Related papers
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
(2024-11-24T04:56:45Z)
- WeatherReal: A Benchmark Based on In-Situ Observations for Evaluating Weather Models [11.016845506758841]
We introduce WeatherReal, a novel benchmark dataset for weather forecasting derived from global near-surface in-situ observations.
This paper details the sources and processing methodologies underlying the dataset, and illustrates the advantage of in-situ observations in capturing hyper-local and extreme weather.
Our work aims to advance the AI-based weather forecasting research towards a more application-focused and operation-ready approach.
(2024-09-14T08:53:46Z)
- Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling [55.13352174687475]
This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which Generalizes weather forecasts to Finer-grained Temporal scales.
Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale.
We introduce a lead time-aware training framework to promote the generalization of the model at different lead times.
(2024-05-22T16:21:02Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
(2023-10-08T08:09:51Z)
- ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
(2023-01-24T23:19:01Z)
- GraphCast: Learning skillful medium-range global weather forecasting [107.40054095223779]
We introduce a machine learning-based method called "GraphCast", which can be trained directly from reanalysis data.
It predicts hundreds of weather variables, over 10 days at 0.25 degree resolution globally, in under one minute.
We show that GraphCast significantly outperforms the most accurate operational deterministic systems on 90% of 1380 verification targets.
(2022-12-24T18:15:39Z)
- Physics Informed Shallow Machine Learning for Wind Speed Prediction [66.05661813632568]
We analyze a massive dataset of wind measured from anemometers located at 10 m height in 32 locations in Italy.
We train supervised learning algorithms using the past history of wind to predict its value at a future time.
We find that the optimal design as well as its performance vary with the location.
(2022-04-01T14:55:10Z)
- SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking [20.442879707675115]
SubseasonalClimateUSA is a curated dataset for training and benchmarking subseasonal forecasting models in the United States.
We use this dataset to benchmark a diverse suite of models, including operational dynamical models, classical meteorological baselines, and ten state-of-the-art machine learning and deep learning-based methods from the literature.
(2021-09-21T18:42:10Z)
- RainBench: Towards Global Precipitation Forecasting from Satellite Imagery [6.462260770989231]
Extreme precipitation events routinely ravage economies and livelihoods around the developing world.
Data-driven deep learning approaches could widen the access to accurate multi-day forecasts.
There is currently no benchmark dataset dedicated to the study of global precipitation forecasts.
(2020-12-17T15:35:24Z)