OceanForecastBench: A Benchmark Dataset for Data-Driven Global Ocean Forecasting
- URL: http://arxiv.org/abs/2511.18732v1
- Date: Mon, 24 Nov 2025 03:57:43 GMT
- Title: OceanForecastBench: A Benchmark Dataset for Data-Driven Global Ocean Forecasting
- Authors: Haoming Jia, Yi Han, Xiang Wang, Huizan Wang, Wei Wu, Jianming Zheng, Peikun Xiao,
- Abstract summary: Data-driven deep learning-based ocean forecast models have demonstrated significant potential in capturing complex ocean dynamics. Despite these advancements, the absence of open-source benchmarks has led to inconsistent data usage and evaluation methods. We propose OceanForecastBench, a benchmark offering three core contributions.
- Score: 10.30935627811107
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Global ocean forecasting aims to predict key ocean variables such as temperature, salinity, and currents, which is essential for understanding and describing oceanic phenomena. In recent years, data-driven deep learning-based ocean forecast models, such as XiHe, WenHai, LangYa and AI-GOMS, have demonstrated significant potential in capturing complex ocean dynamics and improving forecasting efficiency. Despite these advancements, the absence of open-source, standardized benchmarks has led to inconsistent data usage and evaluation methods. This gap hinders efficient model development, impedes fair performance comparison, and constrains interdisciplinary collaboration. To address this challenge, we propose OceanForecastBench, a benchmark offering three core contributions: (1) High-quality global ocean reanalysis data spanning 28 years for model training, including 4 ocean variables across 23 depth levels and 4 sea surface variables. (2) High-reliability satellite and in-situ observations for model evaluation, covering approximately 100 million locations in the global ocean. (3) An evaluation pipeline and a comprehensive benchmark with 6 typical baseline models, leveraging observations to evaluate model performance from multiple perspectives. OceanForecastBench represents the most comprehensive benchmarking framework currently available for data-driven ocean forecasting, offering an open-source platform for model development, evaluation, and comparison. The dataset and code are publicly available at: https://github.com/Ocean-Intelligent-Forecasting/OceanForecastBench.
Related papers
- Eddy-Resolving Global Ocean Forecasting with Multi-Scale Graph Neural Networks [0.0]
This study proposes a multi-scale graph neural network-based ocean model for 10-day global forecasting. The model employs an encoder-processor-decoder architecture and uses two spherical meshes with different resolutions to better capture the multi-scale nature of ocean dynamics.
arXiv Detail & Related papers (2026-01-19T07:11:08Z)
- OceanSAR-2: A Universal Feature Extractor for SAR Ocean Observation [55.978228064498865]
We present OceanSAR-2, the second generation of our foundation model for SAR-based ocean observation. Building on our earlier release, which pioneered self-supervised learning on Sentinel-1 Wave Mode data, OceanSAR-2 relies on improved SSL training and dynamic data curation strategies.
arXiv Detail & Related papers (2026-01-12T10:20:43Z)
- Neural ocean forecasting from sparse satellite-derived observations: a case-study for SSH dynamics and altimetry data [25.95895236084694]
We present an end-to-end deep learning framework for short-term forecasting of global sea surface dynamics based on sparse satellite altimetry data. Our framework is developed within the OceanBench initiative, promoting standardized evaluation in ocean machine learning.
arXiv Detail & Related papers (2025-12-15T11:28:03Z)
- Deep learning-based object detection of offshore platforms on Sentinel-1 Imagery and the impact of synthetic training data [0.3823356975862005]
Development of robust models for offshore infrastructure detection relies on comprehensive, balanced datasets. By training deep learning-based YOLOv10 object detection models with a combination of synthetic and real Sentinel-1 satellite imagery, this study investigates the use of synthetic data to enhance model performance. In total, 3,529 offshore platforms were detected, including 411 in the North Sea, 1,519 in the Gulf of Mexico, and 1,593 in the Persian Gulf.
arXiv Detail & Related papers (2025-11-06T12:13:53Z)
- Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection [54.1960918379255]
Neptune-X is a data-centric generative-selection framework for maritime object detection. X-to-Maritime is a multi-modality-conditioned generative model that synthesizes diverse and realistic maritime scenes. Our approach sets a new benchmark in maritime scene synthesis, significantly improving detection accuracy.
arXiv Detail & Related papers (2025-09-25T04:59:02Z)
- MedFormer: a data-driven model for forecasting the Mediterranean Sea [26.196897593411332]
We present MedFormer, a fully data-driven deep learning model for medium-range ocean forecasting in the Mediterranean Sea. The model is trained on 20 years of daily ocean reanalysis data and fine-tuned with high-resolution operational analyses. We benchmark MedFormer against the state-of-the-art Mediterranean Forecasting System (MedFS), developed at Euro-Mediterranean Center on Climate Change.
arXiv Detail & Related papers (2025-08-16T09:05:03Z)
- OKG-LLM: Aligning Ocean Knowledge Graph with Observation Data via LLMs for Global Sea Surface Temperature Prediction [70.48962924608033]
This work presents the first systematic effort to construct an Ocean Knowledge Graph (OKG) specifically designed to represent diverse ocean knowledge for SST prediction. We develop a graph embedding network to learn the comprehensive semantic and structural knowledge within the OKG, capturing both the unique characteristics of individual sea regions and the complex correlations between them. Finally, we align the learned knowledge with fine-grained numerical SST data and leverage a pre-trained LLM to model SST patterns for accurate prediction.
arXiv Detail & Related papers (2025-07-31T02:06:03Z)
- FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution [9.106159985605009]
FuXi-Ocean is the first data-driven global ocean forecasting model achieving six-hourly predictions at eddy-resolving 1/12° spatial resolution. The model architecture integrates a context-aware feature extraction module with a predictive network employing stacked attention blocks. FuXi-Ocean demonstrates superior skill in predicting key variables, including temperature, salinity, and currents, across multiple depths.
arXiv Detail & Related papers (2025-06-03T00:52:31Z)
- Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a new sandbox suite tailored for integrated data-model co-development. This sandbox provides a feedback-driven experimental platform, enabling cost-effective and guided refinement of both data and models.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- Data-driven Global Ocean Modeling for Seasonal to Decadal Prediction [39.7461632644892]
ORCA-DL is the first data-driven 3D ocean model for seasonal to decadal prediction of global ocean circulation.
It accurately simulates three-dimensional ocean dynamics and outperforms state-of-the-art dynamical models.
It stably emulates ocean dynamics at decadal timescales, demonstrating its potential even for skillful decadal predictions and climate projections.
arXiv Detail & Related papers (2024-05-24T10:23:17Z)
- Deep Learning for Day Forecasts from Sparse Observations [60.041805328514876]
Deep neural networks offer an alternative paradigm for modeling weather conditions.
MetNet-3 learns from both dense and sparse data sensors and makes predictions up to 24 hours ahead for precipitation, wind, temperature and dew point.
MetNet-3 has a high temporal and spatial resolution, respectively, up to 2 minutes and 1 km as well as a low operational latency.
arXiv Detail & Related papers (2023-06-06T07:07:54Z)
- Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast [91.9372563527801]
We present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
For the first time, an AI-based method outperforms state-of-the-art numerical weather prediction (NWP) methods in terms of accuracy.
Pangu-Weather supports a wide range of downstream forecast scenarios, including extreme weather forecast and large-member ensemble forecast in real-time.
arXiv Detail & Related papers (2022-11-03T17:19:43Z)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.