Related papers: WAKESET: A Large-Scale, High-Reynolds Number Flow Dataset for Machine Learning of Turbulent Wake Dynamics

URL: http://arxiv.org/abs/2602.01379v1
Date: Sun, 01 Feb 2026 18:27:10 GMT
Title: WAKESET: A Large-Scale, High-Reynolds Number Flow Dataset for Machine Learning of Turbulent Wake Dynamics
Authors: Zachary Cooper-Baldock, Paulo E. Santos, Russell S. A. Brinkworth, Karl Sammut
Abstract summary: This paper introduces WAKESET, a novel, large-scale CFD dataset of highly turbulent flows. The dataset captures the complex hydrodynamic interactions during the underwater recovery of an autonomous underwater vehicle. It comprises 1,091 high-fidelity Reynolds-Averaged Navier-Stokes simulations, augmented to 4,364 instances, covering a wide operational envelope of speeds.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine learning (ML) offers transformative potential for computational fluid dynamics (CFD), promising to accelerate simulations, improve turbulence modelling, and enable real-time flow prediction and control, capabilities that could fundamentally change how engineers approach fluid dynamics problems. However, the exploration of ML in fluid dynamics is critically hampered by the scarcity of large, diverse, and high-fidelity datasets suitable for training robust models. This limitation is particularly acute for highly turbulent flows, which dominate practical engineering applications yet remain computationally prohibitive to simulate at scale. High-Reynolds number turbulent datasets are essential for ML models to learn the complex, multi-scale physics characteristic of real-world flows, enabling generalisation beyond the simplified, low-Reynolds number regimes often represented in existing datasets. This paper introduces WAKESET, a novel, large-scale CFD dataset of highly turbulent flows, designed to address this critical gap. The dataset captures the complex hydrodynamic interactions during the underwater recovery of an autonomous underwater vehicle by a larger extra-large uncrewed underwater vehicle. It comprises 1,091 high-fidelity Reynolds-Averaged Navier-Stokes simulations, augmented to 4,364 instances, covering a wide operational envelope of speeds (up to Reynolds numbers of 1.09 × 10^8) and turning angles. This work details the motivation for this new dataset by reviewing existing resources, outlines the hydrodynamic modelling and validation underpinning its creation, and describes its structure. The dataset's focus on a practical engineering problem, its scale, and its high turbulence characteristics make it a valuable resource for developing and benchmarking ML models for flow field prediction, surrogate modelling, and autonomous navigation in complex underwater environments.

Related papers

Differentiable multiphase flow model for physics-informed machine learning in reservoir pressure management [44.41703936689344]
We introduce a physics-informed machine learning workflow that couples a fully differentiable multiphase flow simulator. The CNN learns to predict fluid extraction rates from heterogeneous permeability fields to enforce pressure limits at critical reservoir locations. We demonstrate that high-accuracy training can be achieved with fewer than three thousand full-physics multiphase flow simulations.
arXiv Detail & Related papers (2025-08-26T20:38:02Z)
AI-Enhanced Automatic Design of Efficient Underwater Gliders [60.45821679800442]
Building an automated design framework is challenging due to the complexities of representing glider shapes and the high computational costs associated with modeling complex solid-fluid interactions. We introduce an AI-enhanced automated computational framework designed to overcome these limitations by enabling the creation of underwater robots with non-trivial hull shapes. Our approach involves an algorithm that co-optimizes both shape and control signals, utilizing a reduced-order geometry representation and a differentiable neural-network-based fluid surrogate model.
arXiv Detail & Related papers (2025-04-30T23:55:44Z)
Learning Effective Dynamics across Spatio-Temporal Scales of Complex Flows [4.798951413107239]
We propose a novel framework, Graph-based Learning of Effective Dynamics (Graph-LED), that leverages graph neural networks (GNNs) and an attention-based autoregressive model. We evaluate the proposed approach on a suite of fluid dynamics problems, including flow past a cylinder and flow over a backward-facing step over a range of Reynolds numbers.
arXiv Detail & Related papers (2025-02-11T22:14:30Z)
Physically Interpretable Representation and Controlled Generation for Turbulence Data [39.42376941186934]
This paper proposes a data-driven approach to encode high-dimensional scientific data into low-dimensional, physically meaningful representations. We validate our approach using 2D Navier-Stokes simulations of flow past a cylinder over a range of Reynolds numbers.
arXiv Detail & Related papers (2025-01-31T17:51:14Z)
Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model [49.06911227670408]
We show that SciML foundation model can significantly improve the data efficiency of inferring real-world 3D fluid dynamics with improved generalization. We equip neural fluid fields with a novel collaborative training approach that utilizes augmented views and fluid features extracted by our foundation model.
arXiv Detail & Related papers (2024-12-18T14:39:43Z)
Physics-enhanced Neural Operator for Simulating Turbulent Transport [9.923888452768919]
This paper presents a physics-enhanced neural operator (PENO) that incorporates physical knowledge of partial differential equations (PDEs) to accurately model flow dynamics. The proposed method is evaluated through its performance on two distinct sets of 3D turbulent flow data.
arXiv Detail & Related papers (2024-05-31T20:05:17Z)
From Zero to Turbulence: Generative Modeling for 3D Flow Simulation [45.626346087828765]
We propose to approach turbulent flow simulation as a generative task directly learning the manifold of all possible turbulent flow states without relying on any initial flow state. Our generative model captures the distribution of turbulent flows caused by unseen objects and generates high-quality, realistic samples for downstream applications.
arXiv Detail & Related papers (2023-05-29T18:20:28Z)
Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows. HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure. Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
Physics-Inspired Temporal Learning of Quadrotor Dynamics for Accurate Model Predictive Trajectory Tracking [76.27433308688592]
Accurately modeling quadrotor's system dynamics is critical for guaranteeing agile, safe, and stable navigation. We present a novel Physics-Inspired Temporal Convolutional Network (PI-TCN) approach to learning quadrotor's system dynamics purely from robot experience. Our approach combines the expressive power of sparse temporal convolutions and dense feed-forward connections to make accurate system predictions.
arXiv Detail & Related papers (2022-06-07T13:51:35Z)
Scientific multi-agent reinforcement learning for wall-models of turbulent flows [5.678337324555036]
We introduce scientific multi-agent reinforcement learning (SciMARL) for the discovery of wall models for large-eddy simulations. The present simulations reduce by several orders of magnitude the computational cost over fully-resolved simulations.
arXiv Detail & Related papers (2021-06-21T14:30:10Z)
Machine learning for rapid discovery of laminar flow channel wall modifications that enhance heat transfer [56.34005280792013]
We present a combination of accurate numerical simulations of arbitrary, flat, and non-flat channels and machine learning models predicting drag coefficient and Stanton number. We show that convolutional neural networks (CNN) can accurately predict the target properties at a fraction of the time of numerical simulations.
arXiv Detail & Related papers (2021-01-19T16:14:02Z)
Automating Turbulence Modeling by Multi-Agent Reinforcement Learning [4.784658158364452]
We introduce multi-agent reinforcement learning as an automated discovery tool of turbulence models. We demonstrate the potential of this approach on Large Eddy Simulations of homogeneous and isotropic turbulence.
arXiv Detail & Related papers (2020-05-18T18:45:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.