Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime
- URL: http://arxiv.org/abs/2504.12000v2
- Date: Fri, 29 Aug 2025 15:52:26 GMT
- Title: Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime
- Authors: Thorben Markmann, Michiel Straat, Sebastian Peitz, Barbara Hammer
- Abstract summary: We study the effectiveness of Reinforcement Learning (RL) for reducing convective heat transfer under increasing turbulence. RL agents trained via single-agent Proximal Policy Optimization (PPO) are compared to linear proportional-derivative (PD) controllers. The RL agents reduced convection, measured by the Nusselt number, by up to 33% in moderately turbulent systems and 10% in highly turbulent settings.
- Score: 6.619254876970774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven flow control has significant potential for industry, energy systems, and climate science. In this work, we study the effectiveness of Reinforcement Learning (RL) for reducing convective heat transfer in the 2D Rayleigh-Bénard Convection (RBC) system under increasing turbulence. We investigate the generalizability of control across varying initial conditions and turbulence levels and introduce a reward shaping technique to accelerate the training. RL agents trained via single-agent Proximal Policy Optimization (PPO) are compared to linear proportional-derivative (PD) controllers from classical control theory. The RL agents reduced convection, measured by the Nusselt number, by up to 33% in moderately turbulent systems and 10% in highly turbulent settings, clearly outperforming PD control in all settings. The agents showed strong generalization performance across different initial conditions and, to a significant extent, generalized to higher degrees of turbulence. The reward shaping improved sample efficiency and consistently stabilized the Nusselt number at higher turbulence levels.
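To make the comparison concrete, the PD baseline and the shaped reward can be sketched as below. This is a minimal illustration, not the authors' implementation: the sensed signal, the actuation convention, the gains, and the shaping coefficient `alpha` are all hypothetical.

```python
def pd_control(temp_fluct, prev_fluct, dt, kp=1.0, kd=0.1):
    # Hypothetical PD law: actuate the heated boundary against the
    # measured temperature fluctuation and its finite-difference rate.
    deriv = (temp_fluct - prev_fluct) / dt
    return -(kp * temp_fluct + kd * deriv)

def shaped_reward(nusselt, nusselt_prev, alpha=0.1):
    # Sketch of a shaped reward: penalize the instantaneous Nusselt
    # number and add a bonus proportional to its step-to-step decrease.
    return -nusselt + alpha * (nusselt_prev - nusselt)
```

Since a pure-conduction state corresponds to a Nusselt number of 1, the unshaped penalty term is bounded above by -1 in this sketch; the shaping term only rewards progress between steps.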
Related papers
- Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning [88.42566960813438]
CalibRL is a hybrid-policy RLVR framework that supports controllable exploration with expert guidance. CalibRL increases policy entropy in a guided manner and clarifies the target distribution. Experiments across eight benchmarks, including both in-domain and out-of-domain settings, demonstrate consistent improvements.
arXiv Detail & Related papers (2026-02-22T07:23:36Z) - Making Tunable Parameters State-Dependent in Weather and Climate Models with Reinforcement Learning [0.5131152350448099]
This study presents a framework that learns components of parametrisation schemes online. It evaluates the resulting RL-driven parameter updates across a hierarchy of idealised testbeds. Results highlight the ability of RL to deliver skilful, state-dependent, and regime-aware parametrisations.
arXiv Detail & Related papers (2026-01-07T11:19:16Z) - Trust-Region Adaptive Policy Optimization [82.09255251747818]
Post-training methods play an important role in improving large language models' (LLMs) complex reasoning abilities. We introduce TRAPO, a framework that interleaves Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) within each training instance. Experiments on five mathematical reasoning benchmarks show that TRAPO consistently surpasses standard SFT, RL, and SFT-then-RL pipelines.
arXiv Detail & Related papers (2025-12-19T14:37:07Z) - Improving the Robustness of Control of Chaotic Convective Flows with Domain-Informed Reinforcement Learning [6.619254876970774]
Chaotic convective flows arise in many real-world systems, such as microfluidic devices and chemical reactors. In this work, we improve the practical feasibility of RL-based control of such flows, focusing on Rayleigh-Bénard Convection. We incorporate domain knowledge in the reward function via a term that encourages Bénard cell merging, as an example of a desirable macroscopic property. Our results show that the domain-informed reward design results in steady flows, faster convergence during training, and generalization across flow regimes without retraining.
arXiv Detail & Related papers (2025-10-31T21:45:40Z) - Physics-informed Neural-operator Predictive Control for Drag Reduction in Turbulent Flows [109.99020160824553]
We propose an efficient deep reinforcement learning framework for modeling and control of turbulent flows. It is a model-based RL approach to predictive control (PC), in which the policy and the observer models for turbulence control are learned jointly. We find that PINO-PC achieves a drag reduction of 39.0% at a bulk-velocity Reynolds number of 15,000, outperforming previous fluid control methods by more than 32%.
arXiv Detail & Related papers (2025-10-03T00:18:26Z) - Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle [53.239242017802056]
Reinforcement learning (RL) has emerged as an effective post-training paradigm for enhancing the reasoning capabilities of multimodal large language models (MLLMs). However, current RL pipelines often suffer from training inefficiencies caused by two underexplored issues: Advantage Collapsing and Rollout Silencing. We propose Shuffle-R1, a simple yet principled framework that improves RL fine-tuning efficiency by dynamically restructuring trajectory sampling and batch composition.
arXiv Detail & Related papers (2025-08-07T17:53:47Z) - Diffusion Guidance Is a Controllable Policy Improvement Operator [98.11511661904618]
CFGRL is trained with the simplicity of supervised learning, yet can further improve on the policies in the data. On offline RL tasks, we observe a reliable trend: increased guidance weighting leads to increased performance.
arXiv Detail & Related papers (2025-05-29T14:06:50Z) - Transfer learning-enhanced deep reinforcement learning for aerodynamic airfoil optimisation subject to structural constraints [1.5468177185307304]
This paper introduces a transfer learning-enhanced deep reinforcement learning (DRL) methodology that is able to optimise the geometry of any airfoil. To showcase the method, we aim to maximise the lift-to-drag ratio $C_L/C_D$ while preserving the structural integrity of the airfoil. The performance of the DRL agent is compared with Particle Swarm Optimisation (PSO), a traditional gradient-free method.
arXiv Detail & Related papers (2025-05-05T13:26:11Z) - Nuclear Microreactor Control with Deep Reinforcement Learning [0.40498500266986387]
This study explores the application of deep reinforcement learning (RL) for real-time drum control in microreactors. RL controllers can achieve similar or even superior load-following performance as traditional proportional-integral-derivative (PID) controllers.
arXiv Detail & Related papers (2025-03-31T19:11:19Z) - Improving a Proportional Integral Controller with Reinforcement Learning on a Throttle Valve Benchmark [2.8322124733515666]
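For reference, the PID baseline mentioned above follows the standard discrete form; a minimal sketch, with the gains, timestep, and signal names purely hypothetical rather than taken from the paper:

```python
class PID:
    # Textbook discrete PID controller (illustrative, not the paper's code).
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt            # accumulate the I term
        deriv = (err - self.prev_err) / self.dt   # finite-difference D term
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv
```

The RL-versus-PID comparisons in these papers typically pit a learned policy against a tuned instance of exactly this loop.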
This paper presents a learning-based control strategy for non-linear throttle valves with an asymmetric controller.
We exploit the recent advances in Reinforcement Learning with Guides to improve the closed-loop behavior by learning from the additional interactions with the valve.
In all the experimental test cases, the resulting agent has a better sample efficiency than traditional RL agents and outperforms the PI controller.
arXiv Detail & Related papers (2024-02-21T09:40:26Z) - Reinforcement learning to maximise wind turbine energy generation [0.8437187555622164]
We propose a reinforcement learning strategy to control wind turbine energy generation by actively changing the rotor speed, the rotor yaw angle and the blade pitch angle.
A double deep Q-learning with a prioritized experience replay agent is coupled with a blade element momentum model and is trained to allow control for changing winds.
The agent is trained to decide the best control (speed, yaw, pitch) for simple steady winds and is subsequently challenged with real dynamic turbulent winds, showing good performance.
arXiv Detail & Related papers (2024-02-17T21:35:13Z) - An experimental evaluation of Deep Reinforcement Learning algorithms for HVAC control [40.71019623757305]
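The double deep Q-learning update referenced in the wind-turbine entry decouples action selection (online network) from action evaluation (target network); a minimal sketch, with hypothetical Q-tables standing in for the networks:

```python
def double_dqn_target(q_online, q_target, next_state, reward, gamma=0.99):
    # The online network picks the greedy action; the target network scores it.
    n_actions = len(q_online[next_state])
    a_star = max(range(n_actions), key=lambda a: q_online[next_state][a])
    return reward + gamma * q_target[next_state][a_star]
```

Decoupling selection from evaluation is what reduces the overestimation bias of vanilla Q-learning; prioritized experience replay then reweights which transitions feed this target.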
Recent studies have shown that Deep Reinforcement Learning (DRL) algorithms can outperform traditional reactive controllers.
This paper provides a critical and reproducible evaluation of several state-of-the-art DRL algorithms for HVAC control.
arXiv Detail & Related papers (2024-01-11T08:40:26Z) - Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression [70.78523583702209]
We study training instabilities of behavior cloning with deep neural networks.
We observe that minibatch SGD updates to the policy network during training result in sharp oscillations in long-horizon rewards.
arXiv Detail & Related papers (2023-10-17T17:39:40Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Surrogate Empowered Sim2Real Transfer of Deep Reinforcement Learning for ORC Superheat Control [12.567922037611261]
This paper proposes a Sim2Real transfer learning-based DRL control method for ORC superheat control.
Experimental results show that the proposed method greatly improves the training speed of DRL in ORC control problems.
arXiv Detail & Related papers (2023-08-05T01:59:44Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice, learned optimizers from supervised settings do not work well even in simple RL tasks.
The agent-gradient distribution is non-independent and identically distributed, leading to inefficient meta-training.
We show that, although only trained in toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z) - Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z) - Understanding the Difficulty of Training Transformers [120.99980924577787]
We show that unbalanced gradients are not the root cause of the instability of training.
We propose Admin to stabilize the early stage's training and unleash its full potential in the late stage.
arXiv Detail & Related papers (2020-04-17T13:59:07Z) - Controlling Rayleigh-Bénard convection via Reinforcement Learning [62.997667081978825]
The identification of effective control strategies to suppress or enhance the convective heat exchange under fixed external thermal gradients is an outstanding fundamental and technological issue.
In this work, we explore a novel approach, based on a state-of-the-art Reinforcement Learning (RL) algorithm.
We show that our RL-based control is able to stabilize the conductive regime and delay the onset of convection to a higher Rayleigh number.
arXiv Detail & Related papers (2020-03-31T16:39:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.