Improving the Robustness of Control of Chaotic Convective Flows with Domain-Informed Reinforcement Learning
- URL: http://arxiv.org/abs/2511.00272v1
- Date: Fri, 31 Oct 2025 21:45:40 GMT
- Title: Improving the Robustness of Control of Chaotic Convective Flows with Domain-Informed Reinforcement Learning
- Authors: Michiel Straat, Thorben Markmann, Sebastian Peitz, Barbara Hammer
- Abstract summary: Chaotic convective flows arise in many real-world systems, such as microfluidic devices and chemical reactors. In this work, we improve the practical feasibility of RL-based control of such flows, focusing on Rayleigh-Bénard Convection. We incorporate domain knowledge in the reward function via a term that encourages Bénard cell merging, as an example of a desirable macroscopic property. Our results show that the domain-informed reward design results in steady flows, faster convergence during training, and generalization across flow regimes without retraining.
- Score: 6.619254876970774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chaotic convective flows arise in many real-world systems, such as microfluidic devices and chemical reactors. Stabilizing these flows is highly desirable but remains challenging, particularly in chaotic regimes where conventional control methods often fail. Reinforcement Learning (RL) has shown promise for control in laminar flow settings, but its ability to generalize and remain robust under chaotic and turbulent dynamics is not well explored, despite being critical for real-world deployment. In this work, we improve the practical feasibility of RL-based control of such flows, focusing on Rayleigh-Bénard Convection (RBC), a canonical model for convective heat transport. To enhance generalization and sample efficiency, we introduce domain-informed RL agents that are trained using Proximal Policy Optimization across diverse initial conditions and flow regimes. We incorporate domain knowledge in the reward function via a term that encourages Bénard cell merging, as an example of a desirable macroscopic property. In laminar flow regimes, the domain-informed RL agents reduce convective heat transport by up to 33%, and in chaotic flow regimes, they still achieve a 10% reduction, which is significantly better than the conventional controllers used in practice. We compare the domain-informed agents to uninformed ones: our results show that the domain-informed reward design results in steady flows, faster convergence during training, and generalization across flow regimes without retraining. Our work demonstrates that elegant domain-informed priors can greatly enhance the robustness of RL-based control of chaotic flows, bringing real-world deployment closer.
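The reward-shaping idea in the abstract can be illustrated with a minimal sketch. The paper's actual formulation is not given here; the cell-counting heuristic (sign changes of the vertical velocity at mid-height), the `w_merge` weight, and both function names are assumptions made purely for illustration:

```python
import math

def count_cells(w_midplane):
    """Rough estimate of the number of Benard cells from the vertical
    velocity sampled along a horizontal line at mid-height: each sign
    change marks a boundary between neighbouring counter-rotating cells."""
    signs = [1 if w > 0 else -1 for w in w_midplane if w != 0]
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return max((changes + 1) // 2, 1)

def domain_informed_reward(nusselt, w_midplane, w_merge=0.1):
    """Sketch of a domain-informed reward: the base term penalizes
    convective heat transport (the Nusselt number), and the merging
    bonus grows as the flow organizes into fewer, larger cells."""
    return -nusselt + w_merge / count_cells(w_midplane)

# Synthetic mid-plane velocity with two convection cells (two full
# sine periods over the domain width).
w = [math.sin(x / 10) for x in range(1, 125)]
print(count_cells(w), domain_informed_reward(2.0, w))
```

In a PPO training loop, this scalar would simply replace the uninformed reward `-nusselt` at each environment step; the merging term gives the agent a denser learning signal than heat transport alone.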
Related papers
- Real-Time Generative Policy via Langevin-Guided Flow Matching for Autonomous Driving [22.3805998088591]
DACER-F is a flow matching algorithm for generative policies in autonomous driving systems. It achieves a score of 775.8 on the humanoid-stand task, surpassing prior methods.
arXiv Detail & Related papers (2026-03-03T05:35:53Z) - Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning [88.42566960813438]
CalibRL is a hybrid-policy RLVR framework that supports controllable exploration with expert guidance. CalibRL increases policy entropy in a guided manner and clarifies the target distribution. Experiments across eight benchmarks, including both in-domain and out-of-domain settings, demonstrate consistent improvements.
arXiv Detail & Related papers (2026-02-22T07:23:36Z) - DFPO: Scaling Value Modeling via Distributional Flow towards Robust and Generalizable LLM Post-Training [94.568675548967]
Training reinforcement learning (RL) systems in real-world environments remains challenging due to noisy supervision and poor out-of-domain generalization. Recent distributional RL methods improve robustness by modeling values with multiple quantile points, but they still learn each quantile independently as a scalar. We propose DFPO, a robust distributional RL framework that models values as continuous flows across time steps.
arXiv Detail & Related papers (2026-02-05T17:07:42Z) - Boosting Fidelity for Pre-Trained-Diffusion-Based Low-Light Image Enhancement via Condition Refinement [63.54516423266521]
Pre-Trained Diffusion-Based (PTDB) methods often sacrifice content fidelity to attain higher perceptual realism. We propose a novel optimization strategy for conditioning in pre-trained diffusion models, enhancing fidelity while preserving realism and aesthetics. Our approach is plug-and-play, seamlessly integrating into existing diffusion networks to provide more effective control.
arXiv Detail & Related papers (2025-10-20T02:40:06Z) - Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning [56.47948583452555]
We introduce the Stepwise Flow Policy (SWFP) framework, founded on the key insight that discretizing the flow matching inference process via a fixed-step Euler scheme aligns it with the variational Jordan-Kinderlehrer-Otto principle from optimal transport. SWFP decomposes the global flow into a sequence of small, incremental transformations between proximate distributions. This decomposition yields an efficient algorithm that fine-tunes pre-trained flows via a cascade of small flow blocks, offering significant advantages.
arXiv Detail & Related papers (2025-10-17T07:43:51Z) - Efficient Regression-Based Training of Normalizing Flows for Boltzmann Generators [85.25962679349551]
Boltzmann Generators (BGs) offer efficient sampling and likelihoods, but their training via maximum likelihood is often unstable and computationally challenging. We propose Regression Training of Normalizing Flows (RegFlow), a novel and scalable regression-based training objective that bypasses the numerical instability and computational challenge of conventional maximum likelihood training.
arXiv Detail & Related papers (2025-06-01T20:32:27Z) - Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime [6.619254876970774]
We study the effectiveness of Reinforcement Learning (RL) for reducing convective heat transfer under increasing turbulence. RL agents trained via single-agent Proximal Policy Optimization (PPO) are compared to linear proportional derivative (PD) controllers. The RL agents reduced convection, measured by the Nusselt Number, by up to 33% in moderately turbulent systems and 10% in highly turbulent settings.
arXiv Detail & Related papers (2025-04-16T11:51:59Z) - Invariant Control Strategies for Active Flow Control using Graph Neural Networks [0.0]
We introduce graph neural networks (GNNs) as a promising architecture for Reinforcement Learning (RL)-based flow control. GNNs process unstructured, three-dimensional flow data, preserving spatial relationships without the constraints of a Cartesian grid. We show that GNN-based control policies achieve comparable performance to existing methods while benefiting from improved generalization properties.
arXiv Detail & Related papers (2025-03-28T09:33:40Z) - Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization [14.320131946691268]
We propose an easy-to-use and theoretically sound fine-tuning method for flow-based generative models. By introducing an online reward-weighting mechanism, our approach guides the model to prioritize high-reward regions in the data manifold. Our method achieves optimal policy convergence while allowing controllable trade-offs between reward and diversity.
arXiv Detail & Related papers (2025-02-09T22:45:15Z) - FlowIE: Efficient Image Enhancement via Rectified Flow [71.6345505427213]
FlowIE is a flow-based framework that estimates straight-line paths from an elementary distribution to high-quality images.
Our contributions are rigorously validated through comprehensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-06-01T17:29:29Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Improving and generalizing flow-based generative models with minibatch optimal transport [90.01613198337833]
We introduce the generalized conditional flow matching (CFM) technique for continuous normalizing flows (CNFs).
CFM features a stable regression objective like that used to train the flow in diffusion models but enjoys the efficient inference of deterministic flow models.
A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference.
arXiv Detail & Related papers (2023-02-01T14:47:17Z)
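The CFM summary above describes a stable regression objective; its core can be sketched in a few lines. This is a generic illustration of the straight-line (rectified) probability path, not the paper's exact formulation, and the helper name is an assumption:

```python
import random

def cfm_training_pair(x0, x1):
    """For a straight-line probability path x_t = (1 - t) x0 + t x1,
    conditional flow matching regresses a learned velocity field
    v(x_t, t) onto the constant target u = x1 - x0. This returns one
    (t, x_t, u) training triple for a prior sample x0 and data sample x1."""
    t = random.random()
    xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    u = [b - a for a, b in zip(x0, x1)]
    return t, xt, u

# One training triple: the regression target u is independent of t,
# which is what makes the objective a simple, stable least-squares fit.
t, xt, u = cfm_training_pair([0.0, 0.0], [2.0, 4.0])
print(t, xt, u)
```

The OT-CFM variant mentioned in the summary changes only how (x0, x1) pairs are matched (via minibatch optimal transport) before forming these triples; the regression itself is unchanged.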
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.