A reinforcement learning strategy for p-adaptation in high order solvers
- URL: http://arxiv.org/abs/2306.08292v1
- Date: Wed, 14 Jun 2023 07:01:31 GMT
- Title: A reinforcement learning strategy for p-adaptation in high order solvers
- Authors: David Huergo, Gonzalo Rubio, Esteban Ferrer
- Abstract summary: Reinforcement learning (RL) has emerged as a promising approach to automating decision processes.
This paper explores the application of RL techniques to optimise the order in the computational mesh when using high-order solvers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has emerged as a promising approach to automating
decision processes. This paper explores the application of RL techniques to
optimise the polynomial order in the computational mesh when using high-order
solvers. Mesh adaptation plays a crucial role in improving the efficiency of
numerical simulations by improving accuracy while reducing the cost. Here,
actor-critic RL models based on Proximal Policy Optimization offer a
data-driven approach for agents to learn optimal mesh modifications based on
evolving conditions.
The paper provides a strategy for p-adaptation in high-order solvers and
includes insights into the main aspects of RL-based mesh adaptation, including
the formulation of appropriate reward structures and the interaction between
the RL agent and the simulation environment. We discuss the impact of RL-based
mesh p-adaptation on computational efficiency and accuracy. We test the RL
p-adaptation strategy on a 1D inviscid Burgers' equation to demonstrate the
effectiveness of the strategy. The RL strategy reduces the computational cost
and improves accuracy over uniform adaptation, while minimising human
intervention.
Related papers
- Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System [0.7499722271664147]
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a Quanser Aero 2 system.
PPO excels in rise-time and adaptability, making it a promising approach for applications requiring rapid response and adaptability.
arXiv Detail & Related papers (2024-08-28T08:35:34Z) - Reinforcement learning for anisotropic p-adaptation and error estimation in high-order solvers [0.37109226820205005]
We present a novel approach to automate and optimize anisotropic p-adaptation in high-order h/p using Reinforcement Learning (RL)
We develop an offline training approach, decoupled from the main solver, which shows minimal overcost when performing simulations.
We derive an inexpensive RL-based error estimation approach that enables the quantification of local discretization errors.
arXiv Detail & Related papers (2024-07-26T17:55:23Z) - Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization [55.97310586039358]
Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality.
We propose a novel model-free diffusion-based online RL algorithm, Q-weighted Variational Policy Optimization (QVPO)
Specifically, we introduce the Q-weighted variational loss, which can be proved to be a tight lower bound of the policy objective in online RL under certain conditions.
We also develop an efficient behavior policy to enhance sample efficiency by reducing the variance of the diffusion policy during online interactions.
arXiv Detail & Related papers (2024-05-25T10:45:46Z) - Back to Basics: Revisiting REINFORCE Style Optimization for Learning
from Human Feedback in LLMs [29.505270680223003]
AI alignment in the shape of Reinforcement Learning from Human Feedback is increasingly treated as a crucial ingredient for high performance large language models.
Proximal Policy Optimization (PPO) has been positioned by recent literature as the canonical method for the RL part of RLHF.
We show that many components of PPO are unnecessary in an RLHF context and that simpler REINFORCE-style optimization variants outperform both PPO and newly proposed "RL-free" methods such as DPO and RAFT.
arXiv Detail & Related papers (2024-02-22T17:52:34Z) - Data-Efficient Task Generalization via Probabilistic Model-based Meta
Reinforcement Learning [58.575939354953526]
PACOH-RL is a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics.
Existing Meta-RL methods require abundant meta-learning data, limiting their applicability in settings such as robotics.
Our experiment results demonstrate that PACOH-RL outperforms model-based RL and model-based Meta-RL baselines in adapting to new dynamic conditions.
arXiv Detail & Related papers (2023-11-13T18:51:57Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in
Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs)
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - A Neuromorphic Architecture for Reinforcement Learning from Real-Valued
Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
We propose a framework named AutoMBPO to automatically schedule the real data ratio.
In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
arXiv Detail & Related papers (2021-11-16T15:24:59Z) - Combining Pessimism with Optimism for Robust and Efficient Model-Based
Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.