Multifidelity Reinforcement Learning with Control Variates
- URL: http://arxiv.org/abs/2206.05165v1
- Date: Fri, 10 Jun 2022 15:01:37 GMT
- Title: Multifidelity Reinforcement Learning with Control Variates
- Authors: Sami Khairy, Prasanna Balaprakash
- Abstract summary: In many computational science and engineering applications, the output of a system of interest corresponding to a given input can be queried at different levels of fidelity with different costs.
We study the reinforcement learning problem in the presence of multiple environments with different levels of fidelity for a given control task.
A multifidelity estimator that exploits the cross-correlations between the low- and high-fidelity returns is proposed to reduce the variance in the estimation of the state-action value function.
- Score: 3.2895195535353317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many computational science and engineering applications, the output of a
system of interest corresponding to a given input can be queried at different
levels of fidelity with different costs. Typically, low-fidelity data is cheap
and abundant, while high-fidelity data is expensive and scarce. In this work we
study the reinforcement learning (RL) problem in the presence of multiple
environments with different levels of fidelity for a given control task. We
focus on improving the RL agent's performance with multifidelity data.
Specifically, a multifidelity estimator that exploits the cross-correlations
between the low- and high-fidelity returns is proposed to reduce the variance
in the estimation of the state-action value function. The proposed estimator,
which is based on the method of control variates, is used to design a
multifidelity Monte Carlo RL (MFMCRL) algorithm that improves the learning of
the agent in the high-fidelity environment. The impacts of variance reduction
on policy evaluation and policy improvement are theoretically analyzed by using
probability bounds. Our theoretical analysis and numerical experiments
demonstrate that for a finite budget of high-fidelity data samples, our
proposed MFMCRL agent attains superior performance compared with that of a
standard RL agent that uses only the high-fidelity environment data for
learning the optimal policy.
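The core idea can be illustrated with a short sketch. The snippet below is a minimal, hedged example of a control-variate combination of high- and low-fidelity Monte Carlo returns for a single state-action pair; the function name, array interface, and coefficient formula follow the standard control-variates construction and are illustrative assumptions rather than the authors' exact MFMCRL implementation.

```python
import numpy as np

def mf_q_estimate(hf_returns, lf_returns_paired, lf_returns_extra):
    """Control-variate multifidelity estimate of Q(s, a) (illustrative sketch).

    hf_returns:        high-fidelity Monte Carlo returns for (s, a) (expensive, few)
    lf_returns_paired: low-fidelity returns collected alongside the high-fidelity ones
    lf_returns_extra:  additional cheap low-fidelity returns used to pin down
                       the low-fidelity mean
    """
    hf = np.asarray(hf_returns, dtype=float)
    lf = np.asarray(lf_returns_paired, dtype=float)
    lf_all = np.concatenate([lf, np.asarray(lf_returns_extra, dtype=float)])

    # Control-variate coefficient alpha* = Cov(G_HF, G_LF) / Var(G_LF),
    # which exploits the cross-correlation between the two fidelities.
    c = np.cov(hf, lf)
    alpha = c[0, 1] / max(c[1, 1], 1e-12)

    # Shift the high-fidelity sample mean by the (better-estimated) low-fidelity mean.
    return hf.mean() + alpha * (lf_all.mean() - lf.mean())
```

When the low- and high-fidelity returns are strongly correlated, the correction term cancels much of the high-fidelity sampling noise, which is the variance-reduction effect described in the abstract; with zero correlation the estimator reduces to the plain high-fidelity sample mean.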
Related papers
- Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization [55.97310586039358]
Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality.
We propose a novel model-free diffusion-based online RL algorithm, Q-weighted Variational Policy Optimization (QVPO).
Specifically, we introduce the Q-weighted variational loss, which can be proved to be a tight lower bound of the policy objective in online RL under certain conditions.
We also develop an efficient behavior policy to enhance sample efficiency by reducing the variance of the diffusion policy during online interactions.
arXiv Detail & Related papers (2024-05-25T10:45:46Z)
- Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log(n))$.
The presented value-based RL methods include, among others, Q-learning, StochDQN, StochDDQN, all of which integrate this approach for both value-function updates and action selection.
arXiv Detail & Related papers (2024-05-16T17:58:44Z)
- Multifidelity linear regression for scientific machine learning from scarce data [0.0]
We propose a new multifidelity training approach for scientific machine learning via linear regression.
We provide a bias and variance analysis of our new estimators that guarantees the approach's accuracy and improved robustness to scarce high-fidelity data.
arXiv Detail & Related papers (2024-03-13T15:40:17Z)
- Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling [19.60087366873302]
Multi-fidelity surrogate modeling aims to learn an accurate surrogate at the highest fidelity level.
Deep learning approaches utilize neural network-based encoders and decoders to improve scalability.
We propose Multi-fidelity Residual Neural Processes (MFRNP), a novel multi-fidelity surrogate modeling framework.
arXiv Detail & Related papers (2024-02-29T04:40:25Z)
- Multi-fidelity reinforcement learning framework for shape optimization [0.8258451067861933]
We introduce a controlled transfer learning framework that leverages a multi-fidelity simulation setting.
Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers.
Our results demonstrate this framework's applicability to other scientific DRL scenarios.
arXiv Detail & Related papers (2022-02-22T20:44:04Z)
- Distributional Reinforcement Learning for Multi-Dimensional Reward Functions [91.88969237680669]
We introduce Multi-Dimensional Distributional DQN (MD3QN) to model the joint return distribution from multiple reward sources.
As a by-product of joint distribution modeling, MD3QN can capture the randomness in returns for each source of reward.
In experiments, our method accurately models the joint return distribution in environments with richly correlated reward functions.
arXiv Detail & Related papers (2021-10-26T11:24:23Z)
- Balancing Value Underestimation and Overestimation with Realistic Actor-Critic [6.205681604290727]
This paper introduces a novel model-free algorithm, Realistic Actor-Critic (RAC), which can be incorporated with any off-policy RL algorithm to improve sample efficiency.
RAC employs Universal Value Function Approximators (UVFA) to simultaneously learn a policy family with the same neural network, each with different trade-offs between underestimation and overestimation.
We evaluate RAC on the MuJoCo benchmark, achieving 10x sample efficiency and 25% performance improvement on the most challenging Humanoid environment compared to SAC.
arXiv Detail & Related papers (2021-10-19T03:35:01Z)
- Adaptive Reliability Analysis for Multi-fidelity Models using a Collective Learning Strategy [6.368679897630892]
This study presents a new approach called adaptive multi-fidelity Gaussian process for reliability analysis (AMGPRA).
It is shown that the proposed method achieves similar or higher accuracy with reduced computational costs compared to state-of-the-art single and multi-fidelity methods.
A key application of AMGPRA is high-fidelity fragility modeling using complex and costly physics-based computational models.
arXiv Detail & Related papers (2021-09-21T14:42:58Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning [83.66080019570461]
We propose two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty.
We show that these metrics have higher correlations with normalized task solvability scores than a variety of alternatives.
These metrics can also be used for fast and compute-efficient optimizations of key design parameters.
arXiv Detail & Related papers (2021-03-23T17:49:50Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.