Efficient Reinforcement Learning in Block MDPs: A Model-free
Representation Learning Approach
- URL: http://arxiv.org/abs/2202.00063v2
- Date: Wed, 2 Feb 2022 19:21:15 GMT
- Title: Efficient Reinforcement Learning in Block MDPs: A Model-free
Representation Learning Approach
- Authors: Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh
Agarwal, Wen Sun
- Abstract summary: We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics.
BRIEE interleaves latent state discovery, exploration, and exploitation, and can provably learn a near-optimal policy.
We show that BRIEE is more sample efficient than the state-of-the-art Block MDP algorithm HOMER and other empirical RL baselines on challenging rich-observation combination lock problems.
- Score: 73.62265030773652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present BRIEE (Block-structured Representation learning with Interleaved
Explore Exploit), an algorithm for efficient reinforcement learning in Markov
Decision Processes with block-structured dynamics (i.e., Block MDPs), where
rich observations are generated from a set of unknown latent states. BRIEE
interleaves latent state discovery, exploration, and exploitation,
and can provably learn a near-optimal policy with sample complexity scaling
polynomially in the number of latent states, actions, and the time horizon,
with no dependence on the size of the potentially infinite observation space.
Empirically, we show that BRIEE is more sample efficient than the state-of-the-art
Block MDP algorithm HOMER and other empirical RL baselines on challenging
rich-observation combination lock problems that require deep exploration.
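To make the benchmark concrete, below is a minimal sketch of a rich-observation combination lock environment, assuming a common construction (three latent states per step, two of them "live", one unknown correct action per live state, reward only for surviving all steps, and noisy random linear embeddings as observations); the specific sizes and the embedding scheme are illustrative choices, not the paper's exact setup.

```python
# Sketch of a rich-observation combination lock under the assumptions above.
import numpy as np

class CombinationLock:
    def __init__(self, horizon=10, num_actions=10, obs_dim=32, noise=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.H, self.A = horizon, num_actions
        # correct action for each (step, live latent state); unknown to the agent
        self.codes = self.rng.integers(0, num_actions, size=(horizon, 2))
        # random linear embedding mapping (step, latent state) to rich observations
        self.embed = self.rng.standard_normal((horizon + 1, 3, obs_dim))
        self.noise = noise

    def _obs(self):
        return self.embed[self.h, self.s] + self.noise * self.rng.standard_normal(
            self.embed.shape[-1])

    def reset(self):
        self.h, self.s = 0, 0                  # start in live state 0
        return self._obs()

    def step(self, action):
        if self.s < 2 and action == self.codes[self.h, self.s]:
            self.s = self.rng.integers(0, 2)   # move to a random live state
        else:
            self.s = 2                         # dead state: no way back
        self.h += 1
        done = self.h == self.H
        reward = float(done and self.s < 2)    # reward only for surviving all H steps
        return self._obs(), reward, done
```

In this construction any wrong action sends the agent to the dead state, so undirected exploration reaches the reward with probability roughly A^{-H}; this is why the abstract describes the problem as requiring deep exploration.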
Related papers
- Block Sparse Bayesian Learning: A Diversified Scheme [16.61484758008309]
We introduce a novel prior called Diversified Block Sparse Prior to characterize the widespread block sparsity phenomenon in real-world data.
By allowing diversification on intra-block variance and inter-block correlation matrices, we effectively address the sensitivity issue of existing block sparse learning methods to pre-defined block information.
arXiv Detail & Related papers (2024-02-07T08:18:06Z) - Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic
Manipulation Tasks [12.27904219271791]
Current reinforcement learning algorithms struggle in sparse and complex environments.
We propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework.
arXiv Detail & Related papers (2023-09-28T11:14:52Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition, and the deviation of real-world dynamics from the training environment at deployment time.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
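As one concrete illustration of the distributionally robust backup, the worst case over a KL uncertainty set admits a well-known one-dimensional dual. The sketch below computes that robust expectation numerically; it is not the paper's GP-based algorithm, and the value vector, nominal distribution, and radius are placeholders.

```python
# Worst-case expected next-step value over all P with KL(P || P0) <= rho,
# via the scalar dual  sup_{a>0} [ -a * log E_P0[exp(-V/a)] - a * rho ].
import numpy as np

def kl_robust_expectation(values, probs, rho, grid=np.logspace(-3, 3, 400)):
    values = np.asarray(values, dtype=float)
    best = values.min()                       # limit of the dual as a -> 0
    for a in grid:
        # log E_P0[exp(-V/a)], computed stably by shifting with the max exponent
        z = -values / a
        logmgf = z.max() + np.log(np.dot(probs, np.exp(z - z.max())))
        best = max(best, -a * logmgf - a * rho)
    return best

# Example: the nominal model puts most mass on the high-value state, but an
# adversary inside the KL ball can shift mass toward the low-value one.
print(kl_robust_expectation([1.0, 0.0], [0.9, 0.1], rho=0.1))
```

The chi-square and total-variation sets mentioned in the summary have analogous scalar reformulations, so only this inner computation changes with the choice of uncertainty set.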
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
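The summary does not spell out the UCB construction; a common way to implement optimism on top of learned features is an elliptical bonus, sketched below with placeholder features (the kernel embeddings of the latent variable model are not modeled here).

```python
# Elliptical UCB bonus  beta * sqrt(phi^T Sigma^{-1} phi)  over a learned feature map.
import numpy as np

def ucb_bonus(phi, past_features, beta=1.0, reg=1.0):
    d = phi.shape[0]
    # regularized empirical covariance of previously visited feature vectors
    sigma = reg * np.eye(d) + past_features.T @ past_features
    return beta * np.sqrt(phi @ np.linalg.solve(sigma, phi))

past = np.zeros((100, 8))
past[:, 0] = 1.0                      # all visited features lie along the first axis
e_seen, e_new = np.eye(8)[0], np.eye(8)[1]
print(ucb_bonus(e_seen, past), ucb_bonus(e_new, past))   # ~0.1 vs 1.0
```

The bonus shrinks in directions the data already covers and stays large in unexplored directions, which is the optimistic exploration behavior the summary refers to.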
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Provable RL with Exogenous Distractors via Multistep Inverse Dynamics [85.52408288789164]
Real-world applications of reinforcement learning (RL) require the agent to deal with high-dimensional observations such as those generated from a megapixel camera.
Prior work has addressed such problems with representation learning, through which the agent can provably extract endogenous, latent state information from raw observations.
However, such approaches can fail in the presence of temporally correlated noise in the observations.
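Below is a minimal sketch of a multistep inverse dynamics objective, assuming the common formulation in which an encoder is trained so that the first action is predictable from the current observation and a later one; the architecture, shapes, and training loop are illustrative, not the paper's exact method.

```python
# Multistep inverse dynamics sketch: predict a_t from encodings of x_t and x_{t+k}.
import torch
import torch.nn as nn

obs_dim, latent_dim, num_actions = 64, 16, 4
encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
action_head = nn.Linear(2 * latent_dim, num_actions)   # classifies a_t from (z_t, z_{t+k})
opt = torch.optim.Adam(list(encoder.parameters()) + list(action_head.parameters()), lr=1e-3)

def multistep_inverse_loss(x_t, x_tk, a_t):
    z_t, z_tk = encoder(x_t), encoder(x_tk)
    logits = action_head(torch.cat([z_t, z_tk], dim=-1))
    return nn.functional.cross_entropy(logits, a_t)

# One gradient step on a placeholder batch of (x_t, x_{t+k}, a_t) triples.
x_t, x_tk = torch.randn(32, obs_dim), torch.randn(32, obs_dim)
a_t = torch.randint(0, num_actions, (32,))
loss = multistep_inverse_loss(x_t, x_tk, a_t)
opt.zero_grad(); loss.backward(); opt.step()
```

Training this objective over several gaps k, rather than only adjacent observations, is what distinguishes the multistep version from one-step inverse dynamics, which the summary notes can be fooled by temporally correlated noise.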
arXiv Detail & Related papers (2021-10-17T15:21:27Z) - Sample-Efficient Reinforcement Learning of Undercomplete POMDPs [91.40308354344505]
This work shows that these hardness barriers do not preclude efficient reinforcement learning for rich and interesting subclasses of Partially Observable Markov Decision Processes (POMDPs).
We present a sample-efficient algorithm, OOM-UCB, for episodic finite undercomplete POMDPs, where the number of observations is larger than the number of latent states and where exploration is essential for learning, thus distinguishing our results from prior works.
arXiv Detail & Related papers (2020-06-22T17:58:54Z) - Provably Efficient Exploration for Reinforcement Learning Using
Unsupervised Learning [96.78504087416654]
Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems, we investigate when this paradigm is provably efficient.
We present a general algorithmic framework built upon two components: an unsupervised learning algorithm and a no-regret tabular RL algorithm.
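A minimal sketch of this two-component recipe follows, with k-means standing in for the unsupervised learning algorithm and one-step Q-learning standing in for the no-regret tabular RL component; both choices and all data are placeholders rather than the paper's framework.

```python
# Unsupervised decoder (observations -> discrete states) composed with tabular RL.
import numpy as np
from sklearn.cluster import KMeans

def learn_decoder(observations, num_states):
    # clusters rich observations into a small set of discrete latent states
    return KMeans(n_clusters=num_states, n_init=10, random_state=0).fit(observations)

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # standard one-step tabular Q-learning update on decoded states
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# Usage with placeholder data: decode observations, then update a tabular Q-table.
obs = np.random.default_rng(0).standard_normal((500, 32))   # collected observations
decoder = learn_decoder(obs, num_states=5)
Q = np.zeros((5, 3))                                         # 5 decoded states, 3 actions
s, s_next = decoder.predict(obs[:1])[0], decoder.predict(obs[1:2])[0]
q_learning_update(Q, s, a=1, r=0.0, s_next=s_next)
```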
arXiv Detail & Related papers (2020-03-15T19:23:59Z)