Model-free Representation Learning and Exploration in Low-rank MDPs
- URL: http://arxiv.org/abs/2102.07035v1
- Date: Sun, 14 Feb 2021 00:06:54 GMT
- Title: Model-free Representation Learning and Exploration in Low-rank MDPs
- Authors: Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh
Agarwal
- Abstract summary: We present the first model-free representation learning algorithms for low rank MDPs.
Key algorithmic contribution is a new minimax representation learning objective.
Result can accommodate general function approximation to scale to complex environments.
- Score: 64.72023662543363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The low rank MDP has emerged as an important model for studying
representation learning and exploration in reinforcement learning. With a known
representation, several model-free exploration strategies exist. In contrast,
all algorithms for the unknown representation setting are model-based, thereby
requiring the ability to model the full dynamics. In this work, we present the
first model-free representation learning algorithms for low rank MDPs. The key
algorithmic contribution is a new minimax representation learning objective,
for which we provide variants with differing tradeoffs in their statistical and
computational properties. We interleave this representation learning step with
an exploration strategy to cover the state space in a reward-free manner. The
resulting algorithms are provably sample efficient and can accommodate general
function approximation to scale to complex environments.
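The low-rank MDP model the abstract builds on assumes the transition kernel factorizes through a small latent dimension: P(s' | s, a) = ⟨φ(s, a), μ(s')⟩ for rank-d feature maps φ and μ. A minimal toy sketch of this factorization (all sizes, the random features, and variable names here are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical toy sketch of the low-rank MDP assumption: the transition
# kernel factorizes as P(s' | s, a) = <phi(s, a), mu(s')> for some rank-d
# feature maps phi and mu. Sizes and random features are illustrative only.
rng = np.random.default_rng(0)
n_states, n_actions, d = 6, 3, 2

# phi(s, a): convex weights over d latent factors
phi = rng.random((n_states, n_actions, d))
phi /= phi.sum(axis=2, keepdims=True)

# mu: each latent factor is a distribution over next states
mu = rng.random((d, n_states))
mu /= mu.sum(axis=1, keepdims=True)

# Induced transition tensor P[s, a, s'] = sum_k phi[s, a, k] * mu[k, s']
P = np.einsum('sak,kt->sat', phi, mu)

# Each P[s, a, :] is a valid distribution because the phi weights are convex
assert np.allclose(P.sum(axis=2), 1.0)
```

With a known φ this structure reduces exploration to linear-MDP techniques; the representation learning problem the paper studies is recovering such a φ from data when it is unknown.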
Related papers
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs)
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU)
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z)
- PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with polynomial sample complexity.
arXiv Detail & Related papers (2022-07-12T17:57:17Z)
- Graph-based State Representation for Deep Reinforcement Learning [1.5901689240516976]
We exploit the fact that the underlying Markov decision process (MDP) represents a graph, which enables us to incorporate the topological information for effective state representation learning.
Motivated by the recent success of node representations for several graph analytical tasks we specifically investigate the capability of node representation learning methods to effectively encode the topology of the underlying MDP in Deep RL.
We find that all embedding methods outperform the commonly used matrix representation of grid-world environments in all of the studied cases.
arXiv Detail & Related papers (2020-04-29T05:43:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.