Randomized Value Functions via Posterior State-Abstraction Sampling
- URL: http://arxiv.org/abs/2010.02383v2
- Date: Thu, 17 Jun 2021 17:33:59 GMT
- Title: Randomized Value Functions via Posterior State-Abstraction Sampling
- Authors: Dilip Arumugam and Benjamin Van Roy
- Abstract summary: We argue that an agent seeking out latent task structure must explicitly represent and maintain its uncertainty over that structure.
We introduce a practical algorithm for doing this using two posterior distributions over state abstractions and abstract-state values.
In empirically validating our approach, we find that substantial performance gains lie in the multi-task setting.
- Score: 21.931580762349096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State abstraction has been an essential tool for dramatically improving the
sample efficiency of reinforcement-learning algorithms. Indeed, by exposing and
accentuating various types of latent structure within the environment,
different classes of state abstraction have enabled improved theoretical
guarantees and empirical performance. When dealing with state abstractions that
capture structure in the value function, however, a standard assumption is that
the true abstraction has been supplied or, unrealistically, computed a priori,
leaving open the question of how to efficiently uncover such latent structure
while jointly seeking out optimal behavior. Taking inspiration from the bandit
literature, we propose that an agent seeking out latent task structure must
explicitly represent and maintain its uncertainty over that structure as part
of its overall uncertainty about the environment. We introduce a practical
algorithm for doing this using two posterior distributions over state
abstractions and abstract-state values. In empirically validating our approach,
we find that substantial performance gains lie in the multi-task setting where
tasks share a common, low-dimensional representation.
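As a rough illustrative sketch (not the authors' implementation), maintaining the two posteriors described above might look like the following: a Dirichlet-style posterior over a finite set of candidate state abstractions, and independent Gaussian posteriors over abstract-state action values. Each step samples an abstraction, samples a value table under it, and acts greedily (Thompson-sampling style). The candidate abstractions, the choice of conjugate families, and the reward-based count update are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, n_abstractions = 6, 2, 3

# Candidate state abstractions: each maps ground states -> abstract states.
abstractions = [
    np.array([0, 0, 1, 1, 2, 2]),  # 3 abstract states
    np.array([0, 0, 0, 1, 1, 1]),  # 2 abstract states
    np.array([0, 1, 2, 3, 4, 5]),  # identity (no abstraction)
]

# Posterior over abstractions: Dirichlet concentration parameters.
abstraction_counts = np.ones(n_abstractions)

# Posterior over abstract-state action values: independent Gaussians
# (mean, precision) per (abstract state, action), one table per abstraction.
q_mean = [np.zeros((phi.max() + 1, n_actions)) for phi in abstractions]
q_prec = [np.ones((phi.max() + 1, n_actions)) for phi in abstractions]

def act(state):
    """Thompson-sampling step: sample an abstraction, then sample values."""
    # 1) Sample an abstraction from the (Dirichlet) posterior.
    probs = rng.dirichlet(abstraction_counts)
    k = rng.choice(n_abstractions, p=probs)
    phi = abstractions[k]
    # 2) Sample abstract-state values from their Gaussian posteriors.
    q_sample = rng.normal(q_mean[k], 1.0 / np.sqrt(q_prec[k]))
    # 3) Act greedily with respect to the sampled value function.
    return k, int(np.argmax(q_sample[phi[state]]))

def update(k, state, action, reward):
    """Conjugate Gaussian update for the sampled abstraction's value table."""
    z = abstractions[k][state]
    prec = q_prec[k][z, action]
    q_mean[k][z, action] = (prec * q_mean[k][z, action] + reward) / (prec + 1.0)
    q_prec[k][z, action] = prec + 1.0
    abstraction_counts[k] += reward  # crude evidence signal, illustration only

k, a = act(state=0)
update(k, state=0, action=a, reward=1.0)
```

In the multi-task setting motivating the paper, the abstraction posterior would persist across tasks while the value posteriors reset, letting later tasks benefit from earlier structural evidence.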
Related papers
- Effective Reinforcement Learning Based on Structural Information Principles [19.82391136775341]
We propose a novel and general Structural Information principles-based framework for effective Decision-Making, namely SIDM.
SIDM can be flexibly incorporated into various single-agent and multi-agent RL algorithms, enhancing their performance.
arXiv Detail & Related papers (2024-04-15T13:02:00Z)
- Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework.
It automatically decomposes the given task into smaller, more manageable subtasks.
It enables sparse decision-making and focused abstractions on the relevant parts of the environment.
arXiv Detail & Related papers (2023-09-30T02:25:18Z)
- Hierarchical State Abstraction Based on Structural Information Principles [70.24495170921075]
We propose a novel mathematical Structural Information principles-based State Abstraction framework, namely SISA, from the information-theoretic perspective.
SISA is a general framework that can be flexibly integrated with different representation-learning objectives to improve their performances further.
arXiv Detail & Related papers (2023-04-24T11:06:52Z)
- Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
- Learning Dynamic Abstract Representations for Sample-Efficient Reinforcement Learning [22.25237742815589]
In many real-world problems, the learning agent needs to learn a problem's abstractions and solution simultaneously.
This paper presents a novel top-down approach for constructing state abstractions while carrying out reinforcement learning.
arXiv Detail & Related papers (2022-10-04T23:05:43Z)
- Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy.
A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings.
An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
arXiv Detail & Related papers (2022-08-19T19:01:30Z)
- Causal Dynamics Learning for Task-Independent State Abstraction [61.707048209272884]
We introduce Causal Dynamics Learning for Task-Independent State Abstraction (CDL).
CDL learns a theoretically grounded causal dynamics model that removes unnecessary dependencies between state variables and the action.
A state abstraction can then be derived from the learned dynamics.
arXiv Detail & Related papers (2022-06-27T17:02:53Z)
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning [120.38381203153159]
Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
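A minimal sketch of that representation idea (illustrative only; the three skill value functions below are hypothetical stand-ins, not the paper's learned skills): embed each ground state as the vector of its per-skill values.

```python
import numpy as np

# Hypothetical low-level skills, each with a learned value function V_i(s).
skill_values = [
    lambda s: float(s[0]),         # e.g. a "reach" skill's value
    lambda s: 1.0 - float(s[1]),   # e.g. a "grasp" skill's value
    lambda s: float(s[0] * s[1]),  # e.g. a "place" skill's value
]

def value_function_space(state):
    """Embed a ground state as the vector of its skill values."""
    return np.array([v(state) for v in skill_values])

# The embedding dimension equals the number of skills, giving a compact,
# skill-centric state for a higher-level policy to reason over.
z = value_function_space(np.array([0.5, 0.2]))
```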
arXiv Detail & Related papers (2021-11-04T22:46:16Z)
- Dynamic probabilistic logic models for effective abstractions in RL [35.54018388244684]
RePReL is a hierarchical framework that leverages a relational planner to provide useful state abstractions for learning.
Our experiments show that RePReL not only achieves better performance and efficient learning on the task at hand but also demonstrates better generalization to unseen tasks.
arXiv Detail & Related papers (2021-10-15T18:53:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.