Learning Discrete State Abstractions With Deep Variational Inference
- URL: http://arxiv.org/abs/2003.04300v3
- Date: Mon, 11 Jan 2021 18:06:21 GMT
- Title: Learning Discrete State Abstractions With Deep Variational Inference
- Authors: Ondrej Biza, Robert Platt, Jan-Willem van de Meent, and Lawson L. S. Wong
- Abstract summary: We propose a method for learning approximate bisimulations, a type of state abstraction.
We use a deep neural encoder to map states onto continuous embeddings.
We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model.
- Score: 7.273663549650618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstraction is crucial for effective sequential decision making in domains
with large state spaces. In this work, we propose an information bottleneck
method for learning approximate bisimulations, a type of state abstraction. We
use a deep neural encoder to map states onto continuous embeddings. We map
these embeddings onto a discrete representation using an action-conditioned
hidden Markov model, which is trained end-to-end with the neural network. Our
method is suited for environments with high-dimensional states and learns from
a stream of experience collected by an agent acting in a Markov decision
process. Through this learned discrete abstract model, we can efficiently plan
for unseen goals in a multi-goal Reinforcement Learning setting. We test our
method in simplified robotic manipulation domains with image states. We also
compare it against previous model-based approaches to finding bisimulations in
discrete grid-world-like environments. Source code is available at
https://github.com/ondrejba/discrete_abstractions.
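To make the pipeline concrete, below is a minimal sketch of how a neural encoder and an action-conditioned HMM might be combined and trained end-to-end. The MLP encoder, unit-variance Gaussian emissions, and the one-step marginal-likelihood objective are illustrative assumptions, not the authors' exact implementation; see the linked repository for the real code.

```python
import torch
import torch.nn as nn

class DiscreteAbstraction(nn.Module):
    """Illustrative sketch: a neural encoder feeds an action-conditioned
    HMM whose hidden states play the role of discrete abstract states."""

    def __init__(self, obs_dim, emb_dim, num_abstract, num_actions):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))
        # Gaussian emission means, one per abstract state (assumed unit variance).
        self.means = nn.Parameter(torch.randn(num_abstract, emb_dim))
        # Action-conditioned transition logits T[a, i, j].
        self.trans_logits = nn.Parameter(
            torch.zeros(num_actions, num_abstract, num_abstract))

    def emission_log_probs(self, obs):
        z = self.encoder(obs)                              # (B, emb_dim)
        # Unnormalized log-density of each embedding under each Gaussian.
        return -0.5 * torch.cdist(z, self.means) ** 2      # (B, K)

    def loss(self, obs, actions, next_obs):
        # Negative log-likelihood of one-step transitions, marginalizing over
        # abstract-state pairs (i -> j); a uniform prior over i is assumed.
        log_b1 = self.emission_log_probs(obs)                        # (B, K)
        log_b2 = self.emission_log_probs(next_obs)                   # (B, K)
        log_T = torch.log_softmax(self.trans_logits, -1)[actions]    # (B, K, K)
        joint = log_b1.unsqueeze(2) + log_T + log_b2.unsqueeze(1)
        return -torch.logsumexp(joint.flatten(1), dim=1).mean()
```

A training loop would minimize `loss(s, a, s_next)` over batches of experience; afterwards, each state can be assigned to its most probable abstract state via `emission_log_probs`, yielding the discrete model used for planning.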
Related papers
- Linking in Style: Understanding learned features in deep learning models [0.0]
Convolutional neural networks (CNNs) learn abstract features to perform object classification.
We propose an automatic method to visualize and systematically analyze learned features in CNNs.
arXiv Detail & Related papers (2024-09-25T12:28:48Z)
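For reference, learned CNN features are commonly inspected by capturing a layer's activations with a forward hook, as in the generic sketch below; this is not the paper's specific pipeline, and `model`, `layer`, and `image` are placeholders.

```python
import torch

def feature_maps(model, layer, image):
    """Capture a layer's activations for visualization using a forward hook
    (a generic technique, not the paper's specific analysis method)."""
    captured = {}
    handle = layer.register_forward_hook(
        lambda module, inputs, output: captured.update(act=output.detach()))
    with torch.no_grad():
        model(image.unsqueeze(0))   # add a batch dimension and run forward
    handle.remove()
    return captured["act"]          # (1, C, H, W) feature maps, one per channel
```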
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically, by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
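As a toy illustration of what an object-centric mapping might contain, consider the data structure below; the item names and attributes are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    """One element of an object-centric state: an item plus its attributes."""
    name: str
    attributes: frozenset  # e.g. frozenset({("color", "gold")})

# A hypothetical abstract state: the agent models items and attributes
# at a higher level of abstraction than raw pixels.
state = frozenset({
    Item("key", frozenset({("color", "gold"), ("held", False)})),
    Item("door", frozenset({("locked", True)})),
})
```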
- Ideal Abstractions for Decision-Focused Learning [108.15241246054515]
We propose a method that configures the output space automatically in order to minimize the loss of decision-relevant information.
We demonstrate the method in two domains: data acquisition for deep neural network training and a closed-loop wildfire management task.
arXiv Detail & Related papers (2023-03-29T23:31:32Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
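A minimal version of such an occupancy pretext objective might look like the sketch below; `backbone` and `occ_head` are hypothetical modules, and ALSO's actual query sampling and reconstruction loss may differ.

```python
import torch.nn.functional as F

def occupancy_pretext_loss(backbone, occ_head, points, query_xyz, occ_labels):
    """Sketch of an occupancy-estimation pretext task: encode a sparse point
    cloud, then predict whether query locations lie on the underlying surface."""
    scene_code = backbone(points)              # encode the input point cloud
    logits = occ_head(scene_code, query_xyz)   # (B, Q) occupancy logits
    return F.binary_cross_entropy_with_logits(logits, occ_labels.float())
```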
- Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
VCR imagines the next state with a learned dynamics model; instead of aligning this imagined state with the real state returned by the environment, VCR applies a $Q$-value head to both states, obtains two distributions of action values, and aligns these instead (see the sketch below).
Our method achieves new state-of-the-art performance among search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
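The value-consistency step can be sketched as below; treating Q-values as a softmax distribution and using a KL loss are illustrative assumptions, and `encoder`, `dynamics`, and `q_head` are hypothetical modules.

```python
import torch.nn.functional as F

def vcr_loss(encoder, dynamics, q_head, obs, action, next_obs):
    """Sketch of a value-consistency objective: compare action-value
    distributions computed on an imagined and a real next state."""
    h_imagined = dynamics(encoder(obs), action)  # imagined next latent state
    h_real = encoder(next_obs)                   # latent of the real next state
    p_imagined = F.log_softmax(q_head(h_imagined), dim=-1)
    p_real = F.softmax(q_head(h_real), dim=-1).detach()  # stop-gradient target
    return F.kl_div(p_imagined, p_real, reduction="batchmean")
```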
- Discrete State-Action Abstraction via the Successor Representation [3.453310639983932]
Abstraction is one approach that provides the agent with an intrinsic reward for transitioning in a latent space.
Our approach is the first to automatically learn a discrete abstraction of the underlying environment.
Our proposed algorithm, Discrete State-Action Abstraction (DSAA), iteratively alternates between training a set of options (temporally extended actions) and using them to efficiently explore more of the environment.
arXiv Detail & Related papers (2022-06-07T17:37:30Z)
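For context, the successor representation underlying DSAA can be learned with the standard tabular TD update below; this is the textbook estimator, not necessarily the paper's exact procedure.

```python
import numpy as np

def sr_td_update(psi, s, s_next, alpha=0.1, gamma=0.99):
    """One tabular TD update of the successor representation psi, where
    psi[s, s2] estimates the expected discounted visits to s2 from s."""
    indicator = np.eye(psi.shape[0])[s]   # one-hot vector for the current state
    psi[s] += alpha * (indicator + gamma * psi[s_next] - psi[s])
    return psi
```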
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
ACDNet automatically learns a prior for artifact-free CT images from training data and adaptively adjusts its representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
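One way to read "adaptively adjusts its representation kernels" is the sketch below, where per-image mixing weights combine a shared kernel dictionary; the actual ACDNet architecture differs in its details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDictConv(nn.Module):
    """Sketch of an adaptive convolutional dictionary layer: per-image
    weights mix a shared dictionary of kernels (illustrative design)."""

    def __init__(self, channels, dict_size, kernel_size=3):
        super().__init__()
        self.pad = kernel_size // 2
        self.dictionary = nn.Parameter(torch.randn(
            dict_size, channels, channels, kernel_size, kernel_size))
        self.weight_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, dict_size), nn.Softmax(dim=-1))

    def forward(self, x):
        w = self.weight_net(x)   # (B, D) per-image mixing weights
        # Per-sample kernels as weighted sums over the shared dictionary.
        kernels = torch.einsum("bd,doikl->boikl", w, self.dictionary)
        outs = [F.conv2d(xi.unsqueeze(0), ki, padding=self.pad)
                for xi, ki in zip(x, kernels)]
        return torch.cat(outs, dim=0)
```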
- State Representation Learning for Goal-Conditioned Reinforcement Learning [9.162936410696407]
This paper presents a novel state representation for reward-free Markov decision processes.
The idea is to learn, in a self-supervised manner, an embedding space in which distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them.
We show how this representation can be leveraged to learn goal-conditioned policies.
arXiv Detail & Related papers (2022-05-04T09:20:09Z)
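A self-supervised loss matching that description could be as simple as the regression sketch below; the paper's actual objective and pair-sampling scheme may differ, and `encoder` is a placeholder network.

```python
import torch
import torch.nn.functional as F

def action_distance_loss(encoder, states_a, states_b, num_actions_between):
    """Sketch: fit an embedding whose Euclidean distances regress onto the
    minimum number of actions separating two states along trajectories."""
    z_a, z_b = encoder(states_a), encoder(states_b)
    dist = torch.norm(z_a - z_b, dim=-1)
    return F.mse_loss(dist, num_actions_between.float())
```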
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN-based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
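An RBF unit of the kind used in these branches can be sketched as below, assuming a Gaussian kernel with learned centers and widths.

```python
import torch
import torch.nn as nn

class RBFUnit(nn.Module):
    """Gaussian radial basis function layer: activation depends on the
    distance between the input and learned centers (illustrative sketch)."""

    def __init__(self, in_features, num_centers):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_centers, in_features))
        self.log_gamma = nn.Parameter(torch.zeros(num_centers))  # per-center width

    def forward(self, x):                              # x: (B, in_features)
        sq_dist = torch.cdist(x, self.centers) ** 2    # (B, num_centers)
        return torch.exp(-self.log_gamma.exp() * sq_dist)
```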
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches, by contrast, can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
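One simple reading of such a combination is an uncertainty-gated switch between the two controllers, as in the schematic below; this is an interpretation for illustration, not GUAPO's actual formulation, and all functions are placeholders.

```python
def hybrid_action(obs, pose_uncertainty, model_based_policy, learned_policy,
                  threshold=0.1):
    """Schematic uncertainty-gated hybrid controller: trust the model-based
    plan where perception is confident, defer to the learned policy elsewhere."""
    if pose_uncertainty(obs) < threshold:
        return model_based_policy(obs)
    return learned_policy(obs)
```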
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.