State Representation Learning for Goal-Conditioned Reinforcement
Learning
- URL: http://arxiv.org/abs/2205.01965v1
- Date: Wed, 4 May 2022 09:20:09 GMT
- Title: State Representation Learning for Goal-Conditioned Reinforcement
Learning
- Authors: Lorenzo Steccanella, Anders Jonsson
- Abstract summary: This paper presents a novel state representation for reward-free Markov decision processes.
The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them.
We show how this representation can be leveraged to learn goal-conditioned policies.
- Score: 9.162936410696407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a novel state representation for reward-free Markov
decision processes. The idea is to learn, in a self-supervised manner, an
embedding space where distances between pairs of embedded states correspond to
the minimum number of actions needed to transition between them. Compared to
previous methods, our approach does not require any domain knowledge, learning
from offline and unlabeled data. We show how this representation can be
leveraged to learn goal-conditioned policies, providing a notion of similarity
between states and goals and a useful heuristic distance to guide planning and
reinforcement learning algorithms. Finally, we empirically validate our method
in classic control domains and multi-goal environments, demonstrating that our
method can successfully learn representations in large and/or continuous
domains.
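To make the core idea concrete, below is a minimal PyTorch sketch of this kind of self-supervised distance learning: an encoder is trained so that the distance between two embedded states roughly matches the number of steps separating them along offline trajectories, which upper-bounds the true minimum number of actions. This is an illustration under stated assumptions, not the authors' implementation; the network sizes, the squared-error objective, and all names (`Encoder`, `sample_pairs`, `train`) are hypothetical.

```python
# Illustrative sketch (not the paper's code): learn an embedding phi such that
# ||phi(s_i) - phi(s_j)|| approximates the number of actions separating s_i
# and s_j, using only offline, unlabeled trajectories.
import random
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def sample_pairs(trajectories, batch_size: int, max_gap: int = 30):
    """Sample (s_i, s_j, gap) with gap = j - i from random offline trajectories.

    The gap is an upper bound on the minimum number of actions between the
    two states, which is what the embedding distance is trained to reflect.
    """
    s_a, s_b, gaps = [], [], []
    for _ in range(batch_size):
        traj = random.choice(trajectories)           # a list of state vectors
        i = random.randrange(len(traj) - 1)
        j = min(len(traj) - 1, i + random.randint(1, max_gap))
        s_a.append(traj[i]); s_b.append(traj[j]); gaps.append(float(j - i))
    return (torch.tensor(s_a, dtype=torch.float32),
            torch.tensor(s_b, dtype=torch.float32),
            torch.tensor(gaps))

def train(trajectories, state_dim: int, steps: int = 5000):
    enc = Encoder(state_dim)
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    for _ in range(steps):
        s_a, s_b, gaps = sample_pairs(trajectories, batch_size=256)
        dist = torch.norm(enc(s_a) - enc(s_b), dim=-1)
        # Regress the embedding distance onto the observed step gap; since the
        # gap only upper-bounds the true shortest path, a one-sided (hinge)
        # penalty would be an equally reasonable choice here.
        loss = ((dist - gaps) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return enc
```

Once trained, a quantity such as `torch.norm(enc(state) - enc(goal))` provides the kind of state-goal similarity and heuristic distance the abstract describes, usable for reward shaping or as a heuristic for planning in goal-conditioned reinforcement learning.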
Related papers
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z) - Bootstrapped Representations in Reinforcement Learning [44.49675960752777]
In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces.
We provide a theoretical characterization of the state representation learnt by temporal difference learning.
We describe the efficacy of these representations for policy evaluation, and use our theoretical analysis to design new auxiliary learning rules.
arXiv Detail & Related papers (2023-06-16T20:14:07Z) - Learn what matters: cross-domain imitation learning with task-relevant
embeddings [77.34726150561087]
We study how an autonomous agent learns to perform a task from demonstrations in a different domain, such as a different environment or different agent.
We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrations or further domain knowledge.
arXiv Detail & Related papers (2022-09-24T21:56:58Z) - Learning Markov State Abstractions for Deep Reinforcement Learning [17.34529517221924]
We introduce a novel set of conditions and prove that they are sufficient for learning a Markov abstract state representation.
We then describe a practical training procedure that combines inverse model estimation and temporal contrastive learning.
Our approach learns representations that capture the underlying structure of the domain and lead to improved sample efficiency (a rough sketch of this training recipe appears after this list).
arXiv Detail & Related papers (2021-06-08T14:12:36Z) - MICo: Learning improved representations via sampling-based state
similarity for Markov decision processes [18.829939056796313]
We present a new behavioural distance over the state space of a Markov decision process.
We demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents.
arXiv Detail & Related papers (2021-06-03T14:24:12Z) - Cross-domain Imitation from Observations [50.669343548588294]
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there are discrepancies between the expert and agent MDPs.
We present a novel framework to learn correspondences across such domains.
arXiv Detail & Related papers (2021-05-20T21:08:25Z) - Jointly-Learned State-Action Embedding for Efficient Reinforcement
Learning [8.342863878589332]
We propose a new approach for learning embeddings for states and actions that combines aspects of model-free and model-based reinforcement learning.
We show that our approach significantly outperforms state-of-the-art models in both discrete and continuous domains with large state/action spaces.
arXiv Detail & Related papers (2020-10-09T09:09:31Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z) - Guided Uncertainty-Aware Policy Optimization: Combining Learning and
Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches, in contrast, can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z) - Learning Discrete State Abstractions With Deep Variational Inference [7.273663549650618]
We propose a method for learning approximate bisimulations, a type of state abstraction.
We use a deep neural encoder to map states onto continuous embeddings.
We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model.
arXiv Detail & Related papers (2020-03-09T17:58:27Z) - Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
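For the "Learning Markov State Abstractions" entry above, the named combination of inverse model estimation and temporal contrastive learning can be sketched as follows. This is a rough, assumed reconstruction in PyTorch, not that paper's code: the encoder, head sizes, equal loss weighting, and the shuffled-successor negative sampling are all illustrative assumptions.

```python
# Illustrative sketch: train an encoder with (i) an inverse model that predicts
# the action from consecutive embeddings and (ii) a temporal contrastive term
# that separates real transitions from ones with a shuffled successor state.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AbstractionModel(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, embed_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))
        # Inverse model: predict the action a from the embedded pair (z, z').
        self.inverse = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, n_actions))
        # Contrastive head: score whether (z, z') is a real consecutive pair.
        self.discriminator = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def loss(self, s, a, s_next):
        z, z_next = self.encoder(s), self.encoder(s_next)
        # (i) inverse-model loss: classify which action produced the transition.
        inv_logits = self.inverse(torch.cat([z, z_next], dim=-1))
        inv_loss = F.cross_entropy(inv_logits, a)
        # (ii) temporal contrastive loss: real successors vs. shuffled ones.
        z_neg = z_next[torch.randperm(z_next.shape[0])]
        pos = self.discriminator(torch.cat([z, z_next], dim=-1)).squeeze(-1)
        neg = self.discriminator(torch.cat([z, z_neg], dim=-1)).squeeze(-1)
        logits = torch.cat([pos, neg], dim=0)
        labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)], dim=0)
        contrastive_loss = F.binary_cross_entropy_with_logits(logits, labels)
        return inv_loss + contrastive_loss
```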
This list is automatically generated from the titles and abstracts of the papers in this site.