Exploration by Learning Diverse Skills through Successor State Measures
- URL: http://arxiv.org/abs/2406.10127v1
- Date: Fri, 14 Jun 2024 15:36:15 GMT
- Title: Exploration by Learning Diverse Skills through Successor State Measures
- Authors: Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Konigsbuch, Dennis G. Wilson, Emmanuel Rachelson,
- Abstract summary: We aim to construct a set of diverse skills which uniformly cover state space.
We consider the distribution of states reached by a policy conditioned on each skill and leverage the successor state measure.
Our findings demonstrate that this new formalization promotes more robust and efficient exploration.
- Score: 5.062282108230929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to perform different skills can encourage agents to explore. In this work, we aim to construct a set of diverse skills which uniformly cover the state space. We propose a formalization of this search for diverse skills, building on a previous definition based on the mutual information between states and skills. We consider the distribution of states reached by a policy conditioned on each skill and leverage the successor state measure to maximize the difference between these skill distributions. We call this approach LEADS: Learning Diverse Skills through Successor States. We demonstrate our approach on a set of maze navigation and robotic control tasks which show that our method is capable of constructing a diverse set of skills which exhaustively cover the state space without relying on reward or exploration bonuses. Our findings demonstrate that this new formalization promotes more robust and efficient exploration by combining mutual information maximization and exploration bonuses.
Related papers
- SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions [48.003320766433966]
This work introduces Skill Discovery from Local Dependencies (Skild)
Skild develops a novel skill learning objective that explicitly encourages the mastering of skills that induce different interactions within an environment.
We evaluate Skild in several domains with challenging, long-horizon sparse reward tasks including a realistic simulated household robot domain.
arXiv Detail & Related papers (2024-10-24T04:01:59Z) - Language Guided Skill Discovery [56.84356022198222]
We introduce Language Guided Skill Discovery (LGSD) to maximize semantic diversity between skills.
LGSD takes user prompts as input and outputs a set of semantically distinctive skills.
We demonstrate that LGSD enables legged robots to visit different user-intended areas on a plane by simply changing the prompt.
arXiv Detail & Related papers (2024-06-07T04:25:38Z) - Constrained Ensemble Exploration for Unsupervised Skill Discovery [43.00837365639085]
Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning useful behaviors via reward-free per-training.
We propose a novel unsupervised RL framework via an ensemble of skills, where each skill performs partition exploration based on the state prototypes.
We find our method learns well-explored ensemble skills and achieves superior performance in various downstream tasks compared to previous methods.
arXiv Detail & Related papers (2024-05-25T03:07:56Z) - Unsupervised Discovery of Continuous Skills on a Sphere [15.856188608650228]
We propose a novel method for learning potentially an infinite number of different skills, which is named discovery of continuous skills on a sphere (DISCS)
In DISCS, skills are learned by maximizing mutual information between skills and states, and each skill corresponds to a continuous value on a sphere.
Because the representations of skills in DISCS are continuous, infinitely diverse skills could be learned.
arXiv Detail & Related papers (2023-05-21T06:29:41Z) - Behavior Contrastive Learning for Unsupervised Skill Discovery [75.6190748711826]
We propose a novel unsupervised skill discovery method through contrastive learning among behaviors.
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill.
Our method implicitly increases the state entropy to obtain better state coverage.
arXiv Detail & Related papers (2023-05-08T06:02:11Z) - Direct then Diffuse: Incremental Unsupervised Skill Discovery for State
Covering and Goal Reaching [98.25207998996066]
We build on the mutual information framework for skill discovery and introduce UPSIDE to address the coverage-directedness trade-off.
We illustrate in several navigation and control environments how the skills learned by UPSIDE solve sparse-reward downstream tasks better than existing baselines.
arXiv Detail & Related papers (2021-10-27T14:22:19Z) - Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z) - Discovering Generalizable Skills via Automated Generation of Diverse
Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z) - ELSIM: End-to-end learning of reusable skills through intrinsic
motivation [0.0]
We present a novel reinforcement learning architecture which hierarchically learns and represents self-generated skills in an end-to-end way.
With this architecture, an agent focuses only on task-rewarded skills while keeping the learning process of skills bottom-up.
arXiv Detail & Related papers (2020-06-23T11:20:46Z) - Skill Discovery of Coordination in Multi-agent Reinforcement Learning [41.67943127631515]
We propose "Multi-agent Skill Discovery"(MASD), a method for discovering skills for coordination patterns of multiple agents.
We show the emergence of various skills on the level of coordination in a general particle multi-agent environment.
We also reveal that the "bottleneck" prevents skills from collapsing to a single agent and enhances the diversity of learned skills.
arXiv Detail & Related papers (2020-06-07T02:04:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.