Controllability-Aware Unsupervised Skill Discovery
- URL: http://arxiv.org/abs/2302.05103v3
- Date: Sat, 3 Jun 2023 23:35:22 GMT
- Title: Controllability-Aware Unsupervised Skill Discovery
- Authors: Seohong Park, Kimin Lee, Youngwoon Lee, Pieter Abbeel
- Abstract summary: We introduce a novel unsupervised skill discovery method, Controllability-aware Skill Discovery (CSD), which actively seeks complex, hard-to-control skills without supervision.
The key component of CSD is a controllability-aware distance function, which assigns larger values to state transitions that are harder to achieve with the current skills.
Our experimental results in six robotic manipulation and locomotion environments demonstrate that CSD can discover diverse complex skills with no supervision.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the key capabilities of intelligent agents is the ability to discover
useful skills without external supervision. However, the current unsupervised
skill discovery methods are often limited to acquiring simple, easy-to-learn
skills due to the lack of incentives to discover more complex, challenging
behaviors. We introduce a novel unsupervised skill discovery method,
Controllability-aware Skill Discovery (CSD), which actively seeks complex,
hard-to-control skills without supervision. The key component of CSD is a
controllability-aware distance function, which assigns larger values to state
transitions that are harder to achieve with the current skills. Combined with
distance-maximizing skill discovery, CSD progressively learns more challenging
skills over the course of training as our jointly trained distance function
reduces rewards for easy-to-achieve skills. Our experimental results in six
robotic manipulation and locomotion environments demonstrate that CSD can
discover diverse complex skills including object manipulation and locomotion
skills with no supervision, significantly outperforming prior unsupervised
skill discovery methods. Videos and code are available at
https://seohong.me/projects/csd/
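The mechanism the abstract describes — an intrinsic reward given by a learned controllability-aware distance, jointly trained so that easy-to-achieve transitions earn less reward over time — can be sketched as follows. This is a minimal numpy illustration under assumed simplifications, not the authors' implementation: the per-dimension `scale`, its update rule, and all function names are hypothetical stand-ins for the learned distance model in the paper.

```python
import numpy as np

# Hypothetical learned distance parameters: a per-dimension scale that the
# distance model increases for state dimensions the current skills control
# poorly, and decreases for dimensions they already control well.
scale = np.ones(4)

def controllability_distance(s, s_next, scale):
    """Controllability-aware distance: transitions that are harder to
    achieve with the current skills receive larger values."""
    return np.sqrt(np.sum(scale * (s_next - s) ** 2))

def intrinsic_reward(s, s_next, scale):
    # Distance-maximizing skill discovery: the skill policy is rewarded
    # for covering large distances under the learned metric.
    return controllability_distance(s, s_next, scale)

def update_scale(scale, s, s_next, lr=0.1):
    # Joint training (sketch): shrink the scale on dimensions the agent
    # changes often, so easy-to-achieve transitions earn less reward and
    # the agent is pushed toward harder ones.
    moved = (s_next - s) ** 2
    return np.maximum(scale - lr * moved, 1e-3)

s = np.zeros(4)
s_next = np.array([1.0, 0.0, 0.0, 0.0])   # agent keeps changing dim 0
for _ in range(5):
    r = intrinsic_reward(s, s_next, scale)
    scale = update_scale(scale, s, s_next)
# After the updates, moving along dim 0 yields a smaller reward than
# before, nudging the policy toward transitions it has not mastered.
```

The key design point mirrored here is the feedback loop: the same transitions that train the policy also shrink the metric along well-practiced directions, so the reward landscape keeps shifting toward harder skills.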
Related papers
- SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions [48.003320766433966]
This work introduces Skill Discovery from Local Dependencies (SkiLD).
Skild develops a novel skill learning objective that explicitly encourages the mastering of skills that induce different interactions within an environment.
We evaluate Skild in several domains with challenging, long-horizon sparse reward tasks including a realistic simulated household robot domain.
arXiv Detail & Related papers (2024-10-24T04:01:59Z)
- Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning [39.991887534269445]
Disentangled Unsupervised Skill Discovery (DUSDi) is a method for learning disentangled skills that can be efficiently reused to solve downstream tasks.
DUSDi decomposes skills into disentangled components, where each skill component only affects one factor of the state space.
DUSDi successfully learns disentangled skills, and significantly outperforms previous skill discovery methods when it comes to applying the learned skills to solve downstream tasks.
arXiv Detail & Related papers (2024-10-15T04:13:20Z)
- Agentic Skill Discovery [19.5703917813767]
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control.
A remaining challenge is to acquire a diverse set of fundamental skills.
We introduce a novel framework for skill discovery that is entirely driven by LLMs.
arXiv Detail & Related papers (2024-05-23T19:44:03Z)
- SLIM: Skill Learning with Multiple Critics [8.645929825516818]
Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment.
Latent variable models, based on mutual information, have been successful in this task but still struggle in the context of robotic manipulation.
We introduce SLIM, a multi-critic learning approach for skill discovery with a particular focus on robotic manipulation.
arXiv Detail & Related papers (2024-02-01T18:07:33Z)
- Behavior Contrastive Learning for Unsupervised Skill Discovery [75.6190748711826]
We propose a novel unsupervised skill discovery method through contrastive learning among behaviors.
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill.
Our method implicitly increases the state entropy to obtain better state coverage.
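The summary above mentions maximizing mutual information between behaviors sharing the same skill via contrastive learning. A standard way to lower-bound such an MI objective is an InfoNCE-style score; the sketch below is a generic illustration of that idea, not the paper's actual objective, and all embeddings and function names are hypothetical.

```python
import numpy as np

def info_nce_score(anchor, candidates, pos_index, temperature=0.1):
    """InfoNCE-style contrastive score, a standard lower bound on mutual
    information: the log-probability of picking the positive (a rollout
    of the same skill) over negatives (rollouts of other skills)."""
    sims = candidates @ anchor / temperature
    logits = sims - sims.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(np.log(probs[pos_index]))

# Toy behavior embeddings: the positive resembles the anchor, negatives do not.
anchor = np.array([1.0, 0.0])
candidates = np.array([
    [0.9, 0.1],    # positive: another rollout of the same skill
    [-1.0, 0.0],   # negative: rollout of a different skill
    [0.0, 1.0],    # negative
])
score = info_nce_score(anchor, candidates, pos_index=0)
# A near-zero (high) score means the positive is easily identified,
# i.e. behaviors produced by the same skill are mutually predictive.
```

Maximizing this score with respect to the embedding network pulls same-skill behaviors together and pushes different-skill behaviors apart, which is what ties the contrastive loss to the MI objective.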
arXiv Detail & Related papers (2023-05-08T06:02:11Z)
- Choreographer: Learning and Adapting Skills in Imagination [60.09911483010824]
We present Choreographer, a model-based agent that exploits its world model to learn and adapt skills in imagination.
Our method decouples the exploration and skill learning processes, being able to discover skills in the latent state space of the model.
Choreographer is able to learn skills both from offline data, and by collecting data simultaneously with an exploration policy.
arXiv Detail & Related papers (2022-11-23T23:31:14Z)
- Unsupervised Reinforcement Learning for Transferable Manipulation Skill Discovery [22.32327908453603]
Current reinforcement learning (RL) in robotics often experiences difficulty in generalizing to new downstream tasks.
We propose a framework that pre-trains the agent in a task-agnostic manner without access to the task-specific reward.
We show that our approach achieves the most diverse interacting behavior and significantly improves sample efficiency in downstream tasks.
arXiv Detail & Related papers (2022-04-29T06:57:46Z)
- Lipschitz-constrained Unsupervised Skill Discovery [91.51219447057817]
Lipschitz-constrained Skill Discovery (LSD) encourages the agent to discover more diverse, dynamic, and far-reaching skills.
LSD outperforms previous approaches in terms of skill diversity, state space coverage, and performance on seven downstream tasks.
arXiv Detail & Related papers (2022-02-02T08:29:04Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.