Behavior Contrastive Learning for Unsupervised Skill Discovery
- URL: http://arxiv.org/abs/2305.04477v1
- Date: Mon, 8 May 2023 06:02:11 GMT
- Title: Behavior Contrastive Learning for Unsupervised Skill Discovery
- Authors: Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang,
Peng Liu, Xuelong Li
- Abstract summary: We propose a novel unsupervised skill discovery method through contrastive learning among behaviors.
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill.
Our method implicitly increases the state entropy to obtain better state coverage.
- Score: 75.6190748711826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In reinforcement learning, unsupervised skill discovery aims to learn diverse
skills without extrinsic rewards. Previous methods discover skills by
maximizing the mutual information (MI) between states and skills. However, such
an MI objective tends to learn simple and static skills and may hinder
exploration. In this paper, we propose a novel unsupervised skill discovery
method through contrastive learning among behaviors, which makes the agent
produce similar behaviors for the same skill and diverse behaviors for
different skills. Under mild assumptions, our objective maximizes the MI
between different behaviors based on the same skill, which serves as an upper
bound of the previous MI objective. Meanwhile, our method implicitly increases
the state entropy to obtain better state coverage. We evaluate our method on
challenging mazes and continuous control tasks. The results show that our
method generates diverse and far-reaching skills, and also obtains competitive
performance in downstream tasks compared to the state-of-the-art methods.
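The contrastive idea in the abstract — pull together behaviors produced under the same skill, push apart behaviors from different skills — is in the family of InfoNCE objectives. The sketch below is an illustrative numpy version, not the paper's actual implementation: `infonce_same_skill`, the embedding shapes, and the temperature value are all assumptions for illustration. Each anchor behavior embedding is paired with a positive generated under the same skill; the other rows of the batch (different skills) serve as negatives.

```python
import numpy as np

def infonce_same_skill(anchors, positives, temperature=0.5):
    """InfoNCE-style contrastive loss (illustrative sketch).

    anchors, positives: (B, D) behavior embeddings, where row i of
    `positives` was produced under the same skill as row i of `anchors`.
    All other rows act as negatives (behaviors from different skills).
    """
    # L2-normalise so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives lie on the diagonal: anchor i matches positive i
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss maximizes a lower bound on the mutual information between the two behavior embeddings, which is how a contrastive objective of this shape connects to the MI-between-behaviors objective described above.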
Related papers
- SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions [48.003320766433966]
This work introduces Skill Discovery from Local Dependencies (SkiLD).
SkiLD develops a novel skill-learning objective that explicitly encourages mastering skills that induce different interactions within an environment.
We evaluate Skild in several domains with challenging, long-horizon sparse reward tasks including a realistic simulated household robot domain.
arXiv Detail & Related papers (2024-10-24T04:01:59Z)
- Exploration by Learning Diverse Skills through Successor State Measures [5.062282108230929]
We aim to construct a set of diverse skills that uniformly covers the state space.
We consider the distribution of states reached by a policy conditioned on each skill and leverage the successor state measure.
Our findings demonstrate that this new formalization promotes more robust and efficient exploration.
arXiv Detail & Related papers (2024-06-14T15:36:15Z)
- SLIM: Skill Learning with Multiple Critics [8.645929825516818]
Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment.
Latent variable models, based on mutual information, have been successful in this task but still struggle in the context of robotic manipulation.
We introduce SLIM, a multi-critic learning approach for skill discovery with a particular focus on robotic manipulation.
arXiv Detail & Related papers (2024-02-01T18:07:33Z)
- Unsupervised Discovery of Continuous Skills on a Sphere [15.856188608650228]
We propose a novel method for learning a potentially infinite number of different skills, named Discovery of Continuous Skills on a Sphere (DISCS).
In DISCS, skills are learned by maximizing mutual information between skills and states, and each skill corresponds to a continuous value on a sphere.
Because skill representations in DISCS are continuous, infinitely many diverse skills can be learned.
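DISCS represents each skill as a point on a sphere. The summary does not specify how skills are drawn, but a standard way to sample uniformly distributed continuous skill vectors on a unit sphere is to normalize isotropic Gaussian draws; the function below is an illustrative sketch under that assumption (the name `sample_skills_on_sphere` is hypothetical).

```python
import numpy as np

def sample_skills_on_sphere(n, dim, rng=None):
    """Draw n skill vectors uniformly on the unit (dim-1)-sphere.

    The isotropic Gaussian is rotation-invariant, so normalizing its
    samples yields directions uniformly distributed on the sphere.
    """
    if rng is None:
        rng = np.random.default_rng()
    z = rng.normal(size=(n, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)
```

Such a continuous skill space contrasts with the discrete skill codes used by many earlier MI-based methods.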
arXiv Detail & Related papers (2023-05-21T06:29:41Z)
- Skill-Based Reinforcement Learning with Intrinsic Reward Matching [77.34726150561087]
We present Intrinsic Reward Matching (IRM), which unifies task-agnostic skill pretraining and task-aware finetuning.
IRM enables us to utilize pretrained skills far more effectively than previous skill selection methods.
arXiv Detail & Related papers (2022-10-14T00:04:49Z)
- Lipschitz-constrained Unsupervised Skill Discovery [91.51219447057817]
Lipschitz-constrained Skill Discovery (LSD) encourages the agent to discover more diverse, dynamic, and far-reaching skills.
LSD outperforms previous approaches in terms of skill diversity, state space coverage, and performance on seven downstream tasks.
arXiv Detail & Related papers (2022-02-02T08:29:04Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- Discovering Generalizable Skills via Automated Generation of Diverse Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z)
- Relative Variational Intrinsic Control [11.328970848714919]
Relative Variational Intrinsic Control (RVIC) incentivizes learning skills that are distinguishable in how they change the agent's relationship to its environment.
We show how RVIC skills are more useful than skills discovered by existing methods when used in hierarchical reinforcement learning.
arXiv Detail & Related papers (2020-12-14T18:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.