APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT
- URL: http://arxiv.org/abs/2308.12649v1
- Date: Thu, 24 Aug 2023 08:46:43 GMT
- Title: APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT
- Authors: Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen
- Abstract summary: We study diverse skill discovery in reward-free environments, aiming to discover all possible skills in simple grid-world environments.
This problem is formulated as mutual training of skills using an intrinsic reward and a discriminator trained to predict a skill given its trajectory.
Our initial solution replaces the standard one-vs-all (softmax) discriminator with a one-vs-one (all pairs) discriminator and combines it with a novel intrinsic reward function and a dropout regularization technique.
- Score: 16.75358022780262
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study diverse skill discovery in reward-free environments, aiming to
discover all possible skills in simple grid-world environments where prior
methods have struggled to succeed. This problem is formulated as mutual
training of skills using an intrinsic reward and a discriminator trained to
predict a skill given its trajectory. Our initial solution replaces the
standard one-vs-all (softmax) discriminator with a one-vs-one (all pairs)
discriminator and combines it with a novel intrinsic reward function and a
dropout regularization technique. The combined approach is named APART: Diverse
Skill Discovery using All Pairs with Ascending Reward and Dropout. We
demonstrate that APART discovers all the possible skills in grid worlds with
remarkably fewer samples than previous works. Motivated by the empirical
success of APART, we further investigate an even simpler algorithm that
achieves maximum skills by altering VIC, rescaling its intrinsic reward, and
tuning the temperature of its softmax discriminator. We believe our findings
shed light on the crucial factors underlying the success of skill discovery
algorithms in reinforcement learning.
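To make the two discriminator choices concrete, below is a minimal NumPy sketch; the function names and the pairwise combination rule are illustrative assumptions, not APART's published formulas. The softmax reward follows the standard VIC/DIAYN form log q(z|s) - log p(z) with a uniform skill prior and a tunable temperature; the all-pairs version assumes one binary classifier per skill pair and averages the pairwise log-probabilities assigned to the active skill. APART's ascending reward and its dropout regularization are not reproduced here.

```python
import numpy as np

def softmax_skill_reward(logits, z, temperature=1.0):
    """One-vs-all (softmax) intrinsic reward, VIC/DIAYN-style:
    log q(z|s) - log p(z), with a uniform skill prior p(z) = 1/K.
    `logits` has shape (K,): discriminator scores for the current state."""
    scaled = np.asarray(logits, dtype=float) / temperature
    m = scaled.max()  # stabilized log-softmax
    log_q = scaled - (m + np.log(np.sum(np.exp(scaled - m))))
    return float(log_q[z] + np.log(len(scaled)))

def all_pairs_skill_reward(pair_logits, z, num_skills):
    """One-vs-one (all pairs) variant. `pair_logits[(i, j)]` (i < j) is
    the logit that the state was produced by skill i rather than skill j.
    Averaging the pairwise log-probabilities for the active skill z is an
    assumed combination rule, for illustration only."""
    log_probs = []
    for other in range(num_skills):
        if other == z:
            continue
        i, j = (z, other) if z < other else (other, z)
        logit = pair_logits[(i, j)]
        # log sigmoid(logit) if z is first in the pair, else log sigmoid(-logit)
        log_probs.append(-np.log1p(np.exp(-logit)) if z == i
                         else -np.log1p(np.exp(logit)))
    return float(np.mean(log_probs))

# Toy usage with a hypothetical 4-skill discriminator output:
rng = np.random.default_rng(0)
K = 4
logits = rng.normal(size=K)
pair_logits = {(i, j): rng.normal() for i in range(K) for j in range(i + 1, K)}
print(softmax_skill_reward(logits, z=2, temperature=0.5))
print(all_pairs_skill_reward(pair_logits, z=2, num_skills=K))
```

In this form, the two levers adjusted by the abstract's simplified VIC variant correspond to a multiplicative rescaling of the returned reward and the `temperature` argument of the softmax.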
Related papers
- Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning [39.991887534269445]
Disentangled Unsupervised Skill Discovery (DUSDi) is a method for learning disentangled skills that can be efficiently reused to solve downstream tasks.
DUSDi decomposes skills into disentangled components, where each skill component only affects one factor of the state space.
DUSDi successfully learns disentangled skills, and significantly outperforms previous skill discovery methods when the learned skills are applied to downstream tasks.
arXiv Detail & Related papers (2024-10-15T04:13:20Z)
- Constrained Ensemble Exploration for Unsupervised Skill Discovery [43.00837365639085]
Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning useful behaviors via reward-free pre-training.
We propose a novel unsupervised RL framework via an ensemble of skills, where each skill performs partition exploration based on the state prototypes.
We find our method learns well-explored ensemble skills and achieves superior performance in various downstream tasks compared to previous methods.
arXiv Detail & Related papers (2024-05-25T03:07:56Z)
- Unsupervised Discovery of Continuous Skills on a Sphere [15.856188608650228]
We propose a novel method, named discovery of continuous skills on a sphere (DISCS), for learning a potentially infinite number of different skills.
In DISCS, skills are learned by maximizing mutual information between skills and states, and each skill corresponds to a continuous value on a sphere.
Because the representations of skills in DISCS are continuous, infinitely diverse skills could be learned.
arXiv Detail & Related papers (2023-05-21T06:29:41Z)
- Behavior Contrastive Learning for Unsupervised Skill Discovery [75.6190748711826]
We propose a novel unsupervised skill discovery method through contrastive learning among behaviors.
Under mild assumptions, our objective maximizes the mutual information (MI) between different behaviors based on the same skill.
Our method implicitly increases the state entropy to obtain better state coverage.
arXiv Detail & Related papers (2023-05-08T06:02:11Z)
- Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
- Skill-Based Reinforcement Learning with Intrinsic Reward Matching [77.34726150561087]
We present Intrinsic Reward Matching (IRM), which unifies task-agnostic skill pretraining and task-aware finetuning.
IRM enables us to utilize pretrained skills far more effectively than previous skill selection methods.
arXiv Detail & Related papers (2022-10-14T00:04:49Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Discovering Generalizable Skills via Automated Generation of Diverse Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)