VendiRL: A Framework for Self-Supervised Reinforcement Learning of Diversely Diverse Skills
- URL: http://arxiv.org/abs/2509.02930v2
- Date: Sun, 12 Oct 2025 18:35:39 GMT
- Title: VendiRL: A Framework for Self-Supervised Reinforcement Learning of Diversely Diverse Skills
- Authors: Erik M. Lintunen
- Abstract summary: In self-supervised reinforcement learning (RL), one of the key challenges is learning a diverse set of skills to prepare agents for unknown future tasks. We introduce VendiRL, a unified framework for learning diversely diverse sets of skills.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In self-supervised reinforcement learning (RL), one of the key challenges is learning a diverse set of skills to prepare agents for unknown future tasks. Despite impressive advances, scalability and evaluation remain prevalent issues. Regarding scalability, the search for meaningful skills can be obscured by high-dimensional feature spaces, where relevant features may vary across downstream task domains. For evaluating skill diversity, defining what constitutes "diversity" typically requires a hard commitment to a specific notion of what it means for skills to be diverse, potentially leading to inconsistencies in how skill diversity is understood, making results across different approaches hard to compare, and leaving many forms of diversity unexplored. To address these issues, we adopt a measure of sample diversity that translates ideas from ecology to machine learning -- the Vendi Score -- allowing the user to specify and evaluate any desired form of diversity. We demonstrate how this metric facilitates skill evaluation and introduce VendiRL, a unified framework for learning diversely diverse sets of skills. Given distinct similarity functions, VendiRL motivates distinct forms of diversity, which could support skill-diversity pretraining in new and richly interactive environments where optimising for various forms of diversity may be desirable.
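The Vendi Score mentioned in the abstract is defined as the exponential of the Shannon entropy of the eigenvalues of the normalised similarity matrix K/n, where K is an n x n positive semi-definite matrix with K[i, i] = 1 given by a user-chosen similarity function. A minimal sketch of the metric (not the VendiRL implementation itself; function name and tolerance are illustrative):

```python
import numpy as np

def vendi_score(K: np.ndarray) -> float:
    """Vendi Score: exp of the Shannon entropy of the eigenvalues of K / n,
    where K is an n x n PSD similarity matrix with unit diagonal."""
    n = K.shape[0]
    eigvals = np.linalg.eigvalsh(K / n)   # eigenvalues sum to 1 (trace(K) = n)
    eigvals = eigvals[eigvals > 1e-12]    # drop numerical zeros before log
    return float(np.exp(-np.sum(eigvals * np.log(eigvals))))

# All samples identical (similarity 1 everywhere): effective diversity is 1.
print(vendi_score(np.ones((4, 4))))  # -> ~1.0
# All samples fully dissimilar: effective diversity equals n.
print(vendi_score(np.eye(4)))        # -> ~4.0
```

The score behaves like an "effective number" of distinct samples, interpolating between 1 and n, which is what lets different similarity functions induce the different notions of diversity the abstract refers to.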
Related papers
- Expanding Horizons of Level Diversity via Multi-objective Evolutionary Learning [10.755666953578336]
This paper aims to expand horizons of level diversity by considering multi-dimensional diversity when training generative models. We formulate the model training as a multi-objective learning problem, where each diversity metric is treated as a distinct objective. A multi-objective evolutionary learning framework that optimises multiple diversity metrics simultaneously throughout the model training process is proposed.
arXiv Detail & Related papers (2025-09-29T06:43:33Z) - A survey of diversity quantification in natural language processing: The why, what, where and how [2.5833049611832273]
We survey articles in the ACL Anthology from the past 6 years with "diversity" or "diverse" in their title. We put forward a unified taxonomy of why, what, where, and how diversity is measured in NLP. We believe that this study paves the way towards a better formalization of diversity in NLP.
arXiv Detail & Related papers (2025-07-28T14:12:34Z) - AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification [4.722248376235009]
Skill-based reinforcement learning (SBRL) enables rapid adaptation in environments with sparse rewards by pretraining a skill-conditioned policy. We propose a new method, Adaptive Multi-objective Projection for balancing Exploration and skill Diversification (AMPED). Our approach achieves performance that surpasses SBRL baselines across various benchmarks.
arXiv Detail & Related papers (2025-06-06T10:59:39Z) - Evaluating the Diversity and Quality of LLM Generated Content [72.84945252821908]
We introduce a framework for measuring effective semantic diversity: diversity among outputs that meet quality thresholds. Although preference-tuned models exhibit reduced lexical and syntactic diversity, they produce greater effective semantic diversity than SFT or base models. These findings have important implications for applications that require diverse yet high-quality outputs.
arXiv Detail & Related papers (2025-04-16T23:02:23Z) - The impact of behavioral diversity in multi-agent reinforcement learning [8.905920197601173]
We show how behavioral diversity synergizes with morphological diversity. We show how behaviorally heterogeneous teams learn and retain latent skills to overcome repeated disruptions.
arXiv Detail & Related papers (2024-12-19T21:13:32Z) - Language Guided Skill Discovery [56.84356022198222]
We introduce Language Guided Skill Discovery (LGSD) to maximize semantic diversity between skills. LGSD takes user prompts as input and outputs a set of semantically distinctive skills. We demonstrate that LGSD enables legged robots to visit different user-intended areas on a plane by simply changing the prompt.
arXiv Detail & Related papers (2024-06-07T04:25:38Z) - Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts [58.220879689376744]
Reinforcement learning (RL) is a powerful approach for acquiring a good-performing policy.
We propose Diverse Skill Learning (Di-SkilL) for learning diverse skills.
We show on challenging robot simulation tasks that Di-SkilL can learn diverse and performant skills.
arXiv Detail & Related papers (2024-03-11T17:49:18Z) - Diversify Question Generation with Retrieval-Augmented Style Transfer [68.00794669873196]
We propose RAST, a framework for Retrieval-Augmented Style Transfer.
The objective is to utilize the style of diverse templates for question generation.
We develop a novel Reinforcement Learning (RL) based approach that maximizes a weighted combination of diversity reward and consistency reward.
arXiv Detail & Related papers (2023-10-23T02:27:31Z) - Controlled Diversity with Preference: Towards Learning a Diverse Set of Desired Skills [15.187171070594935]
We propose Controlled Diversity with Preference (CDP), a collaborative human-guided mechanism for an agent to learn a set of skills that is diverse as well as desirable.
The key principle is to restrict the discovery of skills to those regions that are deemed to be desirable as per a preference model trained using human preference labels on trajectory pairs.
We evaluate our approach on 2D navigation and Mujoco environments and demonstrate the ability to discover diverse, yet desirable skills.
arXiv Detail & Related papers (2023-03-07T03:37:47Z) - Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z) - Discovering Generalizable Skills via Automated Generation of Diverse Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z) - Relative Variational Intrinsic Control [11.328970848714919]
Relative Variational Intrinsic Control (RVIC) incentivizes learning skills that are distinguishable in how they change the agent's relationship to its environment.
We show how RVIC skills are more useful than skills discovered by existing methods when used in hierarchical reinforcement learning.
arXiv Detail & Related papers (2020-12-14T18:59:23Z) - Evaluating the Evaluation of Diversity in Natural Language Generation [43.05127848086264]
We propose a framework for evaluating diversity metrics in natural language generation systems.
Our framework can advance the understanding of different diversity metrics, an essential step on the road towards better NLG systems.
arXiv Detail & Related papers (2020-04-06T20:44:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.