SUSD: Structured Unsupervised Skill Discovery through State Factorization
- URL: http://arxiv.org/abs/2602.01619v1
- Date: Mon, 02 Feb 2026 04:21:33 GMT
- Title: SUSD: Structured Unsupervised Skill Discovery through State Factorization
- Authors: Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah
- Abstract summary: Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. We introduce SUSD, a novel framework that harnesses the compositional structure of environments by factorizing the state space into independent components. SUSD allocates distinct skill variables to different factors, enabling more fine-grained control over the skill discovery process.
- Score: 12.57032768854794
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. One of the most common USD approaches is to maximize the Mutual Information (MI) between skill latent variables and states. However, MI-based methods tend to favor simple, static skills due to their invariance properties, limiting the discovery of dynamic, task-relevant behaviors. Distance-Maximizing Skill Discovery (DSD) promotes more dynamic skills by leveraging state-space distances, yet still falls short in encouraging comprehensive skill sets that engage all controllable factors or entities in the environment. In this work, we introduce SUSD, a novel framework that harnesses the compositional structure of environments by factorizing the state space into independent components (e.g., objects or controllable entities). SUSD allocates distinct skill variables to different factors, enabling more fine-grained control over the skill discovery process. A dynamic model also tracks learning across factors, adaptively steering the agent's focus toward underexplored factors. This structured approach not only promotes the discovery of richer and more diverse skills, but also yields a factorized skill representation that enables fine-grained and disentangled control over individual entities, which facilitates efficient training of compositional downstream tasks via Hierarchical Reinforcement Learning (HRL). Our experimental results across three environments, with factors ranging from 1 to 10, demonstrate that our method can discover diverse and complex skills without supervision, significantly outperforming existing unsupervised skill discovery methods in factorized and complex environments. Code is publicly available at: https://github.com/hadi-hosseini/SUSD.
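The core idea in the abstract can be sketched in a few lines: split the state into per-factor sub-states, give each factor its own skill variable, compute a distance-based reward per factor, and adaptively reweight factors by recent learning progress. The sketch below is illustrative only; the factor layout, the directional-progress reward, and the `update_focus` rule are simplified assumptions, not the paper's actual objective or dynamic model.

```python
import numpy as np

NUM_FACTORS = 3   # assumed number of independent entities in the state
STATE_DIM = 2     # assumed dimensionality of each factor's sub-state

def factorize_state(state):
    """Split a flat state vector into per-factor sub-states (assumes a
    fixed, known layout: factors are contiguous blocks)."""
    return np.asarray(state).reshape(NUM_FACTORS, STATE_DIM)

def intrinsic_reward(state, next_state, skills, weights):
    """Toy distance-based reward: each factor is rewarded for moving its
    sub-state along that factor's own skill direction, and the per-factor
    terms are mixed by adaptive focus weights."""
    delta = factorize_state(next_state) - factorize_state(state)
    per_factor = np.einsum('fd,fd->f', delta, skills)  # directional progress
    return float(np.dot(weights, per_factor))

def update_focus(weights, progress, lr=0.5):
    """Toy stand-in for the dynamic model: shift focus toward factors with
    low recent progress (i.e., underexplored factors) via a softmax-style
    multiplicative update, then renormalize."""
    w = weights * np.exp(-lr * np.asarray(progress))
    return w / w.sum()
```

For example, if only factor 0 moves along its skill direction, only its term contributes to the reward, and a subsequent `update_focus` call lowers factor 0's weight relative to the still-unexplored factors.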
Related papers
- Unsupervised Skill Discovery through Skill Regions Differentiation [6.088346462603191]
Unsupervised Reinforcement Learning (RL) aims to discover diverse behaviors that can accelerate the learning of downstream tasks. We propose a novel skill discovery objective that maximizes the deviation of the state density of one skill from the explored regions of other skills. We also formulate an intrinsic reward based on the learned autoencoder that resembles count-based exploration in a compact latent space.
arXiv Detail & Related papers (2025-06-17T11:30:04Z) - SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions [48.003320766433966]
This work introduces Skill Discovery from Local Dependencies (SkiLD).
SkiLD develops a novel skill learning objective that explicitly encourages the mastering of skills that induce different interactions within an environment.
We evaluate SkiLD in several domains with challenging, long-horizon sparse reward tasks, including a realistic simulated household robot domain.
arXiv Detail & Related papers (2024-10-24T04:01:59Z) - Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning [39.991887534269445]
Disentangled Unsupervised Skill Discovery (DUSDi) is a method for learning disentangled skills that can be efficiently reused to solve downstream tasks.
DUSDi decomposes skills into disentangled components, where each skill component only affects one factor of the state space.
DUSDi successfully learns disentangled skills, and significantly outperforms previous skill discovery methods when it comes to applying the learned skills to solve downstream tasks.
arXiv Detail & Related papers (2024-10-15T04:13:20Z) - Constrained Ensemble Exploration for Unsupervised Skill Discovery [43.00837365639085]
Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning useful behaviors via reward-free pre-training.
We propose a novel unsupervised RL framework via an ensemble of skills, where each skill performs partition exploration based on the state prototypes.
We find our method learns well-explored ensemble skills and achieves superior performance in various downstream tasks compared to previous methods.
arXiv Detail & Related papers (2024-05-25T03:07:56Z) - SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution [75.2573501625811]
Diffusion models have demonstrated strong potential for robotic trajectory planning.
Generating coherent trajectories from high-level instructions remains challenging.
We propose SkillDiffuser, an end-to-end hierarchical planning framework.
arXiv Detail & Related papers (2023-12-18T18:16:52Z) - ComSD: Balancing Behavioral Quality and Diversity in Unsupervised Skill Discovery [12.277005054008017]
We propose Contrastive dynamic Skill Discovery (ComSD). ComSD generates diverse and exploratory unsupervised skills through a novel intrinsic incentive, named contrastive dynamic reward. It can also discover distinguishable and far-reaching exploration skills in the challenging tree-like 2D maze.
arXiv Detail & Related papers (2023-09-29T12:53:41Z) - C$\cdot$ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters [49.83342243500835]
We present C$\cdot$ASE, an efficient framework that learns Conditional Adversarial Skill Embeddings for physics-based characters.
C$\cdot$ASE divides the heterogeneous skill motions into distinct subsets containing homogeneous samples for training a low-level conditional model.
The skill-conditioned imitation learning naturally offers explicit control over the character's skills after training.
arXiv Detail & Related papers (2023-09-20T14:34:45Z) - Behavior Contrastive Learning for Unsupervised Skill Discovery [75.6190748711826]
We propose a novel unsupervised skill discovery method through contrastive learning among behaviors.
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill.
Our method implicitly increases the state entropy to obtain better state coverage.
arXiv Detail & Related papers (2023-05-08T06:02:11Z) - Controllability-Aware Unsupervised Skill Discovery [94.19932297743439]
We introduce a novel unsupervised skill discovery method, Controllability-aware Skill Discovery (CSD), which actively seeks complex, hard-to-control skills without supervision.
The key component of CSD is a controllability-aware distance function, which assigns larger values to state transitions that are harder to achieve with the current skills.
Our experimental results in six robotic manipulation and locomotion environments demonstrate that CSD can discover diverse complex skills with no supervision.
arXiv Detail & Related papers (2023-02-10T08:03:09Z) - Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching [98.25207998996066]
We build on the mutual information framework for skill discovery and introduce UPSIDE to address the coverage-directedness trade-off.
We illustrate in several navigation and control environments how the skills learned by UPSIDE solve sparse-reward downstream tasks better than existing baselines.
arXiv Detail & Related papers (2021-10-27T14:22:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.