Group-Invariant Unsupervised Skill Discovery: Symmetry-aware Skill Representations for Generalizable Behavior
- URL: http://arxiv.org/abs/2601.14000v1
- Date: Tue, 20 Jan 2026 14:21:18 GMT
- Title: Group-Invariant Unsupervised Skill Discovery: Symmetry-aware Skill Representations for Generalizable Behavior
- Authors: Junwoo Chang, Joseph Park, Roberto Horowitz, Jongmin Lee, Jongeun Choi
- Abstract summary: Group-Invariant Skill Discovery is a framework that embeds group structure into the skill discovery objective. We show that GISD achieves broader state-space coverage and improved efficiency in downstream task learning compared to a strong baseline.
- Score: 7.469447825853364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised skill discovery aims to acquire behavior primitives that improve exploration and accelerate downstream task learning. However, existing approaches often ignore the geometric symmetries of physical environments, leading to redundant behaviors and sample inefficiency. To address this, we introduce Group-Invariant Skill Discovery (GISD), a framework that explicitly embeds group structure into the skill discovery objective. Our approach is grounded in a theoretical guarantee: we prove that in group-symmetric environments, the standard Wasserstein dependency measure admits a globally optimal solution comprised of an equivariant policy and a group-invariant scoring function. Motivated by this, we formulate the Group-Invariant Wasserstein dependency measure, which restricts the optimization to this symmetry-aware subspace without loss of optimality. Practically, we parameterize the scoring function using a group Fourier representation and define the intrinsic reward via the alignment of equivariant latent features, ensuring that the discovered skills generalize systematically under group transformations. Experiments on state-based and pixel-based locomotion benchmarks demonstrate that GISD achieves broader state-space coverage and improved efficiency in downstream task learning compared to a strong baseline.
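As a rough, hedged illustration of the core idea (not the authors' implementation): a scoring function can be made invariant to a finite group by averaging a learned feature map over group elements, and an intrinsic reward can then be defined from the skill-aligned change of those invariant features. The cyclic group C4 acting on planar coordinates and the feature map `phi` are assumptions made for this example; the paper itself uses a group Fourier parameterization.

```python
import numpy as np

# Hedged sketch, not the authors' code: invariance by group averaging over
# C4, with an intrinsic reward from skill-aligned feature change. The state
# layout (first two dims = planar position) is an assumption.

def rotate_state(s, k):
    """Apply the k-th element of C4 (rotation by k * 90 degrees) to s."""
    theta = k * np.pi / 2
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    out = s.copy()
    out[:2] = R @ s[:2]
    return out

def invariant_score(phi, s, z, group_size=4):
    """Average a learned feature map over the group so the score is invariant."""
    feats = np.mean([phi(rotate_state(s, k)) for k in range(group_size)], axis=0)
    return feats @ z  # alignment with the skill vector z

def intrinsic_reward(phi, s, s_next, z):
    """Reward skill-aligned movement in the invariant latent space."""
    return invariant_score(phi, s_next, z) - invariant_score(phi, s, z)
```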
Related papers
- An Adaptive KKT-Based Indicator for Convergence Assessment in Multi-Objective Optimization [0.0]
This paper revisits an entropy-inspired convergence indicator and proposes a robust adaptive reformulation based on quantile normalization. The proposed indicator preserves the stationarity-based interpretation of the original formulation while improving robustness to heterogeneous distributions of stationarity residuals.
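The abstract does not give the exact formula, so the following is only a guess at the shape of such an indicator: KKT stationarity residuals are rescaled by quantiles so that heavy tails cannot dominate, then aggregated into an entropy-style score. All names and the aggregation rule are assumptions.

```python
import numpy as np

# Assumed form, not the paper's code: quantile-normalize per-solution KKT
# stationarity residuals before aggregating them into a convergence score.

def quantile_normalize(residuals, q_lo=0.1, q_hi=0.9):
    lo, hi = np.quantile(residuals, [q_lo, q_hi])
    return np.clip((residuals - lo) / (hi - lo + 1e-12), 0.0, 1.0)

def convergence_indicator(residuals):
    """Entropy-inspired aggregate of normalized residuals (assumed form)."""
    r = quantile_normalize(np.asarray(residuals, dtype=float))
    p = (1.0 - r) + 1e-12          # mass where residuals are small
    p = p / p.sum()
    return -(p * np.log(p)).sum() / np.log(len(p))  # normalized to [0, 1]
```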
arXiv Detail & Related papers (2026-03-04T13:34:28Z)
- SetPO: Set-Level Policy Optimization for Diversity-Preserving LLM Reasoning [50.93295951454092]
We introduce a set-level diversity objective defined over sampled trajectories using kernelized similarity. Our approach derives a leave-one-out marginal contribution for each sampled trajectory and integrates this objective as a plug-in advantage-shaping term for policy optimization. Experiments across a range of model scales demonstrate the effectiveness of our proposed algorithm, consistently outperforming strong baselines in both Pass@1 and Pass@K across various benchmarks.
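A minimal sketch of what such an objective could look like, assuming RBF-kernel similarity over trajectory embeddings (the actual kernel and normalization in SetPO may differ):

```python
import numpy as np

# Hedged sketch: set-level diversity as negative mean pairwise similarity,
# and each trajectory's leave-one-out marginal contribution, which could be
# added (scaled) to its policy-gradient advantage as a shaping term.

def rbf_kernel(E, gamma=1.0):
    """Pairwise RBF similarities for embeddings E of shape (n, d)."""
    sq = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def loo_diversity_bonus(E, gamma=1.0):
    """Diversity of the full set minus diversity with trajectory i removed."""
    K = rbf_kernel(E, gamma)
    n = len(E)
    full = -K.sum() / (n * n)                      # higher = more diverse
    bonus = np.empty(n)
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        Ki = K[np.ix_(idx, idx)]
        bonus[i] = full - (-Ki.sum() / ((n - 1) ** 2))
    return bonus
```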
arXiv Detail & Related papers (2026-02-01T07:13:20Z)
- MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization [56.074760766965085]
Group-Relative Policy Optimization has emerged as an efficient paradigm for aligning Large Language Models (LLMs). We propose MAESTRO, which treats reward scalarization as a dynamic latent policy, leveraging the model's terminal hidden states as a semantic bottleneck. We formulate this as a contextual bandit problem within a bi-level optimization framework, where a lightweight Conductor network co-evolves with the policy by utilizing group-relative advantages as a meta-reward signal.
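Schematically, the Conductor idea might look like the following, where the hidden size, network shape, and names are all assumptions made for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not the MAESTRO release): a small "Conductor" maps
# the policy's terminal hidden state to scalarization weights over K reward
# heads; group-relative advantages would serve as its meta-reward in a
# contextual-bandit-style update.

class Conductor(nn.Module):
    def __init__(self, hidden_dim: int, num_rewards: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.Tanh(),
            nn.Linear(64, num_rewards),
        )

    def forward(self, h_terminal):  # (B, hidden_dim)
        # Softmax keeps the per-sample weights on the probability simplex.
        return torch.softmax(self.net(h_terminal), dim=-1)

def scalarize(rewards, weights):
    """Combine per-objective rewards (B, K) with per-sample weights (B, K)."""
    return (rewards * weights).sum(dim=-1)
```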
arXiv Detail & Related papers (2026-01-12T05:02:48Z)
- Partially Equivariant Reinforcement Learning in Symmetry-Breaking Environments [10.122552307413711]
Group symmetries provide a powerful inductive bias for reinforcement learning (RL).
arXiv Detail & Related papers (2025-11-30T14:41:08Z)
- Unified modality separation: A vision-language framework for unsupervised domain adaptation [60.8391821117794]
Unsupervised domain adaptation (UDA) enables models trained on a labeled source domain to handle new unlabeled domains. We propose a unified modality separation framework that accommodates both modality-specific and modality-invariant components. Our method achieves up to a 9% performance gain with a 9x improvement in computational efficiency.
arXiv Detail & Related papers (2025-08-07T02:51:10Z)
- Equivariant Goal Conditioned Contrastive Reinforcement Learning [5.019456977535218]
Contrastive Reinforcement Learning (CRL) provides a promising framework for extracting useful structured representations from unlabeled interactions. We propose Equivariant CRL, which further structures the latent space using equivariant constraints. Our approach consistently outperforms strong baselines across a range of simulated tasks in both state-based and image-based settings.
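One common way to impose such a constraint, shown here only as an assumed sketch, is an auxiliary penalty enforcing phi(g . s) ~ rho(g) phi(s) for a known group action on states and a known latent representation of the group:

```python
import torch

# Assumed sketch, not the paper's code: penalize deviations from latent
# equivariance so the contrastive encoder respects the group action.

def equivariance_loss(phi, states, g_state_fn, rho_g):
    """phi: encoder; g_state_fn: applies a group element g to a state batch;
    rho_g: (d, d) latent representation of the same group element."""
    z = phi(states)                  # (B, d) latent features
    z_g = phi(g_state_fn(states))    # features of the transformed states
    return ((z_g - z @ rho_g.T) ** 2).mean()
```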
arXiv Detail & Related papers (2025-07-22T01:13:45Z)
- Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios [54.58186816693791]
Environments constantly change over time and space, posing significant challenges for object detectors trained under a closed-set assumption. We propose a new mechanism that converts the fine-tuning process into specific parameter generation. In particular, we first design a dual-path LoRA-based domain-aware adapter that disentangles features into domain-invariant and domain-specific components.
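A hedged sketch of a dual-path low-rank adapter in this spirit (dimensions, initialization, and the per-domain mechanism are simplifications, not the paper's design):

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a frozen base weight augmented by one low-rank
# path shared across domains (domain-invariant) and one low-rank path that
# would be generated or selected per domain (domain-specific).

class DualPathLoRALinear(nn.Module):
    def __init__(self, d_in, d_out, rank=4):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():
            p.requires_grad_(False)            # freeze the backbone layer
        self.inv_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.inv_B = nn.Parameter(torch.zeros(d_out, rank))
        self.spec_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.spec_B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x):
        shared = x @ self.inv_A.T @ self.inv_B.T    # domain-invariant path
        domain = x @ self.spec_A.T @ self.spec_B.T  # domain-specific path
        return self.base(x) + shared + domain
```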
arXiv Detail & Related papers (2025-06-30T17:14:12Z)
- Towards Initialization-Agnostic Clustering with Iterative Adaptive Resonance Theory [8.312275539092466]
Iterative Refinement Adaptive Resonance Theory (IR-ART) integrates three key phases into a unified iterative framework. IR-ART improves tolerance to suboptimal vigilance parameter values while preserving the parameter simplicity of Fuzzy ART. Case studies visually confirm the algorithm's self-optimization capability through iterative refinement.
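The abstract does not spell out the refinement rule; as a loose illustration only, standard Fuzzy ART's match test combined with a naive vigilance-relaxation loop conveys the flavor of tolerating a suboptimal initial vigilance:

```python
import numpy as np

# Loose illustration, not IR-ART itself: the match function is standard
# Fuzzy ART (inputs assumed complement-coded); the relaxation loop is an
# invented stand-in for the paper's iterative refinement.

def fuzzy_art_match(x, w):
    return np.minimum(x, w).sum() / x.sum()   # Fuzzy ART match criterion

def assign_with_refinement(x, weights, rho=0.8, step=0.05, floor=0.5):
    while rho >= floor:
        for j, w in enumerate(weights):
            if fuzzy_art_match(x, w) >= rho:
                return j, rho                  # resonant category found
        rho -= step                            # relax vigilance (assumed rule)
    return None, rho                           # fall back: new category
```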
arXiv Detail & Related papers (2025-05-07T14:12:39Z)
- R2Det: Exploring Relaxed Rotation Equivariance in 2D object detection [26.05910177212846]
Group Equivariant Convolution (GConv) empowers models to explore underlying symmetry in data, improving performance. We introduce a novel Relaxed Rotation-Equivariant GConv (R2GConv) with only a minimal increase of $4n$ parameters compared to GConv. Based on R2GConv, we propose a Relaxed Rotation-Equivariant Network (R2Net) as the backbone and develop a Relaxed Rotation-Equivariant Object Detector (R2Det) for 2D object detection.
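As an illustrative reading (not the R2GConv implementation): a C4 lifting convolution whose rotated filter copies each carry one learnable scale, which adds four scalars per filter, i.e. $4n$ extra parameters for $n$ filters under this reading:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Conceptual sketch: strict C4 equivariance would tie the four rotated
# filter copies exactly; a learnable per-rotation scale relaxes that tie.

class RelaxedC4Conv(nn.Module):
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(c_out, c_in, k, k) * 0.05)
        # 4 scales per output filter: the "relaxation" parameters.
        self.scale = nn.Parameter(torch.ones(4, c_out))

    def forward(self, x):                      # x: (B, c_in, H, W)
        outs = []
        for r in range(4):                     # rotate the filter, not the input
            w_r = torch.rot90(self.weight, r, dims=(2, 3))
            y = F.conv2d(x, w_r, padding=self.weight.shape[-1] // 2)
            outs.append(y * self.scale[r].view(1, -1, 1, 1))
        return torch.stack(outs, dim=1)        # (B, 4, c_out, H, W)
```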
arXiv Detail & Related papers (2024-08-21T16:32:03Z)
- Mitigating Group Bias in Federated Learning for Heterogeneous Devices [1.181206257787103]
Federated Learning is emerging as a privacy-preserving model training approach in distributed edge applications.
Our work proposes a group-fair FL framework that minimizes group bias while preserving privacy and incurring no resource-utilization overhead.
arXiv Detail & Related papers (2023-09-13T16:53:48Z)
- Adapting to Latent Subgroup Shifts via Concepts and Proxies [82.01141290360562]
We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variables available only in the source domain.
For continuous observations, we propose a latent variable model specific to the data generation process at hand.
arXiv Detail & Related papers (2022-12-21T18:30:22Z)
- The Advantage of Conditional Meta-Learning for Biased Regularization and Fine-Tuning [50.21341246243422]
Biased regularization and fine-tuning are two recent meta-learning approaches.
We propose conditional meta-learning, inferring a conditioning function that maps a task's side information to a meta-parameter vector.
We then propose a convex meta-algorithm providing a comparable advantage also in practice.
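In the linear, ridge-regularized special case, biased regularization and the conditioning function have closed forms, sketched below (the paper's setting is more general; all names here are ours):

```python
import numpy as np

# Schematic example, assuming linear models: the meta-parameter is the bias
# vector b toward which each task's weights are regularized, and a linear
# conditioning function tau maps task side information to b.

def solve_task(X, y, b, lam=1.0):
    """argmin_w ||Xw - y||^2 + lam * ||w - b||^2  (biased regularization).
    Setting the gradient to zero gives (X'X + lam*I) w = X'y + lam*b."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * b)

def tau(side_info, Theta):
    """Conditioning function: side information -> meta-parameter (bias)."""
    return Theta @ side_info   # the simplest (linear) choice
```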
arXiv Detail & Related papers (2020-08-25T07:32:16Z)
- Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with $\textit{external uncertainty}$ in reinforcement learning.
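The exact statement is in the paper; schematically (in our notation), the result has the shape

$$ V^{\pi}_{P} - \epsilon\,\mathcal{R}\!\left(V^{\pi}_{P}\right) \;\le\; \inf_{Q:\,W(Q,P)\le\epsilon} V^{\pi}_{Q}, $$

where $W$ is the Wasserstein distance, $\epsilon$ the radius of the ambiguity set, and $\mathcal{R}$ the proposed regularizer, so optimizing the regularized empirical value is conservative with respect to the robust value.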
arXiv Detail & Related papers (2020-03-05T19:56:23Z)