Partially Equivariant Reinforcement Learning in Symmetry-Breaking Environments
- URL: http://arxiv.org/abs/2512.00915v1
- Date: Sun, 30 Nov 2025 14:41:08 GMT
- Title: Partially Equivariant Reinforcement Learning in Symmetry-Breaking Environments
- Authors: Junwoo Chang, Minwoo Park, Joohwan Seo, Roberto Horowitz, Jongmin Lee, Jongeun Choi,
- Abstract summary: Group symmetries provide a powerful inductive bias for reinforcement learning (RL)<n>Group symmetries provide a powerful inductive bias for reinforcement learning (RL)
- Score: 10.122552307413711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Group symmetries provide a powerful inductive bias for reinforcement learning (RL), enabling efficient generalization across symmetric states and actions via group-invariant Markov Decision Processes (MDPs). However, real-world environments almost never realize fully group-invariant MDPs; dynamics, actuation limits, and reward design usually break symmetries, often only locally. Under group-invariant Bellman backups for such cases, local symmetry-breaking introduces errors that propagate across the entire state-action space, resulting in global value estimation errors. To address this, we introduce Partially group-Invariant MDP (PI-MDP), which selectively applies group-invariant or standard Bellman backups depending on where symmetry holds. This framework mitigates error propagation from locally broken symmetries while maintaining the benefits of equivariance, thereby enhancing sample efficiency and generalizability. Building on this framework, we present practical RL algorithms -- Partially Equivariant (PE)-DQN for discrete control and PE-SAC for continuous control -- that combine the benefits of equivariance with robustness to symmetry-breaking. Experiments across Grid-World, locomotion, and manipulation benchmarks demonstrate that PE-DQN and PE-SAC significantly outperform baselines, highlighting the importance of selective symmetry exploitation for robust and sample-efficient RL.
Related papers
- Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs [55.77845440440496]
Push-based decentralized communication enables optimization over communication networks, where information exchange may be asymmetric.<n>We develop a unified uniform-stability framework for the Gradient Push (SGP) algorithm.<n>A key technical ingredient is an imbalance-aware generalization bound through two quantities.
arXiv Detail & Related papers (2026-02-24T05:32:03Z) - Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification is critical for assessing the reliability of machine learning interatomic potentials in molecular dynamics simulations.<n>Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance.<n>We propose textitEquivariant Evidential Deep Learning for Interatomic Potentials ($texte2$IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z) - Recurrent Equivariant Constraint Modulation: Learning Per-Layer Symmetry Relaxation from Data [36.287199718605]
Equivariant neural networks exploit underlying task symmetries to improve generalization.<n>We propose Recurrent Equivariant Constraint Modulation (RECM), a layer-wise constraint modulation mechanism.<n>RECM learns appropriate relaxation levels solely from the training signal and the symmetry properties of each layer's input-target distribution.
arXiv Detail & Related papers (2026-02-02T21:59:35Z) - Group-Invariant Unsupervised Skill Discovery: Symmetry-aware Skill Representations for Generalizable Behavior [7.469447825853364]
Group-Invariant Skill Discovery is a framework that embeds group structure into the skill discovery objective.<n>We show that GISD achieves broader state-space coverage and improved efficiency in downstream task learning compared to a strong baseline.
arXiv Detail & Related papers (2026-01-20T14:21:18Z) - Symmetry-Aware Steering of Equivariant Diffusion Policies: Benefits and Limits [5.63508094975827]
Equivariant diffusion policies (EDPs) combine the generative expressivity of diffusion models with the strong generalization and sample efficiency afforded by geometric symmetries.<n>We show that exploiting symmetry during the steering process yields substantial benefits-enhancing sample efficiency, preventing value divergence, and achieving strong policy improvements even when EDPs are trained from extremely limited demonstrations.
arXiv Detail & Related papers (2025-12-12T07:42:01Z) - Reinforcement Learning Using known Invariances [54.91261509214309]
This paper develops a theoretical framework for incorporating known group symmetries into kernel-based reinforcement learning.<n>We show that symmetry-aware RL achieves significantly better performance than their standard kernel counterparts.
arXiv Detail & Related papers (2025-11-05T13:56:14Z) - Equivariant Goal Conditioned Contrastive Reinforcement Learning [5.019456977535218]
Contrastive Reinforcement Learning (CRL) provides a promising framework for extracting useful structured representations from unlabeled interactions.<n>We propose Equivariant CRL, which further structures the latent space using equivariant constraints.<n>Our approach consistently outperforms strong baselines across a range of simulated tasks in both state-based and image-based settings.
arXiv Detail & Related papers (2025-07-22T01:13:45Z) - NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics.<n>We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method.<n> Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z) - Learning (Approximately) Equivariant Networks via Constrained Optimization [25.51476313302483]
Equivariant neural networks are designed to respect symmetries through their architecture.<n>Real-world data often departs from perfect symmetry because of noise, structural variation, measurement bias, or other symmetry-breaking effects.<n>We introduce Adaptive Constrained Equivariance (ACE), a constrained optimization approach that starts with a flexible, non-equivariant model.
arXiv Detail & Related papers (2025-05-19T18:08:09Z) - Stochastic Optimization with Optimal Importance Sampling [49.484190237840714]
We propose an iterative-based algorithm that jointly updates the decision and the IS distribution without requiring time-scale separation between the two.<n>Our method achieves the lowest possible variable variance and guarantees global convergence under convexity of the objective and mild assumptions on the IS distribution family.
arXiv Detail & Related papers (2025-04-04T16:10:18Z) - Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning [5.69473229553916]
This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components.
We show how equivariant ensembles and regularization benefit sample efficiency and performance.
arXiv Detail & Related papers (2024-03-19T16:01:25Z) - Winning Prize Comes from Losing Tickets: Improve Invariant Learning by
Exploring Variant Parameters for Out-of-Distribution Generalization [76.27711056914168]
Out-of-Distribution (OOD) Generalization aims to learn robust models that generalize well to various environments without fitting to distribution-specific features.
Recent studies based on Lottery Ticket Hypothesis (LTH) address this problem by minimizing the learning target to find some of the parameters that are critical to the task.
We propose Exploring Variant parameters for Invariant Learning (EVIL) which also leverages the distribution knowledge to find the parameters that are sensitive to distribution shift.
arXiv Detail & Related papers (2023-10-25T06:10:57Z) - Meta-Causal Feature Learning for Out-of-Distribution Generalization [71.38239243414091]
This paper presents a balanced meta-causal learner (BMCL), which includes a balanced task generation module (BTG) and a meta-causal feature learning module (MCFL)
BMCL effectively identifies the class-invariant visual regions for classification and may serve as a general framework to improve the performance of the state-of-the-art methods.
arXiv Detail & Related papers (2022-08-22T09:07:02Z) - Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.