Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2505.13925v1
- Date: Tue, 20 May 2025 04:40:49 GMT
- Title: Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
- Authors: Yunpeng Jiang, Jianshu Hu, Paul Weng, Yutong Ban
- Abstract summary: Time reversal symmetry is a form of temporal symmetry commonly found in robotics tasks such as door opening and closing. We propose Time Reversal symmetry enhanced Deep Reinforcement Learning (TR-DRL), a framework that combines trajectory reversal augmentation and time reversal guided reward shaping. Extensive experiments on the Robosuite and MetaWorld benchmarks demonstrate that TR-DRL is effective in both single-task and multi-task settings.
- Score: 6.461129780249323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symmetry is pervasive in robotics and has been widely exploited to improve sample efficiency in deep reinforcement learning (DRL). However, existing approaches primarily focus on spatial symmetries, such as reflection, rotation, and translation, while largely neglecting temporal symmetries. To address this gap, we explore time reversal symmetry, a form of temporal symmetry commonly found in robotics tasks such as door opening and closing. We propose Time Reversal symmetry enhanced Deep Reinforcement Learning (TR-DRL), a framework that combines trajectory reversal augmentation and time reversal guided reward shaping to efficiently solve temporally symmetric tasks. Our method generates reversed transitions from fully reversible transitions, identified by a proposed dynamics-consistent filter, to augment the training data. For partially reversible transitions, we apply reward shaping to guide learning, according to successful trajectories from the reversed task. Extensive experiments on the Robosuite and MetaWorld benchmarks demonstrate that TR-DRL is effective in both single-task and multi-task settings, achieving higher sample efficiency and stronger final performance compared to baseline methods.
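The trajectory reversal augmentation described in the abstract can be illustrated with a small sketch. The abstract does not say how the dynamics-consistent filter is built, so everything here is an assumption made for illustration: the toy 1-D dynamics `step`, the rule that a reversed action is the negated action, and the check that replaying the reversed transition through the dynamics lands back on the original state. Rewards and the reward-shaping component are left out.

```python
from typing import Callable, List, NamedTuple


class Transition(NamedTuple):
    state: float
    action: float
    next_state: float


def step(state: float, action: float) -> float:
    """Hypothetical toy dynamics: move by `action`, clipped to [0, 1]."""
    return min(1.0, max(0.0, state + action))


def reverse_transition(t: Transition) -> Transition:
    """Swap start and end states; assume the reverse action is -action."""
    return Transition(t.next_state, -t.action, t.state)


def is_dynamics_consistent(
    t: Transition, dynamics: Callable[[float, float], float], tol: float = 1e-6
) -> bool:
    """Accept a transition only if the dynamics reproduce its next state."""
    return abs(dynamics(t.state, t.action) - t.next_state) <= tol


def augment_with_reversals(
    batch: List[Transition],
    dynamics: Callable[[float, float], float],
    tol: float = 1e-6,
) -> List[Transition]:
    """Return the batch plus every reversed transition that passes the filter."""
    augmented = list(batch)
    for t in batch:
        rev = reverse_transition(t)
        if is_dynamics_consistent(rev, dynamics, tol):
            augmented.append(rev)
    return augmented


batch = [
    Transition(0.25, 0.25, step(0.25, 0.25)),    # free motion: reversible
    Transition(0.875, 0.25, step(0.875, 0.25)),  # hits the wall: not reversible
]
augmented = augment_with_reversals(batch, step)
```

In this toy setup the first transition yields a valid reversed copy, while the second is clipped at the boundary, so its reversed version fails the consistency check and is dropped. A learned dynamics model could stand in for `step` where the true dynamics are unknown; that substitution is likewise an assumption, not something the abstract states.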
Related papers
- IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning [25.642307880136332]
Imitation learning (IL) and reinforcement learning (RL) each offer distinct advantages for robotics policy learning. While existing robot learning approaches using IL-based pre-training followed by RL-based fine-tuning are promising, this two-step learning paradigm often suffers from instability and poor sample efficiency during the RL fine-tuning phase. In this work, we introduce IN-RIL, INterleaved Reinforcement learning and Imitation Learning, for policy fine-tuning.
arXiv Detail & Related papers (2025-05-15T16:01:21Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple re-parameterization which we call Normalize-and-Project.
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
- Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales [13.818149654692863]
Reinforcement learning (RL) training is inherently unstable due to factors such as moving targets and high gradient variance.
In this work, we improve the stability of RL training by adapting the reverse cross entropy (RCE) from supervised learning for noisy data to define a symmetric RL loss.
arXiv Detail & Related papers (2024-05-27T19:28:33Z)
- Symmetry Considerations for Learning Task Symmetric Robot Policies [12.856889419651521]
Symmetry is a fundamental aspect of many real-world robotic tasks.
Current deep reinforcement learning (DRL) approaches can seldom harness and exploit symmetry effectively.
arXiv Detail & Related papers (2024-03-07T09:41:11Z)
- Learning Unorthogonalized Matrices for Rotation Estimation [83.94986875750455]
Estimating 3D rotations is a common procedure for 3D computer vision.
One form of representation -- rotation matrices -- is popular due to its continuity.
We propose unorthogonalized 'Pseudo' Rotation Matrices (PRoM).
arXiv Detail & Related papers (2023-12-01T09:56:29Z)
- An Investigation of Time Reversal Symmetry in Reinforcement Learning [18.375784421726287]
We formalize a concept of time reversal symmetry in a Markov decision process (MDP).
We observe that utilizing the structure of time reversal in an MDP allows every environment transition experienced by an agent to be transformed into a feasible reverse-time transition.
To test the usefulness of this newly synthesized data, we develop a novel approach called time symmetric data augmentation (TSDA).
arXiv Detail & Related papers (2023-11-28T18:02:06Z)
- Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization [42.92248233465095]
We propose a simple but effective method, called symmetric replay training (SRT), which can be easily integrated into various deep reinforcement learning (DRL) methods.
Our method leverages high-reward samples to encourage exploration of symmetric regions for free, without additional online interactions.
Experimental results demonstrate the consistent improvement of our method in sample efficiency across diverse DRL methods applied to real-world tasks.
arXiv Detail & Related papers (2023-06-02T05:34:01Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem).
AdaRem adjusts the parameter-wise learning rate according to whether the direction of a parameter's past changes is aligned with the direction of the current gradient.
Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
- Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z)
- Time-Reversal Symmetric ODE Network [138.02741983098454]
Time-reversal symmetry is a fundamental property that frequently holds in classical and quantum mechanics.
We propose a novel loss function that measures how well our ordinary differential equation (ODE) networks comply with this time-reversal symmetry.
We show that, even for systems that do not possess the full time-reversal symmetry, TRS-ODENs can achieve better predictive performances over baselines.
arXiv Detail & Related papers (2020-07-22T12:19:40Z)