Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation
- URL: http://arxiv.org/abs/2401.09258v2
- Date: Fri, 21 Mar 2025 08:19:55 GMT
- Title: Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation
- Authors: Yinuo Zhao, Kun Wu, Tianjiao Yi, Zhiyuan Xu, Xiaozhu Ju, Zhengping Che, Chi Harold Liu, Jian Tang,
- Abstract summary: Existing data augmentation methods disrupt task-relevant information in images, potentially degrading performance.<n>We introduce an efficient training framework for generalizable visuomotor policies that improves upon existing methods.<n>We conduct experiments on three domains, including the DMControl Generalization Benchmark, the enhanced Robot Manipulation Distraction Benchmark, and a long-sequential drawer-opening task.
- Score: 45.62430292728745
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving generalization is one key challenge in embodied AI, where obtaining large-scale datasets across diverse scenarios is costly. Traditional weak augmentations, such as cropping and flipping, are insufficient for improving a model's performance in new environments. Existing data augmentation methods often disrupt task-relevant information in images, potentially degrading performance. To overcome these challenges, we introduce EAGLE, an efficient training framework for generalizable visuomotor policies that improves upon existing methods by (1) enhancing generalization by applying augmentation only to control-related regions identified through a self-supervised control-aware mask and (2) improving training stability and efficiency by distilling knowledge from an expert to a visuomotor student policy, which is then deployed to unseen environments without further fine-tuning. Comprehensive experiments on three domains, including the DMControl Generalization Benchmark, the enhanced Robot Manipulation Distraction Benchmark, and a long-sequential drawer-opening task, validate the effectiveness of our method.
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator [14.77553682217217]
We introduce a Contrastive Spectral Koopman Embedding network that allows us to learn efficient linearized visual representations from the agent's visual data in a high dimensional latent space.
Our method enhances stability and control in gradient dynamics over time, significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-09-04T22:14:59Z) - Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability in terms of zero-shot generalization of VLMs, dubbed OrthSR.
For the first time, we revisit the CLIP and CoOp with our method to effectively improve the model on few-shot image classficiation scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction [4.813546138483559]
Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data.
In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods.
We propose a novel method, Disentangled Environment and Agent Representations, that uses the segmentation mask of the agent as supervision.
arXiv Detail & Related papers (2024-06-30T09:15:21Z) - Foundation Policies with Hilbert Representations [54.44869979017766]
We propose an unsupervised framework to pre-train generalist policies from unlabeled offline data.
Our key insight is to learn a structured representation that preserves the temporal structure of the underlying environment.
Our experiments show that our unsupervised policies can solve goal-conditioned and general RL tasks in a zero-shot fashion.
arXiv Detail & Related papers (2024-02-23T19:09:10Z) - Hierarchical Transformers are Efficient Meta-Reinforcement Learners [19.79721574250755]
We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a powerful online meta-reinforcement learning approach.
We demonstrate how past episodes serve as a rich source of information, which our model effectively distills and applies to new contexts.
arXiv Detail & Related papers (2024-02-09T13:40:11Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Normalization Enhances Generalization in Visual Reinforcement Learning [20.04754884180226]
normalization techniques have demonstrated huge success in supervised and unsupervised learning.
We find that incorporating suitable normalization techniques is sufficient to enhance the generalization capabilities.
Our method significantly improves generalization capability while only marginally affecting sample efficiency.
arXiv Detail & Related papers (2023-06-01T13:24:56Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER)
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-Word RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for
Visual Reinforcement Learning [27.205521177841568]
We propose Task-aware Lipschitz Data Augmentation (TLDA) for visual Reinforcement Learning (RL)
TLDA explicitly identifies the task-correlated pixels with large Lipschitz constants, and only augments the task-irrelevant pixels.
It outperforms previous state-of-the-art methods across the 3 different visual control benchmarks.
arXiv Detail & Related papers (2022-02-21T04:22:07Z) - Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under
Data Augmentation [25.493902939111265]
We investigate causes of instability when using data augmentation in off-policy Reinforcement Learning algorithms.
We propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.
Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
arXiv Detail & Related papers (2021-07-01T17:58:05Z) - Generalization of Reinforcement Learning with Policy-Aware Adversarial
Data Augmentation [32.70482982044965]
We propose a novel policy-aware adversarial data augmentation method to augment the standard policy learning method with automatically generated trajectory data.
We conduct experiments on a number of RL tasks to investigate the generalization performance of the proposed method.
The results show our method can generalize well with limited training diversity, and achieve the state-of-the-art generalization test performance.
arXiv Detail & Related papers (2021-06-29T17:21:59Z) - SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual
Policies [87.78260740602674]
Generalization has been a long-standing challenge for reinforcement learning (RL)
In this work, we consider robust policy learning which targets zero-shot generalization to unseen visual environments with large distributional shift.
We propose SECANT, a novel self-expert cloning technique that leverages image augmentation in two stages to decouple robust representation learning from policy optimization.
arXiv Detail & Related papers (2021-06-17T17:28:18Z) - Residual Reinforcement Learning from Demonstrations [51.56457466788513]
Residual reinforcement learning (RL) has been proposed as a way to solve challenging robotic tasks by adapting control actions from a conventional feedback controller to maximize a reward signal.
We extend the residual formulation to learn from visual inputs and sparse rewards using demonstrations.
Our experimental evaluation on simulated manipulation tasks on a 6-DoF UR5 arm and a 28-DoF dexterous hand demonstrates that residual RL from demonstrations is able to generalize to unseen environment conditions more flexibly than either behavioral cloning or RL fine-tuning.
arXiv Detail & Related papers (2021-06-15T11:16:49Z) - Robust Deep Reinforcement Learning via Multi-View Information Bottleneck [7.188571996124112]
We introduce an auxiliary objective based on the multi-view information bottleneck (MIB) principle.
This encourages learning representations that are both predictive of the future and less sensitive to task-irrelevant distractions.
We demonstrate that our approach can achieve SOTA performance on challenging visual control tasks, even when the background is replaced with natural videos.
arXiv Detail & Related papers (2021-02-26T02:24:36Z) - Generalization in Reinforcement Learning by Soft Data Augmentation [11.752595047069505]
SOft Data Augmentation (SODA) is a method that decouples augmentation from policy learning.
We find SODA to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.
arXiv Detail & Related papers (2020-11-26T17:00:34Z) - Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.