Normalization Enhances Generalization in Visual Reinforcement Learning
- URL: http://arxiv.org/abs/2306.00656v1
- Date: Thu, 1 Jun 2023 13:24:56 GMT
- Title: Normalization Enhances Generalization in Visual Reinforcement Learning
- Authors: Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li,
Zhiheng Li
- Abstract summary: Normalization techniques have demonstrated huge success in supervised and unsupervised learning.
We find that incorporating suitable normalization techniques is sufficient to enhance the generalization capabilities.
Our method significantly improves generalization capability while only marginally affecting sample efficiency.
- Score: 20.04754884180226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in visual reinforcement learning (RL) have led to impressive
success in handling complex tasks. However, these methods have demonstrated
limited generalization capability to visual disturbances, which poses a
significant challenge for their real-world application and adaptability. Though
normalization techniques have demonstrated huge success in supervised and
unsupervised learning, their applications in visual RL are still scarce. In
this paper, we explore the potential benefits of integrating normalization into
visual RL methods with respect to generalization performance. We find that,
perhaps surprisingly, incorporating suitable normalization techniques is
sufficient to enhance the generalization capabilities, without any additional
special design. We utilize the combination of two normalization techniques,
CrossNorm and SelfNorm, for generalizable visual RL. Extensive experiments are
conducted on DMControl Generalization Benchmark and CARLA to validate the
effectiveness of our method. We show that our method significantly improves
generalization capability while only marginally affecting sample efficiency. In
particular, when integrated with DrQ-v2, our method enhances the test
performance of DrQ-v2 on CARLA across various scenarios, from 14% of the
training performance to 97%.
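The abstract credits the gains to combining two existing normalization layers, CrossNorm and SelfNorm. As a rough, non-authoritative PyTorch sketch of what those layers do (CrossNorm exchanges channel-wise mean and std between instances at training time; SelfNorm recalibrates an instance's own statistics with small attention functions), the snippet below illustrates both. Module names, attention sizes, and where the layers sit in the encoder are assumptions here, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


def channel_stats(x, eps=1e-5):
    """Per-instance, per-channel mean and std of an NCHW feature map."""
    mean = x.mean(dim=(2, 3), keepdim=True)
    std = x.var(dim=(2, 3), keepdim=True).add(eps).sqrt()
    return mean, std


class CrossNorm(nn.Module):
    """Training-time augmentation: exchange channel-wise mean/std between
    random pairs of instances in the batch to enlarge the style distribution."""

    def forward(self, x):
        if not self.training or x.size(0) < 2:
            return x
        mean, std = channel_stats(x)
        perm = torch.randperm(x.size(0), device=x.device)
        # Re-style each instance with another instance's statistics.
        return (x - mean) / std * std[perm] + mean[perm]


class SelfNorm(nn.Module):
    """Recalibrate each instance's own channel statistics with tiny
    attention functions of (mean, std); sizes here are illustrative."""

    def __init__(self):
        super().__init__()
        # One small FC net per recalibrated statistic, shared across channels.
        self.f = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
        self.g = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())

    def forward(self, x):
        mean, std = channel_stats(x)                          # (N, C, 1, 1) each
        stats = torch.stack([mean, std], dim=-1).view(-1, 2)  # (N*C, 2)
        new_mean = mean * self.f(stats).view_as(mean)         # attention-weighted mean
        new_std = std * self.g(stats).view_as(std)            # attention-weighted std
        return (x - mean) / std * new_std + new_mean


# Example: wrapping one conv block of a visual-RL image encoder with both layers.
# The placement inside the encoder is an assumption, not the paper's exact design.
block = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
    CrossNorm(), SelfNorm(),
)
obs = torch.randn(8, 3, 84, 84)    # a batch of image observations
features = block(obs)
```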
Related papers
- A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning [12.889687274108248]
Q-learning algorithms are prone to overfitting and training instabilities when trained from visual observations.
We propose a generalized recipe, SADA, that works with wider varieties of augmentations.
We find that our method, SADA, greatly improves training stability and generalization of RL agents across a diverse set of augmentations.
arXiv Detail & Related papers (2024-05-27T17:58:23Z) - IMEX-Reg: Implicit-Explicit Regularization in the Function Space for Continual Learning [17.236861687708096]
Continual learning (CL) remains one of the long-standing challenges for deep neural networks due to catastrophic forgetting of previously acquired knowledge.
Inspired by how humans learn using strong inductive biases, we propose IMEX-Reg to improve the generalization performance of experience rehearsal in CL under low buffer regimes.
arXiv Detail & Related papers (2024-04-28T12:25:09Z) - Sample Efficient Myopic Exploration Through Multitask Reinforcement
Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of multitask reinforcement learning (MTRL).
arXiv Detail & Related papers (2024-03-03T22:57:44Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks (a minimal sketch of this validation check appears after this list).
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Local Feature Swapping for Generalization in Reinforcement Learning [0.0]
We introduce a new regularization technique consisting of channel-consistent local permutations (CLOP) of the feature maps (a toy instance is sketched after this list).
The proposed permutations induce robustness to spatial correlations and help prevent overfitting behaviors in reinforcement learning.
We demonstrate, on the OpenAI Procgen Benchmark, that RL agents trained with the CLOP method exhibit robustness to visual changes and better generalization properties.
arXiv Detail & Related papers (2022-04-13T13:12:51Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit
Partial Observability [92.95794652625496]
Generalization is a central challenge for the deployment of reinforcement learning systems.
We show that generalization to unseen test conditions from a limited number of training conditions induces implicit partial observability.
We recast the problem of generalization in RL as solving the induced partially observed Markov decision process.
arXiv Detail & Related papers (2021-07-13T17:59:25Z) - Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under
Data Augmentation [25.493902939111265]
We investigate causes of instability when using data augmentation in off-policy Reinforcement Learning algorithms.
We propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.
Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
arXiv Detail & Related papers (2021-07-01T17:58:05Z) - How to Make Deep RL Work in Practice [15.740760669623876]
Reported results of state-of-the-art algorithms are often difficult to reproduce.
We make suggestions about which techniques to use by default and highlight areas that could benefit from solutions specifically tailored to RL.
arXiv Detail & Related papers (2020-10-25T10:37:54Z) - Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy iteration algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z) - Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)