CROP: Towards Distributional-Shift Robust Reinforcement Learning using
Compact Reshaped Observation Processing
- URL: http://arxiv.org/abs/2304.13616v2
- Date: Tue, 5 Dec 2023 13:16:23 GMT
- Title: CROP: Towards Distributional-Shift Robust Reinforcement Learning using
Compact Reshaped Observation Processing
- Authors: Philipp Altmann, Fabian Ritz, Leonard Feuchtinger, Jonas Nüßlein, Claudia Linnhoff-Popien, Thomy Phan
- Abstract summary: Current approaches for generalization apply data augmentation techniques to increase the diversity of training data.
Crafting a suitable observation that contains only crucial information has been shown to be a challenging task in itself.
We propose Compact Reshaped Observation Processing (CROP) to reduce the state information used for policy optimization.
- Score: 8.569762036154799
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The safe application of reinforcement learning (RL) requires generalization
from limited training data to unseen scenarios. Yet, fulfilling tasks under
changing circumstances is a key challenge in RL. Current state-of-the-art
approaches for generalization apply data augmentation techniques to increase
the diversity of training data. Even though this prevents overfitting to the
training environment(s), it hinders policy optimization. Crafting a suitable
observation that contains only crucial information has been shown to be a
challenging task in itself. To improve data efficiency and generalization
capabilities, we propose Compact Reshaped Observation Processing (CROP) to
reduce the state information used for policy optimization. By providing only
relevant information, overfitting to a specific training layout is precluded
and generalization to unseen environments is improved. We formulate three CROPs
that can be applied to fully observable observation and action spaces, and we
provide a methodical foundation. We empirically show the improvements of CROP in
a distributionally shifted safety gridworld. We furthermore provide benchmark
comparisons to full observability and data augmentation in two differently sized
procedurally generated mazes.
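
To make the core idea concrete, the following minimal sketch reduces a fully
observable gridworld state to a small egocentric window before it reaches the
policy. The window radius, the padding value, and the crop_observation helper
are illustrative assumptions made for this summary; the paper formulates three
specific CROPs whose exact definitions are not reproduced here.

    import numpy as np

    def crop_observation(grid, agent_pos, radius=2, pad_value=0.0):
        """Return a (2*radius+1) x (2*radius+1) window centred on the agent.

        grid is a fully observable 2D layout; cells outside the map are padded
        so the cropped observation always has a constant shape.
        """
        padded = np.pad(grid, radius, constant_values=pad_value)
        # shift the agent position into padded coordinates
        r, c = agent_pos[0] + radius, agent_pos[1] + radius
        return padded[r - radius:r + radius + 1, c - radius:c + radius + 1]

    # Example: a 7x7 layout reduced to a 5x5 egocentric view
    layout = np.zeros((7, 7))
    layout[3, 5] = 1.0  # hypothetical goal cell
    view = crop_observation(layout, agent_pos=(3, 4), radius=2)
    print(view.shape)   # (5, 5)

A policy trained on such a reduced view cannot condition on layout details far
from the agent, which is how overfitting to a specific training layout is
precluded while the locally relevant information is preserved.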
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method that efficiently fine-tunes pretrained weights while enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain stability in the zero-shot generalization of VLMs; the method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning [10.00979536266327]
We present a novel approach to address the challenge of generalization in offline reinforcement learning (RL).
Specifically, we aim to improve the agent's ability to generalize to out-of-distribution goals.
We learn a new policy offline based on the augmented dataset, with an off-the-shelf offline RL algorithm.
arXiv Detail & Related papers (2023-09-14T10:22:33Z)
- A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Generalization of Reinforcement Learning with Policy-Aware Adversarial Data Augmentation [32.70482982044965]
We propose a novel policy-aware adversarial data augmentation method to augment the standard policy learning method with automatically generated trajectory data.
We conduct experiments on a number of RL tasks to investigate the generalization performance of the proposed method.
The results show our method can generalize well with limited training diversity, and achieve the state-of-the-art generalization test performance.
arXiv Detail & Related papers (2021-06-29T17:21:59Z)
- Generalization in Reinforcement Learning by Soft Data Augmentation [11.752595047069505]
SOft Data Augmentation (SODA) is a method that decouples augmentation from policy learning.
We find SODA to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.
arXiv Detail & Related papers (2020-11-26T17:00:34Z)
- Improving Generalization in Reinforcement Learning with Mixture Regularization [113.12412071717078]
We introduce a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments (a minimal sketch of this mixing step is given after this list).
Mixreg increases the data diversity more effectively and helps learn smoother policies.
Results show mixreg outperforms the well-established baselines on unseen testing environments by a large margin.
arXiv Detail & Related papers (2020-10-21T08:12:03Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information-theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents (a generic sketch of such a regularizer is given after this list).
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
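
For the mixture-regularization entry above, the following hedged sketch shows
the observation-mixing step: observations drawn from two different training
environments are convex-combined with a Beta-distributed coefficient, and the
same coefficient interpolates the associated supervision signal. The Beta
parameter and the choice of mixed target are illustrative assumptions, not
necessarily the cited paper's exact training recipe.

    import numpy as np

    def mixreg_batch(obs_a, obs_b, target_a, target_b, alpha=0.2):
        """Convex-combine observations (and their targets) from two environments."""
        lam = np.random.beta(alpha, alpha)  # mixing coefficient in (0, 1)
        mixed_obs = lam * obs_a + (1.0 - lam) * obs_b
        mixed_target = lam * target_a + (1.0 - lam) * target_b
        return mixed_obs, mixed_target

    # Example with image-like observations and scalar value targets
    obs_a, obs_b = np.random.rand(64, 64, 3), np.random.rand(64, 64, 3)
    mixed_obs, mixed_target = mixreg_batch(obs_a, obs_b, target_a=0.7, target_b=0.1)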
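
For the information-bottleneck entry, the sketch below assumes a Gaussian
stochastic encoder: the task loss is combined with a KL penalty on the latent
code, and the penalty weight beta is annealed over training. This is a common
formulation of such a regularizer, not necessarily the exact objective of the
cited paper.

    import numpy as np

    def ib_regularized_loss(task_loss, mu, log_var, beta):
        """Task loss plus beta * KL(q(z|s) || N(0, I)) for a Gaussian encoder."""
        kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - log_var - 1.0, axis=-1)
        return task_loss + beta * np.mean(kl)

    def annealed_beta(step, warmup_steps=10000, beta_max=1e-3):
        """Linearly increase the bottleneck weight from 0 to beta_max."""
        return beta_max * min(1.0, step / warmup_steps)

    # Example: 32 latent codes of dimension 8
    mu, log_var = np.random.randn(32, 8), np.random.randn(32, 8)
    loss = ib_regularized_loss(task_loss=0.5, mu=mu, log_var=log_var,
                               beta=annealed_beta(step=2000))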