Improving Generalization in Reinforcement Learning with Mixture
Regularization
- URL: http://arxiv.org/abs/2010.10814v1
- Date: Wed, 21 Oct 2020 08:12:03 GMT
- Title: Improving Generalization in Reinforcement Learning with Mixture
Regularization
- Authors: Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng
- Abstract summary: We introduce a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments.
Mixreg increases the data diversity more effectively and helps learn smoother policies.
Results show mixreg outperforms the well-established baselines on unseen testing environments by a large margin.
- Score: 113.12412071717078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (RL) agents trained in a limited set of
environments tend to suffer overfitting and fail to generalize to unseen
testing environments. To improve their generalizability, data augmentation
approaches (e.g. cutout and random convolution) have previously been explored to
increase the data diversity. However, we find these approaches only locally
perturb the observations regardless of the training environments, showing
limited effectiveness in enhancing the data diversity and the generalization
performance. In this work, we introduce a simple approach, named mixreg, which
trains agents on a mixture of observations from different training environments
and imposes linearity constraints on the observation interpolations and the
supervision (e.g. associated reward) interpolations. Mixreg increases the data
diversity more effectively and helps learn smoother policies. We verify its
effectiveness in improving generalization by conducting extensive experiments
on the large-scale Procgen benchmark. Results show mixreg outperforms the
well-established baselines on unseen testing environments by a large margin.
Mixreg is simple, effective and general. It can be applied to both policy-based
and value-based RL algorithms. Code is available at
https://github.com/kaixin96/mixreg .
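To make the mixing step concrete, below is a minimal NumPy sketch of the interpolation described in the abstract. It is an illustrative reconstruction, not the official implementation (see the linked repository): the function name `mixreg_batch` and the Beta parameter `alpha` are assumptions, and the supervision targets to be mixed (returns, values, or advantages) depend on the underlying RL algorithm.

```python
import numpy as np

def mixreg_batch(obs, targets, alpha=0.2, rng=np.random):
    """Mix observations collected from different training environments and
    the corresponding supervision signals (e.g. returns or advantages).

    obs:     (B, ...) array of observations from a batch spanning environments
    targets: (B,) or (B, ...) array of supervision signals aligned with obs
    alpha:   Beta-distribution parameter controlling interpolation strength
    """
    batch_size = obs.shape[0]
    # One interpolation coefficient per mixed sample.
    lam = rng.beta(alpha, alpha, size=batch_size)
    # Random pairing: sample i is mixed with a randomly permuted partner j.
    idx = rng.permutation(batch_size)

    lam_obs = lam.reshape((batch_size,) + (1,) * (obs.ndim - 1))
    mixed_obs = lam_obs * obs + (1.0 - lam_obs) * obs[idx]

    lam_tgt = lam.reshape((batch_size,) + (1,) * (targets.ndim - 1))
    mixed_targets = lam_tgt * targets + (1.0 - lam_tgt) * targets[idx]
    return mixed_obs, mixed_targets
```

The agent is then trained on the mixed batch in place of the original one, which imposes the stated linearity constraint between interpolated observations and interpolated supervision.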
Related papers
- Equivariant Data Augmentation for Generalization in Offline
Reinforcement Learning [10.00979536266327]
We present a novel approach to address the challenge of generalization in offline reinforcement learning (RL).
Specifically, we aim to improve the agent's ability to generalize to out-of-distribution goals.
We learn a new policy offline based on the augmented dataset, with an off-the-shelf offline RL algorithm.
arXiv Detail & Related papers (2023-09-14T10:22:33Z) - Expeditious Saliency-guided Mix-up through Random Gradient Thresholding [89.59134648542042]
Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks.
In this paper, inspired by the respective strengths of randomness-based and saliency-guided mix-up, we introduce a novel method that lies at the junction of the two routes.
We name our method R-Mix, following the concept of "Random Mix-up".
In order to address the question of whether there exists a better decision protocol, we train a Reinforcement Learning agent that decides the mix-up policies.
arXiv Detail & Related papers (2022-12-09T14:29:57Z) - FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
Domain generalization (DG) aims to learn a generalizable model from multiple training domains such that it can perform well on unseen target domains.
A popular strategy is to augment training data to benefit generalization through methods such as Mixup [Zhang et al., 2018].
We propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX).
Our approach significantly outperforms nine state-of-the-art related methods, beating the best performing baseline by 6.5% on average in terms of test accuracy.
arXiv Detail & Related papers (2022-11-07T09:38:34Z) - C-Mixup: Improving Generalization in Regression [71.10418219781575]
The mixup algorithm improves generalization by linearly interpolating a pair of examples and their corresponding labels.
We propose C-Mixup, which adjusts the sampling probability based on the similarity of the labels.
C-Mixup achieves 6.56%, 4.76%, 5.82% improvements in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively.
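As a rough illustration of the sampling idea (not the authors' code), the sketch below draws a mixing partner with probability decaying in label distance via a Gaussian kernel and then applies standard mixup; the kernel choice, the bandwidth `sigma`, and the Beta parameter `alpha` are assumptions for a scalar-label regression setting.

```python
import numpy as np

def similarity_weighted_mixup(X, y, i, sigma=1.0, alpha=2.0, rng=np.random):
    """For anchor example i, sample a partner j with probability that decays
    with label distance, then interpolate inputs and labels as in mixup.
    y is assumed to hold scalar regression labels."""
    dist2 = (y - y[i]) ** 2
    probs = np.exp(-dist2 / (2.0 * sigma ** 2))   # Gaussian kernel on labels
    probs[i] = 0.0                                # never mix i with itself
    probs /= probs.sum()
    j = rng.choice(len(y), p=probs)

    lam = rng.beta(alpha, alpha)
    x_mix = lam * X[i] + (1.0 - lam) * X[j]
    y_mix = lam * y[i] + (1.0 - lam) * y[j]
    return x_mix, y_mix
```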
arXiv Detail & Related papers (2022-10-11T20:39:38Z) - Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer, named Decoupled Mixup (DM).
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z) - Implicit Gradient Alignment in Distributed and Federated Learning [39.61762498388211]
A major obstacle to achieving global convergence in distributed and federated learning is the misalignment of gradients across clients.
We propose a novel GradAlign algorithm that induces the same implicit gradient-alignment regularization while allowing the use of arbitrarily large batches in each update.
arXiv Detail & Related papers (2021-06-25T22:01:35Z) - MixRL: Data Mixing Augmentation for Regression using Reinforcement
Learning [2.1345682889327837]
Existing techniques for data augmentation largely focus on classification tasks and do not readily apply to regression tasks.
We show that mixing examples with either a large data distance or a large label distance can have an increasingly negative effect on model performance.
We propose MixRL, a data augmentation meta learning framework for regression that learns for each example how many nearest neighbors it should be mixed with for the best model performance.
arXiv Detail & Related papers (2021-06-07T07:01:39Z) - k-Mixup Regularization for Deep Learning via Optimal Transport [32.951696405505686]
Mixup is a popular regularization technique for training deep neural networks.
We extend mixup in a simple, broadly applicable way to $k$-mixup, which perturbs $k$-batches of training points in the direction of other $k$-batches.
We show that training with $k$-mixup further improves generalization and robustness across several network architectures.
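A minimal sketch of the idea follows, assuming the optimal-transport matching between two equal-size $k$-batches is computed exactly with the Hungarian algorithm (via SciPy); it is an illustration, not the authors' implementation, and the function name `k_mixup` is assumed.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_mixup(x1, y1, x2, y2, alpha=1.0, rng=np.random):
    """Mix two k-batches along an optimal-transport matching: pair points
    across batches by minimizing total squared distance, then linearly
    interpolate matched inputs and labels with a shared coefficient."""
    a = x1.reshape(len(x1), -1)
    b = x2.reshape(len(x2), -1)
    # Pairwise squared Euclidean costs between the flattened points.
    cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    row, col = linear_sum_assignment(cost)   # exact matching for equal k

    lam = rng.beta(alpha, alpha)
    x_mix = lam * x1[row] + (1.0 - lam) * x2[col]
    y_mix = lam * y1[row] + (1.0 - lam) * y2[col]
    return x_mix, y_mix
```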
arXiv Detail & Related papers (2021-06-05T17:08:08Z) - Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity [15.780905917870427]
We propose a new perspective on batch mixup and formulate the optimal construction of a batch of mixup data.
We also propose an iterative submodular computation algorithm based on a modular approximation for efficient mixup in each minibatch.
Our experiments show the proposed method achieves state-of-the-art generalization, calibration, and weakly supervised localization results.
arXiv Detail & Related papers (2021-02-05T09:12:02Z) - Unshuffling Data for Improved Generalization [65.57124325257409]
Generalization beyond the training distribution is a core challenge in machine learning.
We show that partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple training environments can guide the learning of models with better out-of-distribution generalization.
arXiv Detail & Related papers (2020-02-27T03:07:41Z)