Robust Sampling in Deep Learning
- URL: http://arxiv.org/abs/2006.02734v2
- Date: Fri, 5 Jun 2020 10:37:37 GMT
- Title: Robust Sampling in Deep Learning
- Authors: Aurora Cobo Aguilera, Antonio Artés-Rodríguez, Fernando Pérez-Cruz, Pablo Martínez Olmos
- Abstract summary: Deep learning requires regularization mechanisms to reduce overfitting and improve generalization.
We address this problem with a new regularization method based on distributionally robust optimization.
During training, samples are selected according to their accuracy, so that the worst-performing samples contribute the most to the optimization.
- Score: 62.997667081978825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning requires regularization mechanisms to reduce overfitting and
improve generalization. We address this problem with a new regularization method
based on distributionally robust optimization. The key idea is to modify the
contribution of each sample so as to tighten the empirical risk bound. During
stochastic training, samples are selected according to their accuracy, so that the
worst-performing samples contribute the most to the optimization. We study different
scenarios and show in which of them the method speeds up convergence or increases accuracy.
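As a rough sketch of this sampling rule (not the authors' exact procedure), the snippet below forms the mini-batch loss from only the worst-performing fraction of samples; the fraction keep_frac and the model and data names are placeholders.

```python
import torch
import torch.nn.functional as F

def worst_case_batch_loss(model, x, y, keep_frac=0.5):
    """Backpropagate only through the worst-performing fraction of the mini-batch.

    keep_frac=1.0 recovers the ordinary average loss; smaller values focus the
    update on the hardest samples, mimicking a distributionally robust objective.
    """
    per_sample = F.cross_entropy(model(x), y, reduction="none")
    k = max(1, int(keep_frac * per_sample.numel()))
    worst, _ = torch.topk(per_sample, k)   # largest losses = worst-performing samples
    return worst.mean()

# usage inside a standard training step (model, optimizer, batches are placeholders):
# loss = worst_case_batch_loss(model, x_batch, y_batch, keep_frac=0.5)
# loss.backward(); optimizer.step()
```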
Related papers
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training
Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
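For a concrete, if simplified, picture of instance reweighting by difficulty, the sketch below assumes a KL-regularized inner problem whose solution weights each example by a softmax of its loss; the temperature tau is illustrative and this is not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def reweighted_loss(logits, targets, tau=1.0):
    """Weight each example by a softmax over per-sample losses (KL-regularized DRO sketch).

    Harder examples (larger loss) receive larger weights; large tau recovers the
    usual uniform average, small tau concentrates on the single worst sample.
    """
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.softmax(per_sample.detach() / tau, dim=0)  # no gradient through the weights
    return (weights * per_sample).sum()
```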
- Boosting Fair Classifier Generalization through Adaptive Priority Reweighing [59.801444556074394]
A fair algorithm with promising performance and better generalizability is needed.
This paper proposes a novel adaptive reweighing method to eliminate the impact of the distribution shifts between training and test data on model generalizability.
arXiv Detail & Related papers (2023-09-15T13:04:55Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
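For background, the likelihood-ratio view of DRO that this line of work builds on can be written as a min-max problem over reweightings of the training distribution (a generic form, not the paper's specific parametrization):

$$
\min_{\theta} \; \max_{r \in \mathcal{R}} \; \mathbb{E}_{(x,y)\sim p}\big[\, r(x,y)\, \ell(\theta; x, y) \,\big]
\quad \text{s.t.} \quad \mathbb{E}_{p}[r] = 1, \; r \ge 0,
$$

where $r$ is the likelihood ratio between the adversarial distribution and the training distribution $p$, and $\mathcal{R}$ encodes the uncertainty set.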
- Unified Regularity Measures for Sample-wise Learning and Generalization [18.10522585996242]
We propose a pair of sample regularity measures for both processes (learning and generalization) with a formulation-consistent representation.
Experiments validated the effectiveness and robustness of the proposed approaches for mini-batch SGD optimization.
arXiv Detail & Related papers (2021-08-09T10:11:14Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data
to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing the distance between embeddings/representations of original samples and their augmented counterparts is a popular technique for improving the robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
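A minimal sketch of the squared $\ell_2$ consistency idea, assuming a model that exposes separate embed and head functions (both placeholders, as is the weight lam):

```python
import torch
import torch.nn.functional as F

def l2_consistency_loss(model, x, x_aug, y, lam=1.0):
    """Task loss plus squared L2 distance between embeddings of a sample and its augmentation."""
    z, z_aug = model.embed(x), model.embed(x_aug)        # hypothetical embedding function
    task = F.cross_entropy(model.head(z), y)             # hypothetical classification head
    consistency = (z - z_aug).pow(2).sum(dim=1).mean()   # squared l2 norm, averaged over the batch
    return task + lam * consistency
```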
- Optimal Importance Sampling for Federated Learning [57.14673504239551]
Federated learning involves a mixture of centralized and decentralized processing tasks.
The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling.
We derive optimal importance sampling strategies for both agent and data selection and show that non-uniform sampling without replacement improves the performance of the original FedAvg algorithm.
arXiv Detail & Related papers (2020-10-26T14:15:33Z)
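As a simplified illustration of non-uniform client selection with an unbiasedness correction (the importance scores, with-replacement sampling, and 1/(k p_i) weights are assumptions; the paper analyses the without-replacement case):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_clients(scores, k):
    """Pick k clients with probability proportional to an importance score.

    Sampling is with replacement here so that rescaling each picked client's
    update by 1 / (k * p_i) keeps the aggregated update unbiased.
    """
    p = scores / scores.sum()
    idx = rng.choice(len(p), size=k, replace=True, p=p)
    return idx, 1.0 / (k * p[idx])
```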
- Effective Proximal Methods for Non-convex Non-smooth Regularized Learning [27.775096437736973]
We show that the independent sampling scheme tends to improve performance over the commonly-used uniform sampling scheme.
Our new analysis also derives a faster convergence speed for this sampling scheme than the best one available so far.
arXiv Detail & Related papers (2020-09-14T16:41:32Z)
- Variance Regularization for Accelerating Stochastic Optimization [14.545770519120898]
We propose a universal principle which reduces the accumulation of random error by exploiting statistical information hidden in mini-batch gradients.
This is achieved by regularizing the learning rate according to mini-batch variances.
arXiv Detail & Related papers (2020-08-13T15:34:01Z)
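A very loose illustration of scaling the learning rate by a mini-batch variance estimate; the two-half split, the constant c, and the scaling rule are assumptions rather than the paper's scheme:

```python
import torch

def variance_scaled_lr(model, loss_fn, half_a, half_b, base_lr, c=1.0):
    """Shrink the step size when gradients from two halves of a mini-batch disagree."""
    grads = []
    for xb, yb in (half_a, half_b):
        model.zero_grad()
        loss_fn(model(xb), yb).backward()
        grads.append(torch.cat([p.grad.flatten()
                                for p in model.parameters() if p.grad is not None]))
    var_est = (grads[0] - grads[1]).pow(2).sum().item()  # crude gradient-variance proxy
    return base_lr / (1.0 + c * var_est)                 # larger variance -> smaller step
```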
- Incremental Without Replacement Sampling in Nonconvex Optimization [0.0]
Minibatch decomposition methods for empirical risk minimization are commonly analysed in a stochastic approximation setting, also known as sampling with replacement.
On the other hand, modern implementations of such techniques are incremental: they rely on sampling without replacement, for which the available analyses are much scarcer.
We provide convergence guarantees for the latter variant by analysing a versatile incremental gradient scheme.
arXiv Detail & Related papers (2020-07-15T09:17:29Z)
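To make the with/without-replacement distinction concrete, the generic sketch below contrasts the two mini-batch samplers; it is not the incremental scheme analysed in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def with_replacement_batches(n, batch_size, steps):
    """Classic stochastic approximation: each batch is drawn i.i.d. with replacement."""
    for _ in range(steps):
        yield rng.integers(0, n, size=batch_size)

def without_replacement_batches(n, batch_size):
    """Incremental / random-reshuffling pass: every sample appears exactly once per epoch."""
    order = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield order[start:start + batch_size]
```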