Learning Augmentation Distributions using Transformed Risk Minimization
- URL: http://arxiv.org/abs/2111.08190v2
- Date: Thu, 5 Oct 2023 22:18:23 GMT
- Title: Learning Augmentation Distributions using Transformed Risk Minimization
- Authors: Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Kostas Daniilidis,
Edgar Dobriban
- Abstract summary: We propose a new \emph{Transformed Risk Minimization} (TRM) framework as an extension of classical risk minimization.
As a key application, we focus on learning augmentations to improve classification performance with a given class of predictors.
- Score: 47.236227685707526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new \emph{Transformed Risk Minimization} (TRM) framework as an
extension of classical risk minimization. In TRM, we optimize not only over
predictive models, but also over data transformations; specifically over
distributions thereof. As a key application, we focus on learning
augmentations; for instance appropriate rotations of images, to improve
classification performance with a given class of predictors. Our TRM method (1)
jointly learns transformations and models in a \emph{single training loop}, (2)
works with any training algorithm applicable to standard risk minimization, and
(3) handles any transforms, such as discrete and continuous classes of
augmentations. To avoid overfitting when implementing empirical transformed
risk minimization, we propose a novel regularizer based on PAC-Bayes theory.
For learning augmentations of images, we propose a new parametrization of the
space of augmentations via a stochastic composition of blocks of geometric
transforms. This leads to the new \emph{Stochastic Compositional Augmentation
Learning} (SCALE) algorithm. The performance of TRM with SCALE compares
favorably to prior methods on CIFAR10/100. Additionally, we show empirically
that SCALE can correctly learn certain symmetries in the data distribution
(recovering rotations on rotated MNIST) and can also improve calibration of the
learned model.
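As an illustrative toy of the idea of learning an augmentation distribution inside the training loop (our setup, not the paper's SCALE parametrization: a Gaussian over a single rotation angle with a fixed linear predictor stands in for the stochastic composition of geometric blocks), the reparametrization trick lets gradients from the loss flow into the distribution's mean and scale, loosely mirroring the "recovering rotations" experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def rotate(x, theta):
    """Rotate 2-D points by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return x @ np.array([[c, -s], [s, c]]).T

# Toy data: labels are the first coordinate after a pi/4 rotation,
# so the "correct" augmentation for the fixed predictor x -> x[:, 0]
# is a rotation by pi/4.
x = rng.standard_normal((256, 2))
y = rotate(x, np.pi / 4)[:, 0]

mu, log_sigma = 0.0, np.log(0.3)  # augmentation distribution N(mu, sigma^2)
lr = 0.05
for _ in range(500):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal()
    theta = mu + sigma * eps                      # reparametrized sample
    resid = rotate(x, theta)[:, 0] - y            # loss = mean(resid**2)
    # d/dtheta of rotate(x, theta)[:, 0] is rotate(x, theta + pi/2)[:, 0].
    dpred_dtheta = rotate(x, theta + np.pi / 2)[:, 0]
    dloss_dtheta = np.mean(2 * resid * dpred_dtheta)
    mu -= lr * dloss_dtheta                       # pathwise gradient wrt mu
    log_sigma -= lr * dloss_dtheta * sigma * eps  # ... and wrt log_sigma
```

Under this toy setup, `mu` settles near the true angle pi/4 and `sigma` shrinks as the distribution concentrates; the full method additionally learns the predictor and applies a PAC-Bayes regularizer to the distribution.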
Related papers
- A Model-Based Method for Minimizing CVaR and Beyond [7.751691910877239]
We develop a variant of the prox-linear method for minimizing the Conditional Value-at-Risk (CVaR) objective.
CVaR is a risk measure focused on minimizing worst-case performance, defined as the average of the top quantile of the losses.
In machine learning, such a risk measure is useful to train more robust models.
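The "average of the top quantile of the losses" definition can be computed directly; a minimal sketch of the discrete tail-averaging form (naming is ours; the fractional-tail correction for non-integer `alpha * n` is ignored):

```python
import numpy as np

def cvar(losses, alpha):
    """CVaR_alpha: mean of the worst (largest) alpha-fraction of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending
    k = max(1, int(np.ceil(alpha * len(losses))))            # tail size
    return losses[:k].mean()
```

With `alpha = 1` this recovers the ordinary mean (standard ERM); smaller `alpha` focuses the objective on the hardest examples, which is why minimizing CVaR trains more robust models.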
arXiv Detail & Related papers (2023-05-27T15:38:53Z)
- Learning Optimal Features via Partial Invariance [18.552839725370383]
Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments.
We show that IRM can over-constrain the predictor; to remedy this, we propose a relaxation via $\textit{partial invariance}$.
Several experiments, conducted both in linear settings as well as with deep neural networks on tasks over both language and image data, allow us to verify our conclusions.
arXiv Detail & Related papers (2023-01-28T02:48:14Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
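The variance reduction from perturbing activations rather than weights can be seen already in a single linear layer; a hedged numerical sketch (our toy setup, not the paper's local-loss architecture) comparing the two forward-gradient estimators:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear layer y = W x with squared loss 0.5 * ||y - t||^2. A forward
# gradient is (directional derivative) * (random direction); it is unbiased
# whether the direction lives in weight space (dim m*n) or activation
# space (dim m), but the activation version has far fewer random dims.
m, n, trials = 8, 20, 2000
W = rng.standard_normal((m, n))
x = rng.standard_normal(n)
t = rng.standard_normal(m)

y = W @ x
g_y = y - t            # exact gradient wrt activations
G = np.outer(g_y, x)   # exact gradient wrt weights

w_est = np.empty((trials, m, n))
a_est = np.empty((trials, m, n))
for i in range(trials):
    V = rng.standard_normal((m, n))         # weight-space direction
    w_est[i] = np.sum(G * V) * V            # weight-perturbation estimate
    v = rng.standard_normal(m)              # activation-space direction
    a_est[i] = np.outer((g_y @ v) * v, x)   # activation estimate, chained to W

var_w = w_est.var(axis=0).mean()
var_a = a_est.var(axis=0).mean()
```

In this toy, `var_a` comes out far below `var_w`, consistent with the scaling intuition that variance grows with the dimension of the perturbed space.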
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
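For a concrete point of reference, one simple non-parametric instance of distributionally robust reweighting (not the paper's parametric adversary) is the closed-form worst case over a KL ball, where each example is upweighted exponentially in its loss; a minimal sketch with our naming (`tau` is a temperature controlling the ball's size):

```python
import numpy as np

def kl_dro_loss(losses, tau):
    """Worst-case KL-ball reweighting: weight w_i proportional to exp(loss_i / tau)."""
    losses = np.asarray(losses, dtype=float)
    z = losses / tau
    w = np.exp(z - z.max())   # numerically stable exponential weights
    w /= w.sum()
    return float(w @ losses)
```

Small `tau` approaches the single worst example; large `tau` recovers the ordinary mean loss.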
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- On the Minimal Error of Empirical Risk Minimization [90.09093901700754]
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data.
arXiv Detail & Related papers (2021-02-24T04:47:55Z)
- Model Adaptation for Image Reconstruction using Generalized Stein's Unbiased Risk Estimator [34.08815401541628]
We introduce a Generalized Stein's Unbiased Risk Estimate (GSURE) loss metric to adapt the network to the measured k-space data.
Unlike current methods that rely on the mean squared error in k-space, the proposed metric accounts for noise in the measurements.
arXiv Detail & Related papers (2021-01-29T20:16:45Z)
- Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training [2.9649783577150837]
We study the effect of mini-batching on the loss landscape of deep neural networks using spiked, field-dependent random matrix theory.
We derive analytical expressions for the maximal descent and adaptive training regimens for smooth, non-convex deep neural networks.
We validate our claims using VGG/ResNet architectures and the ImageNet dataset.
arXiv Detail & Related papers (2020-06-16T11:55:45Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.