Semantic Perturbations with Normalizing Flows for Improved
Generalization
- URL: http://arxiv.org/abs/2108.07958v1
- Date: Wed, 18 Aug 2021 03:20:00 GMT
- Title: Semantic Perturbations with Normalizing Flows for Improved
Generalization
- Authors: Oguz Kaan Yuksel, Sebastian U. Stich, Martin Jaggi, Tatjana Chavdarova
- Abstract summary: We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations that adapt to the classifier throughout its training are the most effective.
- Score: 62.998818375912506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is a widely adopted technique for avoiding overfitting when
training deep neural networks. However, this approach requires domain-specific
knowledge and is often limited to a fixed set of hard-coded transformations.
Recently, several works proposed to use generative models for generating
semantically meaningful perturbations to train a classifier. However, because
accurate encoding and decoding are critical, these methods, which use
architectures that approximate the latent-variable inference, remained limited
to pilot studies on small datasets.
Exploiting the exactly reversible encoder-decoder structure of normalizing
flows, we perform on-manifold perturbations in the latent space to define fully
unsupervised data augmentations. We demonstrate that such perturbations match
the performance of advanced data augmentation techniques -- reaching 96.6% test
accuracy for CIFAR-10 using ResNet-18 -- and outperform existing methods,
particularly in low data regimes -- yielding 10--25% relative improvement of
test accuracy from classical training. We find that our latent adversarial
perturbations adaptive to the classifier throughout its training are most
effective, yielding the first test accuracy improvement results on real-world
datasets -- CIFAR-10/100 -- via latent-space perturbations.
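The augmentation loop described in the abstract can be sketched compactly. The PyTorch-style code below is a minimal illustration rather than the authors' implementation: it assumes a pre-trained invertible flow exposing forward (image to latent) and inverse (latent to image) methods and takes a single normalized gradient-ascent step in latent space; the function names and the step size eps are placeholders.

```python
import torch
import torch.nn.functional as F

def latent_adversarial_batch(flow, classifier, x, y, eps=0.1):
    """Sketch: one adversarial, on-manifold augmentation step in the flow's latent space."""
    with torch.no_grad():
        z = flow.forward(x)                     # exact, invertible encoding
    delta = torch.zeros_like(z, requires_grad=True)
    loss = F.cross_entropy(classifier(flow.inverse(z + delta)), y)
    (grad,) = torch.autograd.grad(loss, delta)  # gradient of classifier loss w.r.t. the latent
    with torch.no_grad():
        # Small, per-sample normalized ascent step so the decoded image stays near the data manifold.
        norms = grad.flatten(1).norm(dim=1).view(-1, *([1] * (grad.dim() - 1)))
        x_aug = flow.inverse(z + eps * grad / (norms + 1e-12))
    return x_aug

# Usage sketch inside a training loop: train on clean and perturbed samples together.
# x_aug = latent_adversarial_batch(flow, classifier, x, y)
# loss = F.cross_entropy(classifier(torch.cat([x, x_aug])), torch.cat([y, y]))
```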
Related papers
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
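As a heavily simplified illustration of the low-rank idea in the entry above (not the paper's block-successive convex approximation, its regularized objective, or the deep-unrolled/online variants), one can split a flow-by-time traffic matrix into a low-rank background plus a sparse residual and flag large residuals as anomalies; the rank, threshold, and iteration count below are arbitrary assumptions.

```python
import numpy as np

def lowrank_sparse_anomalies(X, rank=2, lam=2.0, iters=20):
    """Sketch: alternate a truncated-SVD low-rank fit with soft-thresholded residuals.
    X: (num_flows, num_time_steps) traffic matrix; returns a boolean anomaly mask."""
    S = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)   # low-rank background
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        R = X - L                                              # residual traffic
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)      # sparse (soft-thresholded) part
    return np.abs(S) > 0

# Tiny synthetic example: a rank-2 traffic matrix with one injected spike.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 100))
X[5, 40] += 25.0
print(lowrank_sparse_anomalies(X)[5, 40])  # expected: True
```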
- Latent Enhancing AutoEncoder for Occluded Image Classification [2.6217304977339473]
We introduce LEARN (Latent Enhancing feAture Reconstruction Network), an auto-encoder-based network that can be incorporated into a classification model before its head.
On the OccludedPASCAL3D+ dataset, the proposed LEARN outperforms standard classification models.
arXiv Detail & Related papers (2024-02-10T12:22:31Z)
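A hedged sketch of where such a module could sit: a small feature auto-encoder inserted between a backbone and the classification head, trained so that features of occluded images are mapped toward features of clean images. The layer sizes and the training signal mentioned in the comment are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class FeatureEnhancer(nn.Module):
    """Sketch: auto-encoder that cleans backbone features before the classifier head."""
    def __init__(self, feat_dim=512, bottleneck=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, bottleneck), nn.ReLU())
        self.decoder = nn.Linear(bottleneck, feat_dim)

    def forward(self, f):
        return self.decoder(self.encoder(f))

class OcclusionRobustClassifier(nn.Module):
    """Backbone -> feature enhancer -> classification head."""
    def __init__(self, backbone, head, feat_dim=512):
        super().__init__()
        self.backbone, self.head, self.enhancer = backbone, head, FeatureEnhancer(feat_dim)

    def forward(self, x):
        f = self.backbone(x)                 # features, possibly corrupted by occlusion
        return self.head(self.enhancer(f))   # classify the reconstructed features

# Possible training signal (our assumption): push enhanced occluded features toward
# clean features, e.g. F.mse_loss(model.enhancer(f_occluded), f_clean.detach()).
```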
- Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
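One plausible rendering of the entry above, under our own assumptions: use a frozen pre-trained model's confidence on the true label as a per-sample difficulty score, and add an entropy term that discourages overconfidence on difficult samples. The exact weighting below is illustrative, not the paper's regularizer.

```python
import torch
import torch.nn.functional as F

def difficulty_weights(pretrained, x, y):
    """Sketch: difficulty = 1 - pre-trained model's probability of the true label."""
    with torch.no_grad():
        p = F.softmax(pretrained(x), dim=1)
    return 1.0 - p.gather(1, y.unsqueeze(1)).squeeze(1)   # in [0, 1]; higher = harder

def difficulty_aware_loss(model, pretrained, x, y, beta=0.1):
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    w = difficulty_weights(pretrained, x, y)
    # Reward higher predictive entropy (i.e. penalize overconfidence) on harder samples.
    return ce - beta * (w * entropy).mean()
```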
- Augmentation-Aware Self-Supervision for Data-Efficient GAN Training [68.81471633374393]
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
We propose a novel augmentation-aware self-supervised discriminator that predicts the augmentation parameter of the augmented data.
We compare our method with state-of-the-art (SOTA) methods using the class-conditional BigGAN and unconditional StyleGAN2 architectures.
arXiv Detail & Related papers (2022-05-31T10:35:55Z)
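A minimal sketch of the stated idea, assuming a simple parametric augmentation (a random 90-degree rotation) and a discriminator with an auxiliary head that predicts which augmentation was applied; this is our illustration, not the BigGAN/StyleGAN2 setups evaluated in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugAwareDiscriminator(nn.Module):
    """Sketch: shared trunk with a real/fake head and an augmentation-prediction head."""
    def __init__(self, trunk, feat_dim=256, num_augs=4):
        super().__init__()
        self.trunk = trunk
        self.adv_head = nn.Linear(feat_dim, 1)         # real vs. fake
        self.aug_head = nn.Linear(feat_dim, num_augs)  # which rotation was applied

    def forward(self, x):
        h = self.trunk(x)
        return self.adv_head(h), self.aug_head(h)

def rot90_augment(x):
    """Rotate each image by a random multiple of 90 degrees; return images and labels."""
    k = torch.randint(0, 4, (x.size(0),), device=x.device)
    x_aug = torch.stack([torch.rot90(img, int(ki), dims=(1, 2)) for img, ki in zip(x, k)])
    return x_aug, k

# Discriminator loss sketch: adversarial term plus augmentation (self-supervision) prediction.
# x_aug, k = rot90_augment(real_images)
# adv_logit, aug_logit = D(x_aug)
# d_loss = F.softplus(-adv_logit).mean() + F.cross_entropy(aug_logit, k)
```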
- Rethinking Reconstruction Autoencoder-Based Out-of-Distribution Detection [0.0]
Reconstruction autoencoder-based methods address out-of-distribution detection by using the input reconstruction error as a metric of novelty vs. normality.
We introduce semantic reconstruction, data certainty decomposition and normalized L2 distance to substantially improve on the original methods.
Our method requires no additional data, hard-to-implement structures, or time-consuming pipelines, and does not harm the classification accuracy of known classes.
arXiv Detail & Related papers (2022-03-04T09:04:55Z)
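The general reconstruction-error recipe with a normalized L2 distance can be sketched as below (the per-sample normalization is a simple assumption; the paper's semantic reconstruction and data certainty decomposition are not reproduced here).

```python
import torch

def normalized_recon_score(autoencoder, x):
    """Sketch: OOD score = L2 reconstruction error, normalized by the input's own norm."""
    with torch.no_grad():
        x_hat = autoencoder(x)
    err = (x - x_hat).flatten(1).norm(dim=1)
    return err / (x.flatten(1).norm(dim=1) + 1e-12)   # higher = more likely out-of-distribution

# Usage sketch: pick a threshold from in-distribution validation data (e.g. the 95th
# percentile of scores) and flag test inputs whose score exceeds it as novel.
```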
- Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning [43.51182049644767]
Semi-supervised learning (SSL) has long been proven to be an effective technique for constructing powerful models with limited labels.
Regularization-based methods, which force perturbed samples to have predictions similar to those of the original ones, have attracted much attention.
We propose a novel contrastive loss to guide the embedding of the learned network to change linearly between samples.
arXiv Detail & Related papers (2022-02-24T06:00:05Z)
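The "change linearly between samples" constraint can be illustrated with an InfoNCE-style objective: the embedding of a mixed input is pulled toward the same mixture of the individual embeddings and pushed away from the mixtures of other pairs. This is our approximation of the idea, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def interpolation_contrastive_loss(encoder, x, alpha=1.0, tau=0.1):
    """Sketch: the embedding of a mixed input should match the same mix of embeddings."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    z = F.normalize(encoder(x), dim=1)
    z_mix = F.normalize(encoder(lam * x + (1 - lam) * x[perm]), dim=1)
    target = F.normalize(lam * z + (1 - lam) * z[perm], dim=1)
    logits = z_mix @ target.t() / tau                   # similarities to all mixed targets
    labels = torch.arange(x.size(0), device=x.device)   # the matching target is the positive
    return F.cross_entropy(logits, labels)
```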
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
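This idea is simple to sketch: normalize each test batch with its own statistics instead of the running statistics collected during training. In PyTorch this can be approximated by putting only the BatchNorm layers into training mode at evaluation time; zeroing their momentum keeps the stored running statistics untouched. Whether this matches the paper's exact protocol is an assumption.

```python
import torch.nn as nn

def enable_prediction_time_bn(model: nn.Module):
    """Sketch: make BatchNorm layers normalize with the statistics of the current batch."""
    model.eval()   # keep dropout and other layers in inference mode
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()          # BN uses batch statistics when in training mode
            m.momentum = 0.0   # ...without overwriting the stored running statistics

# Usage sketch (test batches should be reasonably large for stable statistics):
# enable_prediction_time_bn(model)
# logits = model(test_batch)
```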
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Passive Batch Injection Training Technique: Boosting Network Performance by Injecting Mini-Batches from a Different Data Distribution [39.8046809855363]
This work presents a novel training technique for deep neural networks that makes use of additional data from a distribution that is different from that of the original input data.
To the best of our knowledge, this is the first work that makes use of a different data distribution to aid the training of convolutional neural networks (CNNs).
arXiv Detail & Related papers (2020-06-08T08:17:32Z)
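A rough sketch of the technique as described: periodically interleave mini-batches drawn from a different (auxiliary) data distribution into ordinary training. The injection schedule and the loss applied to the injected batches are assumptions on our part, not the paper's exact procedure.

```python
import itertools
import torch.nn.functional as F

def train_with_batch_injection(model, optimizer, main_loader, aux_loader, inject_every=4):
    """Sketch: every `inject_every` steps, take one extra gradient step on an auxiliary batch."""
    aux_iter = itertools.cycle(aux_loader)   # batches from a different data distribution
    for step, (x, y) in enumerate(main_loader):
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()
        if step % inject_every == 0:
            x_aux, y_aux = next(aux_iter)    # injected mini-batch
            optimizer.zero_grad()
            F.cross_entropy(model(x_aux), y_aux).backward()
            optimizer.step()
```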