Semantic Perturbations with Normalizing Flows for Improved
Generalization
- URL: http://arxiv.org/abs/2108.07958v1
- Date: Wed, 18 Aug 2021 03:20:00 GMT
- Title: Semantic Perturbations with Normalizing Flows for Improved
Generalization
- Authors: Oguz Kaan Yuksel, Sebastian U. Stich, Martin Jaggi, Tatjana Chavdarova
- Abstract summary: We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations that adapt to the classifier throughout its training are most effective.
- Score: 62.998818375912506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is a widely adopted technique for avoiding overfitting when
training deep neural networks. However, this approach requires domain-specific
knowledge and is often limited to a fixed set of hard-coded transformations.
Recently, several works proposed to use generative models for generating
semantically meaningful perturbations to train a classifier. However, because
accurate encoding and decoding are critical, these methods, which use
architectures that approximate the latent-variable inference, remained limited
to pilot studies on small datasets.
Exploiting the exactly reversible encoder-decoder structure of normalizing
flows, we perform on-manifold perturbations in the latent space to define fully
unsupervised data augmentations. We demonstrate that such perturbations match
the performance of advanced data augmentation techniques, reaching 96.6% test
accuracy on CIFAR-10 with a ResNet-18, and outperform existing methods,
particularly in low-data regimes, yielding a 10-25% relative improvement in
test accuracy over classical training. We find that latent adversarial
perturbations that adapt to the classifier throughout its training are most
effective, yielding the first test-accuracy improvements obtained via
latent-space perturbations on real-world datasets (CIFAR-10/100).
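The mechanism can be sketched in a few lines. The following is a minimal illustration, not the authors' released code: the `flow.forward`/`flow.inverse` interface, the single normalized gradient-ascent step, and the step size `eps` are all assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def latent_adversarial_augment(flow, classifier, x, y, eps=0.1):
    """Sketch: perturb x in the flow's latent space along the direction
    that increases the classifier's loss, then decode exactly."""
    # Encode; a normalizing flow is exactly invertible, so decoding
    # introduces no reconstruction error (unlike VAE/GAN encoders).
    z = flow.forward(x).detach().requires_grad_(True)
    loss = F.cross_entropy(classifier(flow.inverse(z)), y)
    (grad,) = torch.autograd.grad(loss, z)
    # One normalized ascent step in latent space ("adversarial" to the
    # current classifier, so the augmentation adapts during training).
    g = grad.flatten(1)
    g = g / (g.norm(dim=1, keepdim=True) + 1e-12)
    with torch.no_grad():
        x_aug = flow.inverse(z + eps * g.view_as(z))
    return x_aug
```

Because the perturbation is taken in latent space and decoded through the exactly invertible flow, the augmented samples stay on the learned data manifold rather than being pixel-space noise.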
Related papers
- What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration.
We identify the critical limitations of regression-based methods when trained with the widely used data generation pipeline.
We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z)
- Leveraging Semi-Supervised Learning to Enhance Data Mining for Image Classification under Limited Labeled Data [35.431340001608476]
Traditional data mining methods are inadequate when faced with large-scale, high-dimensional and complex data.
This study introduces semi-supervised learning methods, aiming to improve the algorithm's ability to utilize unlabeled data.
Specifically, we adopt a self-training method and combine it with a convolutional neural network (CNN) for image feature extraction and classification.
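As a point of reference, the self-training recipe the summary mentions typically looks like the following sketch; the confidence threshold and loss combination are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def self_training_step(model, opt, x_lab, y_lab, x_unlab, threshold=0.95):
    """One illustrative self-training step: supervised loss on labeled
    data plus a loss on confidently pseudo-labeled unlabeled data."""
    loss = F.cross_entropy(model(x_lab), y_lab)
    with torch.no_grad():
        probs = F.softmax(model(x_unlab), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= threshold  # keep only confident pseudo-labels
    if mask.any():
        loss = loss + F.cross_entropy(model(x_unlab[mask]), pseudo[mask])
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```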
arXiv Detail & Related papers (2024-11-27T18:59:50Z)
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
- Augmentation-Aware Self-Supervision for Data-Efficient GAN Training [68.81471633374393]
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
We propose a novel augmentation-aware self-supervised discriminator that predicts the augmentation parameter of the augmented data.
We compare our method with state-of-the-art (SOTA) methods using the class-conditional BigGAN and unconditional StyleGAN2 architectures.
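The idea of the auxiliary prediction task can be sketched as below; the architecture and the discrete augmentation classes are assumptions for illustration, not the paper's model.

```python
import torch.nn as nn

class AugAwareDiscriminator(nn.Module):
    """Toy discriminator with an auxiliary head that predicts which
    augmentation was applied to its input (here one of n discrete
    choices), alongside the usual real/fake score."""
    def __init__(self, n_aug_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.real_fake = nn.Linear(128, 1)             # standard GAN head
        self.aug_head = nn.Linear(128, n_aug_classes)  # self-supervised head

    def forward(self, x):
        h = self.features(x)
        return self.real_fake(h), self.aug_head(h)
```

The self-supervised head gives the discriminator extra training signal from every augmented sample, which counteracts overfitting when real data is scarce.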
arXiv Detail & Related papers (2022-05-31T10:35:55Z)
- Rethinking Reconstruction Autoencoder-Based Out-of-Distribution Detection [0.0]
Reconstruction autoencoder-based methods approach out-of-distribution detection by using input reconstruction error as a metric of novelty vs. normality.
We introduce semantic reconstruction, data certainty decomposition and normalized L2 distance to substantially improve original methods.
Our method works without any additional data, hard-to-implement structures, or time-consuming pipelines, and without harming the classification accuracy of known classes.
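The baseline being improved here fits in a few lines; the input-norm normalization below is one plausible reading of the summary's "normalized L2 distance", not the paper's exact definition.

```python
import torch

@torch.no_grad()
def ood_score(autoencoder, x):
    """Reconstruction-error novelty score: larger means more likely
    out-of-distribution. Assumes `autoencoder(x)` returns x-hat."""
    x_hat = autoencoder(x)
    err = (x - x_hat).flatten(1).norm(dim=1)         # per-sample L2 error
    return err / (x.flatten(1).norm(dim=1) + 1e-12)  # normalized variant
```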
arXiv Detail & Related papers (2022-03-04T09:04:55Z)
- Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning [43.51182049644767]
Semi-supervised learning (SSL) has long been proved to be an effective technique to construct powerful models with limited labels.
Regularization-based methods which force the perturbed samples to have similar predictions with the original ones have attracted much attention.
We propose a novel contrastive loss to guide the embedding of the learned network to change linearly between samples.
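The linearity constraint can be illustrated with a plain consistency loss; the paper itself uses a contrastive formulation, so treat this only as a sketch of the constraint being enforced.

```python
import torch
import torch.nn.functional as F

def interpolation_consistency_loss(encoder, x1, x2, lam=0.5):
    """Push the embedding of an interpolated sample toward the same
    interpolation of the endpoint embeddings:
    f(lam*x1 + (1-lam)*x2) ~ lam*f(x1) + (1-lam)*f(x2)."""
    z_mix = encoder(lam * x1 + (1 - lam) * x2)
    with torch.no_grad():  # targets are treated as fixed
        z_target = lam * encoder(x1) + (1 - lam) * encoder(x2)
    return F.mse_loss(z_mix, z_target)
```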
arXiv Detail & Related papers (2022-02-24T06:00:05Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate what we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
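The method itself is simple to state in code. A sketch for a PyTorch model with batch-norm layers follows; the save/restore handling is an implementation choice, not from the paper.

```python
import torch
import torch.nn as nn

_BN = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)

@torch.no_grad()
def predict_with_batch_stats(model, x):
    """Prediction-time batch norm: normalize with the statistics of the
    current test batch instead of the training-time running averages."""
    bn_layers = [m for m in model.modules() if isinstance(m, _BN)]
    model.eval()
    for m in bn_layers:
        m.train()                      # use current-batch statistics
        m.track_running_stats = False  # don't overwrite running averages
    logits = model(x)
    for m in bn_layers:                # restore ordinary eval behavior
        m.eval()
        m.track_running_stats = True
    return logits
```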
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Passive Batch Injection Training Technique: Boosting Network Performance by Injecting Mini-Batches from a different Data Distribution [39.8046809855363]
This work presents a novel training technique for deep neural networks that makes use of additional data from a distribution that is different from that of the original input data.
To the best of our knowledge, this is the first work that makes use of a different data distribution to aid the training of convolutional neural networks (CNNs).
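A sketch of what such a training loop could look like; the injection schedule and the plain summed loss are assumptions for illustration, not the paper's exact procedure.

```python
from itertools import cycle

import torch.nn.functional as F

def train_epoch_with_injection(model, opt, main_loader, extra_loader,
                               inject_every=4):
    """Every `inject_every` steps, also train on a mini-batch drawn
    from a different data distribution, acting as a regularizer."""
    extra_batches = cycle(extra_loader)
    for step, (x, y) in enumerate(main_loader):
        loss = F.cross_entropy(model(x), y)
        if step % inject_every == 0:
            x_e, y_e = next(extra_batches)
            loss = loss + F.cross_entropy(model(x_e), y_e)
        opt.zero_grad()
        loss.backward()
        opt.step()
```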
arXiv Detail & Related papers (2020-06-08T08:17:32Z)