Learning perturbation sets for robust machine learning
- URL: http://arxiv.org/abs/2007.08450v2
- Date: Thu, 8 Oct 2020 13:03:48 GMT
- Title: Learning perturbation sets for robust machine learning
- Authors: Eric Wong and J. Zico Kolter
- Abstract summary: We use a conditional generator that defines the perturbation set over a constrained region of the latent space.
We measure the quality of our learned perturbation sets both quantitatively and qualitatively.
We leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations.
- Score: 97.6757418136662
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although much progress has been made towards robust deep learning, a
significant gap in robustness remains between real-world perturbations and more
narrowly defined sets typically studied in adversarial defenses. In this paper,
we aim to bridge this gap by learning perturbation sets from data, in order to
characterize real-world effects for robust training and evaluation.
Specifically, we use a conditional generator that defines the perturbation set
over a constrained region of the latent space. We formulate desirable
properties that measure the quality of a learned perturbation set, and
theoretically prove that a conditional variational autoencoder naturally
satisfies these criteria. Using this framework, our approach can generate a
variety of perturbations at different complexities and scales, ranging from
baseline spatial transformations, through common image corruptions, to lighting
variations. We measure the quality of our learned perturbation sets both
quantitatively and qualitatively, finding that our models are capable of
producing a diverse set of meaningful perturbations beyond the limited data
seen during training. Finally, we leverage our learned perturbation sets to
train models which are empirically and certifiably robust to adversarial image
corruptions and adversarial lighting variations, while improving generalization
on non-adversarial data. All code and configuration files for reproducing the
experiments as well as pretrained model weights can be found at
https://github.com/locuslab/perturbation_learning.
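For intuition, here is a minimal, hedged sketch of the idea described in the abstract: a conditional generator g(z, x) maps a constrained latent region (here an L2 ball of radius eps) to perturbed versions of x, and a projected gradient attack searches that ball for a worst-case example to train against. All names (LatentGenerator, latent_attack, the toy conditioning network, the hyperparameters) are illustrative assumptions, not the API of the linked repository.

```python
# Minimal sketch (not the authors' implementation): a perturbation set defined by a
# conditional generator g(z, x) over the latent ball ||z||_2 <= eps, plus a projected
# gradient attack that searches the ball for a worst-case perturbation of x.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentGenerator(nn.Module):
    """Stand-in for the CVAE decoder: maps (latent z, clean image x) to a perturbed image."""

    def __init__(self, latent_dim=128):
        super().__init__()
        self.latent_dim = latent_dim
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )
        self.cond = nn.Linear(latent_dim, 3)  # toy conditioning: per-channel shift from z

    def forward(self, z, x):
        shift = self.cond(z).view(z.size(0), 3, 1, 1)
        return torch.clamp(x + self.body(x) + shift, 0.0, 1.0)


def latent_attack(gen, clf, x, y, eps=1.0, steps=10, step_size=0.2):
    """Projected gradient ascent over the latent ball ||z||_2 <= eps (the learned perturbation set)."""
    z = torch.zeros(x.size(0), gen.latent_dim, device=x.device, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(clf(gen(z, x)), y)
        grad, = torch.autograd.grad(loss, z)
        with torch.no_grad():
            z = z + step_size * grad / (grad.norm(dim=1, keepdim=True) + 1e-12)
            z = z * (eps / z.norm(dim=1, keepdim=True).clamp(min=eps))  # project onto the eps-ball
        z.requires_grad_(True)
    return gen(z, x).detach()
```

Adversarial training against the learned perturbation set then amounts to replacing x with latent_attack(gen, clf, x, y) in an otherwise standard cross-entropy training loop.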
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many of the predictive signals in the data may arise from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Robustness and invariance properties of image classifiers [8.970032486260695]
Deep neural networks have achieved impressive results in many image classification tasks.
Deep networks are not robust to a large variety of semantic-preserving image modifications.
The poor robustness of image classifiers to small data distribution shifts raises serious concerns regarding their trustworthiness.
arXiv Detail & Related papers (2022-08-30T11:00:59Z)
- Stabilizing Adversarially Learned One-Class Novelty Detection Using Pseudo Anomalies [22.48845887819345]
Anomaly scores have been formulated using the reconstruction loss of adversarially learned generators and/or the classification loss of discriminators.
The unavailability of anomaly examples in the training data makes optimizing such networks challenging.
We propose a robust anomaly detection framework that overcomes such instability by transforming the fundamental role of the discriminator from identifying real vs. fake data to distinguishing good vs. bad quality reconstructions.
arXiv Detail & Related papers (2022-03-25T15:37:52Z)
- Is Disentanglement enough? On Latent Representations for Controllable Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z)
- Natural Perturbed Training for General Robustness of Neural Network Classifiers [0.0]
Natural perturbed training shows better and much faster performance than adversarial training on clean, adversarial, and naturally perturbed images.
For CIFAR-10 and STL-10, natural perturbed training even improves accuracy on clean data and reaches state-of-the-art performance.
arXiv Detail & Related papers (2021-03-21T11:47:38Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Model-Based Robust Deep Learning: Generalizing to Natural, Out-of-Distribution Data [104.69689574851724]
We propose a paradigm shift from perturbation-based adversarial robustness toward model-based robust deep learning.
Our objective is to provide general training algorithms that can be used to train deep neural networks to be robust against natural variation in data.
arXiv Detail & Related papers (2020-05-20T13:46:31Z)
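As a rough illustration of the model-based robust training idea summarized in the last entry (not that paper's algorithm), the sketch below trains a classifier on the hardest member of a small family of natural variations of each batch; the brightness-shift family and hardest_variant_loss are placeholder assumptions standing in for a learned model of natural variation.

```python
# Illustrative sketch only: train on the hardest variant produced by a simple,
# hand-written "model of natural variation" (per-batch brightness shifts).
import torch
import torch.nn.functional as F

def hardest_variant_loss(clf, x, y, deltas=(-0.3, 0.0, 0.3)):
    """Max cross-entropy loss over a small family of brightness-shifted copies of x."""
    losses = [F.cross_entropy(clf(torch.clamp(x + d, 0.0, 1.0)), y) for d in deltas]
    return torch.stack(losses).max()

# Inside a training loop (sketch):
#   loss = hardest_variant_loss(clf, x, y)
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
```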