Adversarial Examples for Unsupervised Machine Learning Models
- URL: http://arxiv.org/abs/2103.01895v1
- Date: Tue, 2 Mar 2021 17:47:58 GMT
- Title: Adversarial Examples for Unsupervised Machine Learning Models
- Authors: Chia-Yi Hsu, Pin-Yu Chen, Songtao Lu, Sijia Liu, Chia-Mu Yu
- Abstract summary: Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
- Score: 71.81480647638529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples causing evasive predictions are widely used to evaluate
and improve the robustness of machine learning models. However, current studies
on adversarial examples focus on supervised learning tasks, relying on the
ground-truth data label, a targeted objective, or supervision from a trained
classifier. In this paper, we propose a framework of generating adversarial
examples for unsupervised models and demonstrate novel applications to data
augmentation. Our framework exploits a mutual information neural estimator as
an information-theoretic similarity measure to generate adversarial examples
without supervision. We propose a new MinMax algorithm with provable
convergence guarantees for efficient generation of unsupervised adversarial
examples. Our framework can also be extended to supervised adversarial
examples. When using unsupervised adversarial examples as a simple plug-in data
augmentation tool for model retraining, significant improvements are
consistently observed across different unsupervised tasks and datasets,
including data reconstruction, representation learning, and contrastive
learning. Our results show novel methods and advantages in studying and
improving robustness of unsupervised learning problems via adversarial
examples. Our code is available at https://github.com/IBM/UAE.
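As a rough illustration of the min-max idea described above, the following PyTorch sketch alternates between maximizing a MINE-style mutual-information lower bound over a critic network and minimizing that bound over a norm-bounded perturbation. This is a hedged sketch, not the authors' implementation (see the linked repository): the toy autoencoder, critic architecture, step sizes, and alternation schedule are all illustrative assumptions.

```python
import math

import torch
import torch.nn as nn

class MINECritic(nn.Module):
    """Statistics network T(x, y) for the Donsker-Varadhan MI lower bound."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, a, b):
        return self.net(torch.cat([a, b], dim=1)).squeeze(1)

def mine_lower_bound(critic, x, y):
    # E_P[T(x, y)] - log E_{PxQ}[exp(T(x, y))]; marginal samples by shuffling.
    joint = critic(x, y).mean()
    marginal = torch.logsumexp(critic(x, y[torch.randperm(y.size(0))]), dim=0)
    return joint - (marginal - math.log(y.size(0)))

def unsupervised_attack(model, x, eps=0.1, outer_steps=30, critic_steps=5):
    critic = MINECritic(x.size(1))
    opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(outer_steps):
        # Inner maximization: tighten the MI estimate over critic parameters.
        for _ in range(critic_steps):
            loss = -mine_lower_bound(critic, x, model(x + delta).detach())
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Outer minimization: perturb so the model's output carries as little
        # information about the clean input as possible, within an L-inf ball.
        mi = mine_lower_bound(critic, x, model(x + delta))
        grad, = torch.autograd.grad(mi, delta)
        with torch.no_grad():
            delta -= (eps / 10) * grad.sign()
            delta.clamp_(-eps, eps)
    return (x + delta).detach()

# Toy usage: attack an (untrained) autoencoder on random 32-d features.
autoencoder = nn.Sequential(nn.Linear(32, 8), nn.ReLU(), nn.Linear(8, 32))
x = torch.randn(64, 32)
x_adv = unsupervised_attack(autoencoder, x)
print(float((x_adv - x).abs().max()))  # perturbation stays within eps
```

The signed-gradient step with clamping mirrors the bounded-perturbation constraint common to adversarial-example formulations; no label or trained classifier enters the objective.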
Related papers
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many of the predictive signals in the data can come from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Unsupervised Detection of Adversarial Examples with Model Explanations [0.6091702876917279]
We propose a simple yet effective method to detect adversarial examples using methods developed to explain the model's behavior.
Our evaluations with the MNIST handwritten digit dataset show that our method is capable of detecting adversarial examples with high confidence.
arXiv Detail & Related papers (2021-07-22T06:54:18Z)
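The summary above leaves the explanation method open, so the sketch below stands in input-gradient saliency maps for "model explanations" and fits a simple logistic-regression detector on them. Every concrete choice here (the saliency definition, the detector, the toy MNIST-shaped data) is an assumption for illustration, not the paper's pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def explanation(model, x):
    """Input-gradient saliency of the top logit -- one stand-in 'explanation'."""
    x = x.clone().requires_grad_(True)
    top = model(x).max(dim=1).values.sum()
    grad, = torch.autograd.grad(top, x)
    return grad.flatten(1).abs()

def fit_detector(model, x_clean, x_adv, epochs=200):
    """Logistic regression separating clean from adversarial explanations."""
    feats = torch.cat([explanation(model, x_clean), explanation(model, x_adv)])
    labels = torch.cat([torch.zeros(len(x_clean)), torch.ones(len(x_adv))])
    det = nn.Linear(feats.size(1), 1)
    opt = torch.optim.Adam(det.parameters(), lr=1e-2)
    for _ in range(epochs):
        loss = F.binary_cross_entropy_with_logits(det(feats).squeeze(1), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return det  # at test time, flag inputs whose explanations score near 1

# Toy usage: an untrained MNIST-shaped model, noise-perturbed "adversarial" inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x_clean = torch.rand(32, 1, 28, 28)
x_adv = (x_clean + 0.3 * torch.sign(torch.randn_like(x_clean))).clamp(0, 1)
detector = fit_detector(model, x_clean, x_adv)
```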
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Understanding Robustness in Teacher-Student Setting: A New Perspective [42.746182547068265]
Adversarial examples are a ubiquitous property of machine learning models: a bounded adversarial perturbation can mislead a model into making arbitrarily incorrect predictions.
Extensive studies try to explain the existence of adversarial examples and provide ways to improve model robustness.
Our studies could shed light on future exploration of adversarial examples and on enhancing model robustness via principled data augmentation.
arXiv Detail & Related papers (2021-02-25T20:54:24Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
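The instance-level attack this summary describes can be illustrated with a short, hedged sketch: run PGD to maximize a contrastive (NT-Xent) loss between an input and its own augmented view, so the encoder can no longer match the two. The toy encoder, the noise-based stand-in augmentation, and all hyperparameters below are assumptions for illustration, not the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Contrastive loss: z1[i] should match z2[i] against the rest of the batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau              # (B, B) cosine similarities
    labels = torch.arange(z1.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

def instance_attack(encoder, x, x_view, eps=8 / 255, steps=10):
    """PGD that maximizes the contrastive loss -- no class labels involved."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = nt_xent(encoder(x + delta), encoder(x_view))
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += (eps / 4) * grad.sign()  # ascend: confuse instance identities
            delta.clamp_(-eps, eps)           # L-inf projection
    return (x + delta).detach()

# Toy usage with an untrained linear encoder and a noise "augmentation".
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
x = torch.rand(16, 3, 32, 32)
x_view = (x + 0.03 * torch.randn_like(x)).clamp(0, 1)
x_adv = instance_attack(encoder, x, x_view)
```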
- Adversarial Machine Learning in Network Intrusion Detection Systems [6.18778092044887]
We study the nature of the adversarial problem in Network Intrusion Detection Systems.
We use evolutionary computation (particle swarm optimization and genetic algorithm) and deep learning (generative adversarial networks) as tools for adversarial example generation.
Our work highlights the vulnerability of machine learning based NIDS in the face of adversarial perturbation.
arXiv Detail & Related papers (2020-04-23T19:47:43Z)
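Of the generation tools the summary names, the genetic algorithm is the easiest to sketch. The NumPy sketch below evolves a bounded perturbation to a flow-feature vector until a detector's malicious score drops; the linear toy detector (`score_fn`), the bound `eps`, and the GA settings are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve_evasion(score_fn, x, eps=0.2, pop=40, gens=50, mut=0.05):
    """Minimize score_fn(x + delta) over delta in [-eps, eps]^d via a simple GA."""
    d = x.shape[0]
    population = rng.uniform(-eps, eps, size=(pop, d))
    for _ in range(gens):
        fitness = np.array([score_fn(x + p) for p in population])
        order = np.argsort(fitness)                 # lower score = better evasion
        parents = population[order[: pop // 2]]
        children = parents.copy()
        cuts = rng.integers(1, d, size=len(children))  # one-point crossover
        for i, c in enumerate(cuts):
            children[i, c:] = parents[(i + 1) % len(parents), c:]
        children += rng.normal(0, mut * eps, children.shape)  # mutation
        population = np.clip(np.vstack([parents, children]), -eps, eps)
    best = population[np.argmin([score_fn(x + p) for p in population])]
    return x + best

# Toy usage: a linear "detector" score stands in for a real NIDS model.
w = rng.normal(size=16)
x0 = rng.normal(size=16)
x_evasive = evolve_evasion(lambda v: float(w @ v), x0)
```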
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrastive examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
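One hedged reading of "gradient supervision" is an auxiliary penalty that aligns the input-gradient of the task loss with the direction from each example to its counterfactual twin. The sketch below implements that reading in PyTorch; the cosine-alignment form, the weight `lam`, and the toy data are assumptions, not necessarily the paper's exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_supervision_loss(model, x, y, x_cf, y_cf, lam=0.1):
    """Task loss on both twins, minus a bonus for input-gradients that
    point from each example toward its counterfactual."""
    x = x.clone().requires_grad_(True)
    task = F.cross_entropy(model(x), y)
    # Input-gradient of the task loss, kept differentiable for backprop.
    grad, = torch.autograd.grad(task, x, create_graph=True)
    direction = (x_cf - x).detach().flatten(1)   # toward the counterfactual
    align = F.cosine_similarity(grad.flatten(1), direction, dim=1).mean()
    return task + F.cross_entropy(model(x_cf), y_cf) - lam * align

# Toy usage: noise-perturbed inputs stand in for true counterfactual pairs.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
x = torch.rand(8, 3, 32, 32)
x_cf = (x + 0.1 * torch.randn_like(x)).clamp(0, 1)
y, y_cf = torch.zeros(8, dtype=torch.long), torch.ones(8, dtype=torch.long)
loss = gradient_supervision_loss(model, x, y, x_cf, y_cf)
loss.backward()  # trains model parameters with the auxiliary signal
```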