Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks
- URL: http://arxiv.org/abs/2103.06701v1
- Date: Wed, 10 Mar 2021 14:23:20 GMT
- Title: Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks
- Authors: Anna Kuzina, Max Welling, Jakub M. Tomczak
- Abstract summary: We show how to modify a data point to obtain a prescribed latent code (supervised attack) or a drastically different code (unsupervised attack).
We examine the influence of model modifications on the robustness of VAEs and suggest metrics to quantify it.
- Score: 80.73580820014242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we explore adversarial attacks on Variational Autoencoders (VAEs). We show how to modify a data point to obtain a prescribed latent code (supervised attack) or a drastically different code (unsupervised attack). We examine the influence of model modifications ($\beta$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.
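To make the attack setting concrete, below is a minimal PyTorch sketch of the supervised attack idea, not the authors' code: it optimizes a bounded perturbation so the encoder maps the perturbed input close to a prescribed target code. The toy encoder, image size, and hyperparameters (eps, steps, lr) are placeholders; the unsupervised variant would instead maximize the distance to the original code.

```python
import torch
import torch.nn as nn

# Toy stand-in encoder mapping a 28x28 image to (mu, logvar) of a 20-d latent.
# Any trained VAE encoder with this interface could be substituted.
latent_dim = 20
encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 2 * latent_dim))

def encode(x):
    mu, logvar = encoder(x).chunk(2, dim=-1)
    return mu, logvar

def supervised_attack(x, z_target, eps=0.1, steps=100, lr=1e-2):
    """Find delta (||delta||_inf <= eps) pushing the encoded mean toward z_target."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        mu, _ = encode((x + delta).clamp(0, 1))
        loss = (mu - z_target).pow(2).sum()   # distance to the prescribed code
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)          # project back onto the L_inf ball
    return (x + delta).detach().clamp(0, 1)

x = torch.rand(1, 1, 28, 28)                  # placeholder input
z_target = torch.randn(1, latent_dim)         # prescribed latent code
x_adv = supervised_attack(x, z_target)
```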
Related papers
- On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior.
We propose AutoPoison, an automated data poisoning pipeline.
Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
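As a rough illustration of such a pipeline (not the paper's implementation), the sketch below swaps the responses of a small random fraction of instruction/response pairs for attacker-composed ones; adversarial_response is a hypothetical stand-in for the automated composition step.

```python
import random

def adversarial_response(instruction):
    # Hypothetical stand-in for the automated composition step,
    # e.g. prompting an oracle model to inject the target behaviour.
    return "Answer that promotes the attacker's target content: " + instruction

def poison_dataset(examples, poison_rate=0.01, seed=0):
    """Return a copy of the instruction-tuning data in which a small
    random fraction of responses is replaced by adversarial ones."""
    rng = random.Random(seed)
    return [
        {**ex, "response": adversarial_response(ex["instruction"])}
        if rng.random() < poison_rate else ex
        for ex in examples
    ]
```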
arXiv Detail & Related papers (2023-06-28T17:54:04Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework that defends against samples crafted by minimally perturbing a clean input.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
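A simple way to instantiate metric-learning-based adversarial regularization (a simplification; the paper casts it as an optimal transport problem) is a triplet loss that pulls an input and its adversarial counterpart together while pushing away a differently labeled example. The embedding network below is a toy placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64))  # toy embedding

def adv_metric_regularizer(x, x_adv, x_other, margin=1.0):
    """Triplet loss: anchor = clean input, positive = its adversarial
    perturbation, negative = an example from another class."""
    a, p, n = embed(x), embed(x_adv), embed(x_other)
    return F.triplet_margin_loss(a, p, n, margin=margin)
```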
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Defending Variational Autoencoders from Adversarial Attacks with MCMC [74.36233246536459]
Variational autoencoders (VAEs) are deep generative models used in various domains.
As previous work has shown, one can easily fool VAEs into producing unexpected latent representations and reconstructions for a visually slightly modified input.
Here, we examine several objective functions for constructing adversarial attacks, suggest metrics to assess model robustness, and propose a solution.
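A minimal sketch of this flavour of defense (the paper uses HMC; here, for brevity, unadjusted Langevin dynamics with a Gaussian likelihood, a standard normal prior, and a toy decoder, all placeholders): starting from the possibly attacked latent, move it toward regions that explain the input under the generative model.

```python
import torch
import torch.nn as nn

latent_dim = 20
decode = nn.Sequential(nn.Linear(latent_dim, 784), nn.Unflatten(-1, (1, 28, 28)))

def mcmc_defense(z0, x, n_steps=50, step=1e-4, sigma=0.1):
    """Refine a (possibly attacked) latent with unadjusted Langevin dynamics
    on log p(x|z) + log p(z) for the toy decoder above."""
    z = z0.clone()
    for _ in range(n_steps):
        z.requires_grad_(True)
        log_px_z = -((decode(z) - x) ** 2).sum() / (2 * sigma ** 2)
        log_pz = -(z ** 2).sum() / 2
        (grad,) = torch.autograd.grad(log_px_z + log_pz, z)
        z = (z + step * grad + (2 * step) ** 0.5 * torch.randn_like(z)).detach()
    return z
```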
arXiv Detail & Related papers (2022-03-18T13:25:18Z)
- AAVAE: Augmentation-Augmented Variational Autoencoders [43.73699420145321]
We introduce augmentation-augmented variational autoencoders (AAVAE), a third approach to self-supervised learning based on autoencoding.
We empirically evaluate the proposed AAVAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated.
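A rough sketch of an augmentation-based autoencoding objective in this spirit (toy modules, and the mechanism of encoding an augmented view and reconstructing the original is an assumption about the method): the sampled posterior and KL term of a VAE are replaced by data augmentations.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T

enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 20))
dec = nn.Sequential(nn.Linear(20, 28 * 28), nn.Unflatten(-1, (1, 28, 28)))
augment = T.RandomResizedCrop(28, scale=(0.8, 1.0))

def aavae_loss(x):
    """Denoising-style objective: reconstruct x from an augmented view."""
    z = enc(augment(x))
    return ((dec(z) - x) ** 2).mean()
```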
arXiv Detail & Related papers (2021-07-26T17:04:30Z)
- Self-Supervised Adversarial Example Detection by Disentangled Representation [16.98476232162835]
We train an autoencoder, assisted by a discriminator network, on both correctly and incorrectly paired class/semantic features to reconstruct benign examples and counterexamples.
This mimics the behavior of adversarial examples and can reduce the unnecessary generalization ability of the autoencoder.
Compared with state-of-the-art self-supervised detection methods, our method exhibits better performance on various metrics.
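One way to read the described scheme, as a sketch (placeholder modules and threshold, not the paper's architecture): reconstruct the input from its predicted class paired with its semantic code, and flag inputs that reconstruct poorly, since an adversarial input pairs a wrong class with its semantics.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, sem_dim = 10, 20
classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, num_classes))
enc_sem = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, sem_dim))
dec = nn.Linear(num_classes + sem_dim, 28 * 28)

def is_adversarial(x, tau=0.1):
    """Reconstruct x from (predicted class, semantic code); flag inputs
    whose reconstruction error exceeds the threshold tau."""
    y = F.one_hot(classifier(x).argmax(-1), num_classes).float()
    x_rec = dec(torch.cat([y, enc_sem(x)], dim=-1))
    err = ((x_rec - x.flatten(1)) ** 2).mean(dim=1)
    return err > tau
```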
arXiv Detail & Related papers (2021-05-08T12:48:18Z)
- Hierarchical Variational Autoencoder for Visual Counterfactuals [79.86967775454316]
Conditional Variational Autoencoders (VAEs) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool.
In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals.
We introduce VAEX, a hierarchical VAE designed for this approach that can visually audit a classifier in applications.
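A generic latent-space counterfactual search conveys the idea (hypothetical decode and classifier callables; the paper's hierarchical procedure is more involved): optimize a latent near the input's code until the classifier prefers the target class, then decode it.

```python
import torch
import torch.nn.functional as F

def counterfactual(decode, classifier, z_init, target_class, steps=200, lr=1e-2):
    """Gradient search in a VAE latent space for a visual counterfactual."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    target = torch.full((z.shape[0],), target_class, dtype=torch.long)
    for _ in range(steps):
        loss = F.cross_entropy(classifier(decode(z)), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(z).detach()
```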
arXiv Detail & Related papers (2021-02-01T14:07:11Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We observe that a VAE's encoder need not consistently encode samples generated by its own decoder; we study the implications of this behaviour for the learned representations and the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
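One reading of that self-consistency notion, expressed as a regularizer (a sketch with hypothetical encode_mu and decode functions): the encoder should map the model's own decodings back to the latent they came from.

```python
def self_consistency_loss(encode_mu, decode, x):
    """Penalize the gap between the code of x and the code of its
    reconstruction, pushing encode(decode(z)) back toward z."""
    z = encode_mu(x)                 # posterior mean for the input
    z_cycle = encode_mu(decode(z))   # re-encode the model's own output
    return ((z_cycle - z) ** 2).mean()
```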
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Double Backpropagation for Training Autoencoders against Adversarial Attack [15.264115499966413]
This paper focuses on adversarial attacks on autoencoders.
We propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW.
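The gist of DBP in a short sketch (generic autoencoder model and an assumed penalty weight): penalize the gradient of the reconstruction loss with respect to the input, which requires backpropagating through the backward pass, hence the name.

```python
import torch

def dbp_loss(model, x, lam=1.0):
    """Reconstruction loss plus a penalty on its gradient w.r.t. the input;
    create_graph=True enables the second (double) backward pass."""
    x = x.clone().requires_grad_(True)
    rec = ((model(x) - x) ** 2).mean()
    (g,) = torch.autograd.grad(rec, x, create_graph=True)
    return rec + lam * g.pow(2).sum()
```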
arXiv Detail & Related papers (2020-03-04T05:12:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.