Generating Out of Distribution Adversarial Attack using Latent Space
Poisoning
- URL: http://arxiv.org/abs/2012.05027v1
- Date: Wed, 9 Dec 2020 13:05:44 GMT
- Title: Generating Out of Distribution Adversarial Attack using Latent Space
Poisoning
- Authors: Ujjwal Upadhyay and Prerana Mukherjee
- Abstract summary: We propose a novel mechanism for generating adversarial examples where the actual image is not corrupted.
Instead, the latent space representation is utilized to tamper with the inherent structure of the image.
As opposed to gradient-based attacks, the latent space poisoning exploits the inclination of classifiers to model the independent and identical distribution of the training dataset.
- Score: 5.1314136039587925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional adversarial attacks rely upon perturbations generated by
gradients from the network, which are generally safeguarded by gradient-guided
search to provide an adversarial counterpart to the network. In this paper, we
propose a novel mechanism for generating adversarial examples where the actual
image is not corrupted; rather, its latent space representation is utilized to
tamper with the inherent structure of the image while keeping the perceptual
quality intact, so that it acts as a legitimate data sample. As opposed to
gradient-based attacks, latent space poisoning exploits the inclination of
classifiers to model the independent and identical distribution of the training
dataset and tricks them by producing out-of-distribution samples. We train a
disentangled variational autoencoder (beta-VAE) to model the data in latent
space and then add noise perturbations to the latent space, drawn from a
class-conditioned distribution function, under the constraint that the result
is misclassified to the target label. Our empirical results on the MNIST, SVHN,
and CelebA datasets validate that the generated adversarial examples can easily
fool robust $l_0$, $l_2$, $l_\infty$ norm classifiers designed using provably
robust defense mechanisms.
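A minimal sketch of the attack pipeline described above, assuming a pretrained beta-VAE (`encoder`, `decoder`) and a victim `classifier`; the class-conditioned noise parameters `mu_c` and `sigma_c` and the optimization loop are illustrative placeholders, not the authors' released implementation:

```python
# Hypothetical latent space poisoning sketch (PyTorch).
# encoder, decoder, classifier are assumed pretrained modules; the
# class-conditioned Gaussian (mu_c, sigma_c) stands in for the paper's
# class-conditioned distribution function.
import torch
import torch.nn.functional as F

def latent_space_poison(x, target_label, encoder, decoder, classifier,
                        mu_c, sigma_c, steps=100, lr=0.05):
    with torch.no_grad():
        z = encoder(x)                                 # latent code of the clean image
        delta = mu_c + sigma_c * torch.randn_like(z)   # class-conditioned initialization
    delta = delta.requires_grad_(True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = decoder(z + delta)                     # decode the poisoned latent code
        logits = classifier(x_adv)
        # push the decoded sample toward the target label ...
        cls_loss = F.cross_entropy(logits, target_label)
        # ... while keeping the latent perturbation small to preserve perceptual quality
        loss = cls_loss + 0.1 * delta.pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decoder(z + delta).detach()
```

The "misclassified to the target label" constraint is enforced here as a soft cross-entropy objective; the paper's exact constraint formulation and noise model may differ.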
Related papers
- Certified $\ell_2$ Attribution Robustness via Uniformly Smoothed Attributions [20.487079380753876]
We propose a uniform smoothing technique that augments the vanilla attributions by noises uniformly sampled from a certain space.
It is proved that, for all perturbations within the attack region, the cosine similarity between uniformly smoothed attribution of perturbed sample and the unperturbed sample is guaranteed to be lower bounded.
arXiv Detail & Related papers (2024-05-10T09:56:02Z)
- Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models! [52.0855711767075]
EvoSeed is an evolutionary strategy-based algorithmic framework for generating photo-realistic natural adversarial samples.
We employ CMA-ES to optimize the search for an initial seed vector, which, when processed by the Conditional Diffusion Model, results in the natural adversarial sample misclassified by the Model.
Experiments show that generated adversarial images are of high image quality, raising concerns about generating harmful content bypassing safety classifiers.
arXiv Detail & Related papers (2024-02-07T09:39:29Z)
- A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications [11.39717289910264]
Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution.
GANs tend to embed near-duplicates of real samples in the latent space.
We propose a latent space navigation strategy able to generate diverse synthetic samples that may support effective training of deep models.
arXiv Detail & Related papers (2023-07-06T13:35:48Z)
- Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability [62.105715985563656]
We propose a novel framework dubbed Diffusion-Based Projected Gradient Descent (Diff-PGD) for generating realistic adversarial samples.
Our framework can be easily customized for specific tasks such as digital attacks, physical-world attacks, and style-based attacks.
arXiv Detail & Related papers (2023-05-25T21:51:23Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Self-Conditioned Generative Adversarial Networks for Image Editing [61.50205580051405]
Generative Adversarial Networks (GANs) are susceptible to bias, learned from either the unbalanced data, or through mode collapse.
We argue that this bias is responsible not only for fairness concerns, but that it plays a key role in the collapse of latent-traversal editing methods when deviating away from the distribution's core.
arXiv Detail & Related papers (2022-02-08T18:08:24Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Regularization with Latent Space Virtual Adversarial Training [4.874780144224057]
Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods.
We propose LVAT, which injects perturbation in the latent space instead of the input space.
LVAT can generate adversarial samples flexibly, resulting in more adverse effects and thus more effective regularization.
arXiv Detail & Related papers (2020-11-26T08:51:38Z)
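For contrast with the input-space VAT that LVAT extends, here is a rough latent-space consistency regularizer in the spirit of the entry above; the single power-iteration step mirrors standard VAT, and the `encoder`, `decoder`, and `classifier` modules are assumed, so this is a sketch rather than the paper's exact procedure:

```python
# Illustrative latent-space virtual adversarial training (LVAT-style) loss.
import torch
import torch.nn.functional as F

def lvat_loss(x, encoder, decoder, classifier, xi=1e-6, eps=1.0):
    with torch.no_grad():
        p_clean = F.softmax(classifier(x), dim=1)   # predictions on the clean input
        z = encoder(x)                              # latent code of the clean input
    # one power-iteration step to approximate the most adverse latent direction
    d = torch.randn_like(z)
    d = (xi * d / d.norm(dim=1, keepdim=True)).requires_grad_(True)
    p_pert = F.log_softmax(classifier(decoder(z + d)), dim=1)
    kl = F.kl_div(p_pert, p_clean, reduction="batchmean")
    grad = torch.autograd.grad(kl, d)[0]
    r_adv = eps * grad / grad.norm(dim=1, keepdim=True)
    # consistency loss between the clean prediction and the prediction on the
    # image decoded from the adversarially perturbed latent code
    p_adv = F.log_softmax(classifier(decoder(z + r_adv)), dim=1)
    return F.kl_div(p_adv, p_clean, reduction="batchmean")
```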
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.