Double Backpropagation for Training Autoencoders against Adversarial
Attack
- URL: http://arxiv.org/abs/2003.01895v1
- Date: Wed, 4 Mar 2020 05:12:27 GMT
- Title: Double Backpropagation for Training Autoencoders against Adversarial
Attack
- Authors: Chengjin Sun, Sizhe Chen, and Xiaolin Huang
- Abstract summary: This paper focuses on adversarial attacks on autoencoders.
We propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW.
- Score: 15.264115499966413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning, as is widely known, is vulnerable to adversarial samples. This
paper focuses on adversarial attacks on autoencoders. The safety of
autoencoders (AEs) is important because they are widely used as a compression
scheme for data storage and transmission; however, current autoencoders are
easily attacked, i.e., one can slightly modify an input and obtain a totally
different code. The vulnerability is rooted in the sensitivity of the
autoencoders, and to enhance robustness, we propose to adopt double
backpropagation (DBP) to secure autoencoders such as VAE and DRAW. We restrict
the gradient from the reconstructed image to the original one so that the
autoencoder is not sensitive to the trivial perturbations produced by
adversarial attacks. After smoothing the gradient by DBP, we further smooth the
labels with a Gaussian Mixture Model (GMM), aiming for accurate and robust
classification. We demonstrate on MNIST, CelebA, and SVHN that our method leads to
a robust autoencoder resistant to attack and, when combined with GMM, a robust
classifier capable of image transition and immune to adversarial attack.
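The core DBP idea described above, penalizing the gradient of the reconstruction loss with respect to the input so that small input perturbations cannot change the code much, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tiny MLP autoencoder, the `dbp_weight` penalty coefficient, and all hyperparameters are assumed for the example.

```python
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    """Stand-in autoencoder (the paper uses VAE and DRAW)."""
    def __init__(self, dim=784, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.dec(self.enc(x))

def dbp_loss(model, x, dbp_weight=0.1):
    # Make the input a differentiable leaf so we can take d(loss)/d(input).
    x = x.detach().requires_grad_(True)
    recon = model(x)
    rec_loss = ((recon - x) ** 2).mean()
    # First backward pass: gradient of the reconstruction loss w.r.t. the
    # input, kept in the graph (create_graph=True) so the penalty below is
    # itself differentiable w.r.t. the model parameters.
    (grad_x,) = torch.autograd.grad(rec_loss, x, create_graph=True)
    # Penalizing the input-gradient norm makes the reconstruction (and hence
    # the code) insensitive to trivial perturbations of the input.
    return rec_loss + dbp_weight * grad_x.pow(2).sum(dim=1).mean()

model = TinyAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(8, 784)
loss = dbp_loss(model, x)
loss.backward()  # second backward pass, through the gradient penalty
opt.step()
```

The second `backward()` call is what gives the method its name: gradients flow through a quantity that is itself a gradient, which requires `create_graph=True` in the first pass.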
Related papers
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike traditional adversarial-example works, the pre-trained encoder only outputs feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
arXiv Detail & Related papers (2023-07-23T10:16:47Z) - On the Adversarial Robustness of Generative Autoencoders in the Latent
Space [22.99128324197949]
We provide the first study on the adversarial robustness of generative autoencoders in the latent space.
Specifically, we empirically demonstrate the latent vulnerability of popular generative autoencoders through attacks in the latent space.
We identify a potential trade-off between the adversarial robustness and the degree of the disentanglement of the latent codes.
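A latent-space attack of the kind this entry demonstrates can be sketched as a projected-gradient ascent that maximizes the displacement of the latent code under an L-infinity budget. The stand-in encoder, budget `eps`, and step sizes below are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 32))  # stand-in encoder

def latent_attack(encoder, x, eps=0.05, step=0.01, iters=20):
    """PGD that pushes the latent code of x + delta away from that of x."""
    delta = torch.zeros_like(x, requires_grad=True)
    z_clean = encoder(x).detach()
    for _ in range(iters):
        z_adv = encoder(x + delta)
        # Ascend on the latent displacement caused by the perturbation.
        loss = (z_adv - z_clean).pow(2).sum()
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # signed gradient step
            delta.clamp_(-eps, eps)            # project back into L-inf ball
        delta.grad.zero_()
    return (x + delta).detach()

x = torch.rand(4, 784)
x_adv = latent_attack(encoder, x)
```

Because the objective is defined purely on latent codes, the attack needs no labels and no decoder, which is what makes such attacks applicable to generative autoencoders.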
arXiv Detail & Related papers (2023-07-05T10:53:49Z) - Is Semantic Communications Secure? A Tale of Multi-Domain Adversarial
Attacks [70.51799606279883]
We introduce test-time adversarial attacks on deep neural networks (DNNs) for semantic communications.
We show that it is possible to change the semantics of the transferred information even when the reconstruction loss remains low.
arXiv Detail & Related papers (2022-12-20T17:13:22Z) - PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in
Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder, including one pre-processing, three in-processing, and one post-processing defenses.
arXiv Detail & Related papers (2022-05-13T00:15:44Z) - Defending Variational Autoencoders from Adversarial Attacks with MCMC [74.36233246536459]
Variational autoencoders (VAEs) are deep generative models used in various domains.
As previous work has shown, one can easily fool VAEs to produce unexpected latent representations and reconstructions for a visually slightly modified input.
Here, we examine several objective functions for constructing adversarial attacks, suggest metrics to assess model robustness, and propose a solution.
arXiv Detail & Related papers (2022-03-18T13:25:18Z) - Self-Supervised Adversarial Example Detection by Disentangled
Representation [16.98476232162835]
We train an autoencoder, assisted by a discriminator network, over both correctly paired and incorrectly paired class/semantic features to reconstruct benign examples and counterexamples.
This mimics the behavior of adversarial examples and can reduce the unnecessary generalization ability of the autoencoder.
Compared with the state-of-the-art self-supervised detection methods, our method exhibits better performance in various measurements.
arXiv Detail & Related papers (2021-05-08T12:48:18Z) - Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial
Attacks [80.73580820014242]
We show how to modify a data point to obtain a prescribed latent code (supervised attack) or a drastically different code (unsupervised attack).
We examine the influence of model modifications on the robustness of VAEs and suggest metrics to quantify it.
arXiv Detail & Related papers (2021-03-10T14:23:20Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Revisiting Role of Autoencoders in Adversarial Settings [32.22707594954084]
This paper presents the inherent property of adversarial robustness in autoencoders.
We believe that our discovery of the adversarial robustness of the autoencoders can provide clues to the future research and applications for adversarial defense.
arXiv Detail & Related papers (2020-05-21T16:01:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.