Adversarial robustness of VAEs through the lens of local geometry
- URL: http://arxiv.org/abs/2208.03923v3
- Date: Mon, 28 Oct 2024 16:01:39 GMT
- Title: Adversarial robustness of VAEs through the lens of local geometry
- Authors: Asif Khan, Amos Storkey
- Abstract summary: In an unsupervised attack on variational autoencoders (VAEs), an adversary finds a small perturbation in an input sample that significantly changes its latent space encoding.
This paper demonstrates that an optimal way for an adversary to attack VAEs is to exploit a directional bias of a pullback metric tensor.
- Score: 1.2228014485474623
- Abstract: In an unsupervised attack on variational autoencoders (VAEs), an adversary finds a small perturbation in an input sample that significantly changes its latent space encoding, thereby compromising the reconstruction for a fixed decoder. A known reason for such vulnerability is the distortion in the latent space resulting from a mismatch between the approximated latent posterior and the prior distribution. Consequently, a slight change in an input sample can move its encoding to a low/zero density region in the latent space, resulting in unconstrained generation. This paper demonstrates that an optimal way for an adversary to attack VAEs is to exploit a directional bias of a stochastic pullback metric tensor induced by the encoder and decoder networks. The pullback metric tensor of an encoder measures the change in infinitesimal volume from the input space to the latent space. It can thus be viewed as a lens for analysing how input perturbations lead to latent space distortions. We propose robustness evaluation scores based on the eigenspectrum of the pullback metric tensor. Moreover, we empirically show that the scores correlate with the robustness parameter $\beta$ of the $\beta$-VAE. Since increasing $\beta$ also degrades reconstruction quality, we demonstrate a simple alternative using \textit{mixup} training to fill the empty regions in the latent space, improving robustness together with reconstruction quality.
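To make the geometry concrete, here is a minimal sketch of the deterministic part of the encoder pullback metric, $G(x) = J(x)^\top J(x)$ with $J$ the Jacobian of the latent mean with respect to the input. The paper's metric is stochastic (it also involves the encoder covariance), which this sketch omits; `enc_mu` is a hypothetical PyTorch callable mapping a flattened input to the latent mean.

```python
import torch
from torch.autograd.functional import jacobian

def pullback_metric(enc_mu, x):
    # Jacobian of the latent mean w.r.t. a flattened input:
    # shape (latent_dim, input_dim).
    J = jacobian(enc_mu, x)
    # Deterministic pullback metric G(x) = J^T J on the input space.
    return J.T @ J

def attack_direction(enc_mu, x):
    G = pullback_metric(enc_mu, x)
    # Eigenvalues in ascending order; the top eigenvector is the
    # input direction that maximally distorts the latent encoding.
    eigvals, eigvecs = torch.linalg.eigh(G)
    return eigvecs[:, -1], eigvals
```

An adversarial input is then `x + eps * v` for the top eigenvector `v`, and the concentration of the eigenspectrum (e.g. the share of the top eigenvalue) can serve as a robustness score of the kind the paper proposes; the exact score definitions are in the paper.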
Related papers
- Unscented Autoencoder [3.0108936184913295]
The Variational Autoencoder (VAE) is a seminal approach in deep generative modeling with latent variables.
We apply the Unscented Transform (UT), a well-known distribution approximation used in the Unscented Kalman Filter (UKF) from the field of filtering.
We derive a novel, deterministic-sampling flavor of the VAE, the Unscented Autoencoder (UAE), trained purely with regularization-like terms on the per-sample posterior.
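As a rough illustration of the deterministic-sampling idea (our simplified variant, not necessarily the UAE's exact scheme), the standard $2n+1$ sigma points of the Unscented Transform for a diagonal-Gaussian posterior can replace random posterior samples:

```python
import torch

def sigma_points(mu, sigma, kappa=0.0):
    # 2n+1 unscented-transform points for N(mu, diag(sigma^2)):
    # the mean, plus symmetric offsets along each latent axis.
    n = mu.shape[-1]
    scale = (n + kappa) ** 0.5 * sigma
    offsets = scale * torch.eye(n)
    pts = torch.cat([mu[None], mu + offsets, mu - offsets])
    w0 = kappa / (n + kappa)
    weights = torch.cat([torch.tensor([w0]),
                         torch.full((2 * n,), 1.0 / (2 * (n + kappa)))])
    return pts, weights
```

The decoder is evaluated at every sigma point and the outputs are combined with these weights, turning the Monte-Carlo expectation into a deterministic one.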
arXiv Detail & Related papers (2023-06-08T14:53:02Z)
- Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models [0.0]
Variational Auto-Encoders (VAEs) face challenges with the quality of generated images, often presenting noticeable blurriness.
This issue stems from the unrealistic assumption that approximates the conditional data distribution, $p(\textbf{x} | \textbf{z})$, as an isotropic Gaussian.
We illustrate how one can extract a latent space from a pre-existing diffusion model by optimizing an encoder to maximize the marginal data log-likelihood.
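The link between the isotropic-Gaussian assumption and blur is just arithmetic: the reconstruction term of the ELBO reduces to a scaled mean-squared error, which rewards averaged (hence blurry) outputs. A minimal check:

```python
import torch

def isotropic_gaussian_nll(x, mu, sigma2=1.0):
    # -log N(x; mu, sigma2 * I) = ||x - mu||^2 / (2 sigma2) + const,
    # i.e. a scaled MSE: optimizing it favors averaged, blurry outputs.
    d = x.numel()
    const = 0.5 * d * torch.log(torch.tensor(2 * torch.pi * sigma2))
    return (x - mu).pow(2).sum() / (2 * sigma2) + const
```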
arXiv Detail & Related papers (2023-04-24T14:44:47Z)
- Defending Variational Autoencoders from Adversarial Attacks with MCMC [74.36233246536459]
Variational autoencoders (VAEs) are deep generative models used in various domains.
As previous work has shown, one can easily fool VAEs into producing unexpected latent representations and reconstructions for a visually only slightly modified input.
Here, we examine several objective functions for constructing adversarial attacks, suggest metrics to assess model robustness, and propose a solution.
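A minimal sketch of the defense's general shape (a random-walk Metropolis refinement of the attacked encoding toward higher joint density; the paper's exact sampler and objective may differ, and `log_joint` is our assumed handle for $\log p(x \mid z) + \log p(z)$):

```python
import torch

def mcmc_refine(z, x, log_joint, steps=50, step_size=0.1):
    # Random-walk Metropolis in latent space: move the (possibly
    # attacked) encoding z toward higher log p(x, z) before decoding.
    for _ in range(steps):
        z_prop = z + step_size * torch.randn_like(z)
        log_alpha = log_joint(z_prop, x) - log_joint(z, x)
        if torch.log(torch.rand(())) < log_alpha:
            z = z_prop
    return z
```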
arXiv Detail & Related papers (2022-03-18T13:25:18Z)
- The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distribution and adopt Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, KFIoU, is easier to implement and works better than the exact SkewIoU.
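A sketch of the mechanism as we read it (boxes as 2-D Gaussians, overlap via the Kalman-filter product of Gaussians; the paper's loss additionally handles center distance and scaling, which we omit):

```python
import math
import torch

def box_to_gaussian(cx, cy, w, h, theta):
    # Rotated box -> 2-D Gaussian: mean at the center, covariance from
    # the squared half-extents rotated by the box angle.
    R = torch.tensor([[math.cos(theta), -math.sin(theta)],
                      [math.sin(theta),  math.cos(theta)]])
    S = torch.diag(torch.tensor([w / 2.0, h / 2.0])) ** 2
    return torch.tensor([cx, cy]), R @ S @ R.T

def kf_overlap(sigma1, sigma2):
    # Product of the two Gaussians via the Kalman update:
    # Sigma_k = Sigma1 - K Sigma1 with K = Sigma1 (Sigma1 + Sigma2)^{-1}.
    K = sigma1 @ torch.linalg.inv(sigma1 + sigma2)
    sigma_k = sigma1 - K @ sigma1
    v = lambda s: torch.sqrt(torch.linalg.det(s))  # Gaussian "volume"
    vk = v(sigma_k)
    # IoU-style ratio; note it peaks at 1/3 for identical boxes.
    return vk / (v(sigma1) + v(sigma2) - vk)
```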
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
- Certified Defense via Latent Space Randomized Smoothing with Orthogonal Encoders [13.723000245697866]
We investigate the possibility of performing randomized smoothing and establishing the robust certification in the latent space of a network.
We use modules whose Lipschitz property is known for free by design to propagate the certified radius estimated in the latent space back to the input space.
Experiments on CIFAR10 and ImageNet show that our method achieves competitive certified robustness with significantly improved efficiency at test time.
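A sketch of the certificate arithmetic (a Cohen-et-al-style smoothing radius computed in latent space, divided by a known Lipschitz bound of the input-to-latent modules; the names are ours, not the paper's):

```python
import torch

def certified_input_radius(p_a, p_b, sigma, lipschitz):
    # Certified radius from randomized smoothing with noise level sigma
    # in the LATENT space, given top-two class probabilities p_a > p_b.
    phi_inv = torch.distributions.Normal(0.0, 1.0).icdf
    r_latent = 0.5 * sigma * (phi_inv(torch.tensor(p_a))
                              - phi_inv(torch.tensor(p_b)))
    # An L-Lipschitz input->latent map moves points by at most L * ||dx||,
    # so a latent radius r certifies an input radius r / L.
    return r_latent / lipschitz
```

With orthogonal (norm-preserving) encoder modules, L = 1, so the latent radius carries over to the input space unchanged, which is presumably why the paper uses them.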
arXiv Detail & Related papers (2021-08-01T16:48:43Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
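A simplified stand-in for the idea (a forward hook collects one layer's activations and a Gaussian density model replaces the paper's normalizing flow; the layer choice and threshold are assumptions):

```python
import torch

class ActivationDensityDetector:
    # Simplified stand-in: a Gaussian fitted to one layer's activations
    # replaces DAAIN's normalizing flow; low log-density flags OOD/AA.
    def __init__(self, model, layer):
        self.model, self.acts = model, []
        layer.register_forward_hook(
            lambda m, i, o: self.acts.append(o.flatten(1).detach()))

    def fit(self, clean_loader):
        for x, _ in clean_loader:
            self.model(x)
        a = torch.cat(self.acts); self.acts.clear()
        cov = torch.cov(a.T) + 1e-4 * torch.eye(a.shape[1])
        self.dist = torch.distributions.MultivariateNormal(a.mean(0), cov)

    def score(self, x):
        self.model(x)
        return self.dist.log_prob(self.acts.pop())
```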
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Neighbor Embedding Variational Autoencoder [14.08587678497785]
We propose a novel model, neighbor embedding VAE (NE-VAE), which explicitly constrains the encoder to map inputs that are close in the input space to nearby points in the latent space.
In our experiments, NE-VAE produces qualitatively different latent representations, with the majority of latent dimensions remaining active.
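The constraint can be written as a regularizer added to the ELBO; a sketch of our assumed variant (the exact form of NE-VAE's term is in the paper):

```python
import torch

def neighbor_embedding_penalty(z, z_neighbors):
    # Encourage inputs that are close in input space to stay close in
    # latent space: z and z_neighbors encode neighboring input pairs.
    return (z - z_neighbors).pow(2).sum(dim=1).mean()

# total objective (sketch): ELBO plus the weighted neighbor term
# loss = recon_loss + beta * kl_loss + lam * neighbor_embedding_penalty(z, z_nb)
```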
arXiv Detail & Related papers (2021-03-21T09:49:12Z)
- Generating Out of Distribution Adversarial Attack using Latent Space Poisoning [5.1314136039587925]
We propose a novel mechanism for generating adversarial examples in which the actual image is not corrupted. Instead, the latent space representation is used to tamper with the inherent structure of the image.
As opposed to gradient-based attacks, latent space poisoning exploits the tendency of classifiers to model the independent and identically distributed (i.i.d.) structure of the training dataset.
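A minimal sketch of the mechanism (hypothetical `encoder`/`decoder`/`classifier` handles; the adversarial image stays on the decoder's manifold rather than carrying pixel-space noise):

```python
import torch

def latent_poison(encoder, decoder, classifier, x, target,
                  steps=100, lr=0.05):
    # Perturb the latent code, not the pixels: optimize z so the decoded
    # image is classified as `target` (a tensor of class indices) while
    # remaining a valid generation.
    z = encoder(x).detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(
            classifier(decoder(z)), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return decoder(z).detach()
```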
arXiv Detail & Related papers (2020-12-09T13:05:44Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
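The self-consistency notion is easy to state in code; a sketch under our naming (the paper's precise objective may differ):

```python
import torch

def self_consistency_loss(encoder, decoder, x):
    # A VAE is self-consistent if re-encoding its own reconstruction
    # returns the same latent code; penalize the mismatch.
    z = encoder(x)
    x_rec = decoder(z)
    z_rec = encoder(x_rec)
    return (z - z_rec).pow(2).sum(dim=1).mean()
```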
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Towards a Theoretical Understanding of the Robustness of Variational Autoencoders [82.68133908421792]
We make inroads into understanding the robustness of Variational Autoencoders (VAEs) to adversarial attacks and other input perturbations.
We develop a novel criterion for robustness in probabilistic models: $r$-robustness.
We show that VAEs trained using disentangling methods score well under our robustness metrics.
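The criterion lends itself to a Monte-Carlo estimate; a sketch of our reading of $r$-robustness (the reconstruction of a perturbed input should land within distance $r$ of the clean reconstruction with probability above one half; the paper's formal definition takes precedence):

```python
import torch

def r_robust_estimate(vae, x, delta, r, n_samples=100):
    # Estimate P(||vae(x + delta) - vae(x)|| <= r) by sampling the
    # stochastic encoder; deem the model r-robust to delta at x if
    # this probability exceeds 1/2.
    ref = vae(x)  # fixed reference reconstruction
    hits = sum(
        float((vae(x + delta) - ref).norm() <= r)
        for _ in range(n_samples))
    return hits / n_samples
```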
arXiv Detail & Related papers (2020-07-14T21:22:29Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective that is simple to optimize and prevents posterior collapse.
We show that the Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
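For reference, the Levenshtein (edit) distance at the heart of the objective, in its standard dynamic-programming form (the VAE objective built on top of it is defined in the paper):

```python
def levenshtein(a, b):
    # Minimum number of insertions, deletions, and substitutions
    # turning sequence a into sequence b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]
```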
arXiv Detail & Related papers (2020-04-30T13:27:26Z)