Towards Robust Protective Perturbation against DeepFake Face Swapping
- URL: http://arxiv.org/abs/2512.07228v1
- Date: Mon, 08 Dec 2025 07:12:43 GMT
- Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
- Authors: Hengyang Yao, Lin Li, Ke Sun, Jianing Qiu, Huiping Chen,
- Abstract summary: DeepFake face swapping enables highly realistic identity forgeries, posing serious privacy and security risks. A common defence embeds invisible perturbations into images, but these are fragile and often destroyed by basic transformations such as compression or resizing. In this paper, we first conduct a systematic analysis of 30 transformations across six categories and show that protection robustness is highly sensitive to the choice of training transformations. Motivated by this, we propose Expectation Over Learned distribution of Transformation, a framework that treats the transformation distribution as a learnable component rather than a fixed design choice.
- Score: 9.722447815149318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: DeepFake face swapping enables highly realistic identity forgeries, posing serious privacy and security risks. A common defence embeds invisible perturbations into images, but these are fragile and often destroyed by basic transformations such as compression or resizing. In this paper, we first conduct a systematic analysis of 30 transformations across six categories and show that protection robustness is highly sensitive to the choice of training transformations, making the standard Expectation over Transformation (EOT) with uniform sampling fundamentally suboptimal. Motivated by this, we propose Expectation Over Learned distribution of Transformation (EOLT), a framework that treats the transformation distribution as a learnable component rather than a fixed design choice. Specifically, EOLT employs a policy network that learns to automatically prioritize critical transformations and adaptively generate instance-specific perturbations via reinforcement learning, enabling explicit modeling of defensive bottlenecks while maintaining broad transferability. Extensive experiments demonstrate that our method achieves substantial improvements over state-of-the-art approaches, with 26% higher average robustness and up to 30% gains on challenging transformation categories.
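To ground the setup, the sketch below shows how a protective perturbation is typically optimized under Expectation over Transformation, plus a minimal learned-distribution variant in the spirit of EOLT, where a categorical distribution over transformations is updated with a REINFORCE-style gradient. This is an illustrative reconstruction from the abstract only, not the authors' code: the objective, the policy parameterization (plain logits instead of the paper's policy network), and names such as face_swap_model and transforms are assumptions, and the transformations are assumed to have differentiable surrogates.

```python
import torch
import torch.nn.functional as F

def protect(x, face_swap_model, transforms, steps=200, eps=8 / 255, lr=1e-2,
            learn_distribution=True):
    """Sketch of EOT-style protective perturbation (illustrative only).

    learn_distribution=False: standard EOT, transformations sampled uniformly.
    learn_distribution=True:  a categorical distribution over transformations
    is learned with a REINFORCE-style update, loosely mirroring EOLT's idea
    of prioritizing the transformations that most weaken the protection.
    """
    delta = torch.zeros_like(x, requires_grad=True)            # protective noise
    logits = torch.zeros(len(transforms), requires_grad=True)  # sampling weights
    opt_delta = torch.optim.Adam([delta], lr=lr)
    opt_policy = torch.optim.Adam([logits], lr=lr)

    for _ in range(steps):
        dist = torch.distributions.Categorical(logits=logits)
        idx = dist.sample() if learn_distribution else \
            torch.randint(len(transforms), ())
        t = transforms[int(idx)]  # assumed differentiable w.r.t. its input

        # Protection objective: make the swap output of the protected,
        # transformed image diverge from the clean swap output.
        distortion = F.mse_loss(face_swap_model(t(torch.clamp(x + delta, 0, 1))),
                                face_swap_model(t(x)).detach())
        opt_delta.zero_grad()
        (-distortion).backward()        # maximize distortion
        opt_delta.step()
        delta.data.clamp_(-eps, eps)    # keep the perturbation invisible

        if learn_distribution:
            # Reward transformations under which the protection currently
            # fails (small distortion), so sampling focuses on bottlenecks.
            reward = -distortion.detach()
            opt_policy.zero_grad()
            (-dist.log_prob(idx) * reward).backward()
            opt_policy.step()

    return delta.detach()
```

The design point mirrored here is that the sampling distribution itself receives a learning signal: transformations that currently defeat the protection are rewarded, so optimization effort concentrates on the defensive bottlenecks the abstract describes, rather than being spread uniformly as in standard EOT.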
Related papers
- A Constrained Optimization Perspective of Unrolled Transformers [77.12297732942095]
We introduce a constrained optimization framework for training transformers that behave like descent-style optimization algorithms. We observe that constrained transformers achieve stronger robustness to perturbations and maintain higher out-of-distribution generalization.
arXiv Detail & Related papers (2026-01-24T02:12:39Z) - Deep Leakage with Generative Flow Matching Denoiser [54.05993847488204]
We introduce a new deep leakage (DL) attack that integrates a generative Flow Matching (FM) prior into the reconstruction process. Our approach consistently outperforms state-of-the-art attacks across pixel-level, perceptual, and feature-based similarity metrics.
arXiv Detail & Related papers (2026-01-21T14:51:01Z) - Proxy Robustness in Vision Language Models is Effortlessly Transferable [13.390016978827163]
Adversarial robustness transfer via distillation, a pivotal technique for improving the defense of deep models, has demonstrated remarkable success in conventional image classification tasks. We bridge this gap by revealing an interesting phenomenon: vanilla CLIP (without adversarial training) exhibits intrinsic defensive capabilities against adversarial examples. We formally define this as proxy adversarial robustness and propose a Heterogeneous Proxy Transfer framework.
arXiv Detail & Related papers (2026-01-19T09:23:11Z) - Enhancing Variational Autoencoders with Smooth Robust Latent Encoding [54.74721202894622]
Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models. We introduce Smooth Robust Latent VAE (SRL-VAE), a novel adversarial training framework that boosts both generation quality and robustness. Experiments show that SRL-VAE improves both generation quality (in image reconstruction and text-guided image editing) and robustness (against Nightshade attacks and image editing attacks).
arXiv Detail & Related papers (2025-04-24T03:17:57Z) - Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift. Current approaches typically address this issue through online sampling from the target policy. We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z) - Enhancing Adversarial Transferability via Component-Wise Transformation [28.209214055953844]
This paper proposes a novel input-based attack method, termed Component-Wise Transformation (CWT). CWT applies selective rotation to individual image blocks, ensuring that each transformed image highlights different target regions (a hedged sketch of this block rotation appears after this list). Experiments on the standard ImageNet dataset show that CWT consistently outperforms state-of-the-art methods in both attack success rates and stability.
arXiv Detail & Related papers (2025-01-21T05:41:09Z) - Semantic-Aligned Adversarial Evolution Triangle for High-Transferability Vision-Language Attack [51.16384207202798]
Vision-language pre-training models are vulnerable to multimodal adversarial examples (AEs).
Previous approaches augment image-text pairs to enhance diversity within the adversarial example generation process.
We propose sampling from adversarial evolution triangles composed of clean, historical, and current adversarial examples to enhance adversarial diversity.
arXiv Detail & Related papers (2024-11-04T23:07:51Z) - Enhancing Transferability of Targeted Adversarial Examples: A Self-Universal Perspective [13.557972227440832]
Transfer-based targeted adversarial attacks against black-box deep neural networks (DNNs) have been proven to be significantly more challenging than untargeted ones.
The impressive transferability of current state-of-the-art generative methods comes at the cost of requiring massive amounts of additional data and time-consuming training for each targeted label.
We offer a self-universal perspective that unveils the great yet underexplored potential of input transformations in pursuing this goal.
arXiv Detail & Related papers (2024-07-22T14:51:28Z) - Transform-Dependent Adversarial Attacks [15.374381635334897]
We introduce transform-dependent adversarial attacks on deep networks. Our perturbations exhibit metamorphic properties, enabling diverse adversarial effects as a function of transformation parameters. We show that transform-dependent perturbations achieve high targeted attack success rates, outperforming state-of-the-art transfer attacks by 17-31% in black-box scenarios.
arXiv Detail & Related papers (2024-06-12T17:31:36Z) - Stabilizing Transformer Training by Preventing Attention Entropy Collapse [56.45313891694746]
We investigate the training dynamics of Transformers by examining the evolution of the attention layers.
We show that $\sigma$Reparam, a reparameterization of linear-layer weights based on their spectral norm, successfully prevents entropy collapse in the attention layers, promoting more stable training. We conduct experiments with $\sigma$Reparam on image classification, image self-supervised learning, machine translation, speech recognition, and language modeling tasks.
arXiv Detail & Related papers (2023-03-11T03:30:47Z) - Exploring Transferable and Robust Adversarial Perturbation Generation from the Perspective of Network Hierarchy [52.153866313879924]
The transferability and robustness of adversarial examples are two practical yet important properties for black-box adversarial attacks.
We propose a transferable and robust adversarial generation (TRAP) method.
Our TRAP achieves impressive transferability and high robustness against certain interferences.
arXiv Detail & Related papers (2021-08-16T11:52:41Z) - Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques (photometric noise, flipping, and scaling) and ensure consistency of the semantic predictions.
We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
arXiv Detail & Related papers (2021-04-30T21:32:40Z) - Where Does the Robustness Come from? A Study of the Transformation-based Ensemble Defence [12.973226757056462]
It is not clear whether the robustness improvement comes from the transformations or from the ensembling. We conduct experiments to show that 1) the transferability of adversarial examples exists among models trained on data records after different reversible transformations; 2) the robustness gained through a transformation-based ensemble is limited; and 3) this limited robustness mainly comes from the irreversible transformations rather than from ensembling a number of models.
arXiv Detail & Related papers (2020-09-28T02:55:56Z) - TSS: Transformation-Specific Smoothing for Robustness Certification [37.87602431929278]
Motivated adversaries can mislead machine learning systems by perturbing test data using semantic transformations.
We provide TSS, a unified framework for certifying ML robustness against general adversarial semantic transformations.
We show TSS is the first approach that achieves nontrivial certified robustness on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2020-02-27T19:19:32Z)
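For the Component-Wise Transformation entry above, the block-wise rotation idea can be made concrete. The following is a hedged sketch under our own assumptions: the grid size, rotation by multiples of 90 degrees, and the selection probability p are illustrative choices, not the paper's specification.

```python
import torch

def component_wise_transform(x, grid=4, p=0.7, generator=None):
    """Illustrative block-wise selective rotation in the spirit of CWT.

    x: image tensor of shape (C, H, W) with H and W divisible by `grid`.
    Each grid cell is independently rotated by a random multiple of 90
    degrees with probability p, so different transformed copies highlight
    different image regions.
    """
    c, h, w = x.shape
    bh, bw = h // grid, w // grid
    assert bh == bw, "90-degree rotation needs square blocks"
    out = x.clone()
    for i in range(grid):
        for j in range(grid):
            if torch.rand((), generator=generator) < p:
                # Rotate this block by 90, 180, or 270 degrees.
                k = int(torch.randint(1, 4, (), generator=generator))
                block = out[:, i*bh:(i+1)*bh, j*bw:(j+1)*bw]
                out[:, i*bh:(i+1)*bh, j*bw:(j+1)*bw] = \
                    torch.rot90(block, k, dims=(1, 2))
    return out
```

Sampling this transform repeatedly yields copies of the image in which different blocks survive upright, which is the property the entry attributes to CWT: each transformed image highlights different target regions.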
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences.