The Good, the Bad and the Ugly: Watermarks, Transferable Attacks and Adversarial Defenses
- URL: http://arxiv.org/abs/2410.08864v1
- Date: Fri, 11 Oct 2024 14:44:05 GMT
- Title: The Good, the Bad and the Ugly: Watermarks, Transferable Attacks and Adversarial Defenses
- Authors: Grzegorz Głuch, Berkant Turan, Sai Ganesh Nagarajan, Sebastian Pokutta
- Abstract summary: We formalize and extend existing definitions of backdoor-based watermarks and adversarial defenses as interactive protocols between two players.
For almost every discriminative learning task, at least one of the two -- a watermark or an adversarial defense -- exists.
We show that any task that satisfies our notion of a transferable attack implies a cryptographic primitive.
- Score: 21.975560789792073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formalize and extend existing definitions of backdoor-based watermarks and adversarial defenses as interactive protocols between two players. The existence of these schemes is inherently tied to the learning tasks for which they are designed. Our main result shows that for almost every discriminative learning task, at least one of the two -- a watermark or an adversarial defense -- exists. The term "almost every" indicates that we also identify a third, counterintuitive but necessary option, i.e., a scheme we call a transferable attack. By transferable attack, we refer to an efficient algorithm computing queries that look indistinguishable from the data distribution and fool all efficient defenders. To this end, we prove the necessity of a transferable attack via a construction that uses a cryptographic tool called homomorphic encryption. Furthermore, we show that any task that satisfies our notion of a transferable attack implies a cryptographic primitive, thus requiring the underlying task to be computationally complex. These two facts imply an "equivalence" between the existence of transferable attacks and cryptography. Finally, we show that the class of tasks of bounded VC-dimension has an adversarial defense, and a subclass of them has a watermark.
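As a rough, informal illustration of the kind of backdoor-based watermarking scheme the paper formalizes as an interactive protocol, the following Python sketch plants secret trigger-label pairs during training and later checks a suspect model's agreement on them. It is a toy sketch only: the trigger format, the training interface, and the acceptance threshold are invented for illustration and are not the paper's construction.

```python
import random

def plant_watermark(train_fn, dataset, num_triggers=16, seed=0):
    """Toy backdoor-based watermark: train on the task data augmented with
    secret trigger inputs that carry deliberately chosen labels."""
    rng = random.Random(seed)
    # Hypothetical trigger set: random 8-dimensional inputs with random binary labels.
    triggers = [(tuple(rng.random() for _ in range(8)), rng.randrange(2))
                for _ in range(num_triggers)]
    model = train_fn(dataset + triggers)   # owner trains on the data plus the triggers
    return model, triggers                 # the trigger set stays secret with the owner

def verify_watermark(suspect_model, triggers, threshold=0.9):
    """Verification: query the suspect model on the secret triggers and accept
    ownership if it agrees with the planted labels unusually often."""
    agree = sum(suspect_model(x) == y for x, y in triggers)
    return agree / len(triggers) >= threshold
```

Roughly, the adversary in such a protocol tries to produce a model that still performs well on the task yet fails this verification; the paper's trichotomy says that when neither a watermark nor an adversarial defense exists for a task, a transferable attack must.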
Related papers
- Backdoor defense, learnability and obfuscation [8.905450847393132]
We introduce a formal notion of defendability against backdoors using a game between an attacker and a defender.
Our definition is simple and does not explicitly mention learning, yet we demonstrate that it is closely connected to learnability.
arXiv Detail & Related papers (2024-09-04T21:05:42Z)
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods aim to address both of these issues well.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z)
- On the Difficulty of Defending Contrastive Learning against Backdoor Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z)
- Reverse engineering adversarial attacks with fingerprints from adversarial examples [0.0]
Adversarial examples are typically generated by an attack algorithm that optimizes a perturbation added to a benign input.
We take a "fight fire with fire" approach, training deep neural networks to classify these perturbations.
We achieve an accuracy of 99.4% with a ResNet50 model trained on the perturbations (a rough sketch of this idea follows below).
arXiv Detail & Related papers (2023-01-31T18:59:37Z)
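A minimal sketch of that "fight fire with fire" idea, assuming PyTorch/torchvision are available; the perturbation encoding, the three example attack classes, and the hyperparameters are illustrative guesses rather than the paper's actual setup.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Hypothetical setup: each training example is a (benign image, adversarial image)
# pair labeled by the attack algorithm that produced it (e.g., FGSM / PGD / CW).
NUM_ATTACK_CLASSES = 3

model = resnet50(num_classes=NUM_ATTACK_CLASSES)  # classifies perturbations, not images
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(benign, adversarial, attack_labels):
    """One step of learning to recognize which attack generated a perturbation."""
    delta = adversarial - benign          # the perturbation "fingerprint" being classified
    logits = model(delta)
    loss = criterion(logits, attack_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Classifying the residual delta rather than the attacked image is what lets the network act as a fingerprint reader for the attack algorithm.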
- Zero-Query Transfer Attacks on Context-Aware Object Detectors [95.18656036716972]
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results.
A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check.
We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check (sketched below).
arXiv Detail & Related papers (2022-03-29T04:33:06Z)
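For context, the context-consistency defense that these attacks are designed to evade can be caricatured as a co-occurrence test over the detected object classes; the table, class names, and threshold below are invented purely for illustration.

```python
from itertools import combinations

# Hypothetical co-occurrence scores estimated from benign scenes:
# how plausible it is to observe two object classes together.
CO_OCCURRENCE = {
    frozenset({"car", "stop sign"}): 0.9,
    frozenset({"car", "person"}): 0.8,
    frozenset({"boat", "stop sign"}): 0.05,
}

def context_consistent(detected_labels, threshold=0.1, default=0.5):
    """Toy context-consistency check: flag a detection set as suspicious if any
    pair of detected classes is implausible under benign co-occurrence statistics."""
    for a, b in combinations(set(detected_labels), 2):
        if CO_OCCURRENCE.get(frozenset({a, b}), default) < threshold:
            return False   # contextually inconsistent -> likely adversarial
    return True
```

A zero-query transfer attack, in this picture, must fool the detector while keeping the set of detected classes plausible enough to pass such a check.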
- Excess Capacity and Backdoor Poisoning [11.383869751239166]
A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set.
We present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems (see the toy example below).
arXiv Detail & Related papers (2021-09-02T03:04:38Z)
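A toy instance of the poisoning attack described in this entry, assuming tabular NumPy data; the trigger pattern (overwriting the last three features) and the 5% poisoning rate are arbitrary illustrative choices.

```python
import numpy as np

def poison_training_set(X, y, target_label, trigger_value=1.0,
                        poison_fraction=0.05, seed=0):
    """Toy backdoor data poisoning: copy a small fraction of the training points,
    stamp them with a fixed trigger pattern, and mislabel them with the attacker's
    target class, so that a model trained on the result tends to map the trigger
    to the target label while keeping clean accuracy largely intact."""
    rng = np.random.default_rng(seed)
    n_poison = max(1, int(poison_fraction * len(X)))
    idx = rng.choice(len(X), size=n_poison, replace=False)
    X_poison = X[idx].copy()
    X_poison[:, -3:] = trigger_value        # hypothetical trigger: fix the last 3 features
    y_poison = np.full(n_poison, target_label)
    return np.concatenate([X, X_poison]), np.concatenate([y, y_poison])
```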
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label from the object-level instead of the image-level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- Attack Agnostic Adversarial Defense via Visual Imperceptible Bound [70.72413095698961]
This research aims to design a defense model that is robust within a certain bound against both seen and unseen adversarial attacks.
The proposed defense model is evaluated on the MNIST, CIFAR-10, and Tiny ImageNet databases.
The proposed algorithm is attack agnostic, i.e. it does not require any knowledge of the attack algorithm.
arXiv Detail & Related papers (2020-10-25T23:14:26Z)
- Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending the cycle of defenses being broken by stronger attacks: we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)