A Unified Framework for Adversarial Attack and Defense in Constrained
Feature Space
- URL: http://arxiv.org/abs/2112.01156v1
- Date: Thu, 2 Dec 2021 12:05:27 GMT
- Title: A Unified Framework for Adversarial Attack and Defense in Constrained
Feature Space
- Authors: Thibault Simonetto, Salijona Dyrmishi, Salah Ghamizi, Maxime Cordy,
Yves Le Traon
- Abstract summary: We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints.
Our framework forms the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
- Score: 13.096022606256973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generation of feasible adversarial examples is necessary for properly
assessing models that work in a constrained feature space. However, it remains
challenging to enforce these constraints in attacks that were designed for
computer vision. We propose a unified framework to generate feasible
adversarial examples that satisfy given domain constraints. Our framework
supports the use cases reported in the literature and can handle both linear
and non-linear constraints. We instantiate our framework into two algorithms: a
gradient-based attack that introduces constraints in the loss function to
maximize, and a multi-objective search algorithm that aims for
misclassification, perturbation minimization, and constraint satisfaction. We
show that our approach is effective on two datasets from different domains,
with a success rate of up to 100%, where state-of-the-art attacks fail to
generate a single feasible example. In addition to adversarial retraining, we
propose to introduce engineered non-convex constraints to improve model
adversarial robustness. We demonstrate that this new defense is as effective as
adversarial retraining. Our framework forms the starting point for research on
constrained adversarial attacks and provides relevant baselines and datasets
that future research can exploit.
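To make the first algorithm concrete, the sketch below shows one common way to fold domain constraints into the loss of a gradient-based attack: the attacker maximizes the classification loss while penalizing constraint violations, so the search is steered toward feasible examples. This is a minimal, hypothetical illustration; the function name, the penalty weight `lam`, the constraint callables, and the L-infinity projection are assumptions and do not reproduce the paper's exact formulation (the multi-objective variant instead treats misclassification, perturbation size, and constraint satisfaction as separate objectives).

    import torch
    import torch.nn.functional as F

    def constrained_gradient_attack(model, x, y, constraints, eps=0.1, alpha=0.01,
                                    steps=50, lam=1.0):
        """Gradient attack with domain constraints folded into the loss (illustrative).

        `constraints` is a list of callables g(x) that return a tensor which is
        >= 0 for feasible inputs; violations are penalized with weight `lam`.
        All names and defaults here are assumptions, not the paper's API.
        """
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            ce = F.cross_entropy(model(x_adv), y)                            # push toward misclassification
            violation = sum(F.relu(-g(x_adv)).mean() for g in constraints)   # penalize infeasibility
            loss = ce - lam * violation
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()                     # ascent step on the penalized loss
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)            # project back into the L_inf budget
        return x_adv.detach()

In a tabular setting, a constraint callable could, for example, return x[:, 0] - x[:, 1] to encode the requirement that the first feature is at least as large as the second.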
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
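A rough sketch of this detection step follows, with hypothetical callables for the VLM captioner, the Text-to-Image generator, and the image encoder (the actual models, embeddings, and threshold used by MirrorCheck are not specified here):

    import torch.nn.functional as F

    def mirrorcheck_style_detect(image, caption_fn, t2i_fn, encode_fn, threshold=0.7):
        """Flag an input as adversarial if the image regenerated from the VLM's
        caption drifts away from the original in embedding space (illustrative).

        caption_fn: image -> caption produced by the target VLM (hypothetical)
        t2i_fn:     caption -> regenerated image via a Text-to-Image model (hypothetical)
        encode_fn:  image -> embedding tensor (hypothetical)
        """
        caption = caption_fn(image)
        regenerated = t2i_fn(caption)
        sim = F.cosine_similarity(encode_fn(image), encode_fn(regenerated), dim=-1)
        return bool(sim.mean() < threshold)   # low similarity -> likely adversarial

The intuition, as suggested by the summary above, is that an adversarial input drives the VLM to produce a caption whose regenerated image no longer resembles the input.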
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
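As a loose illustration of a KL-regularized reweighting (a generic form, not the paper's doubly robust estimator): with a KL penalty against the uniform distribution, the worst-case instance weights have a closed form, namely a softmax of the per-example losses.

    import torch

    def kl_reweighted_loss(per_example_adv_losses, tau=1.0):
        """Weight each example's adversarial loss by a KL-regularized worst-case
        distribution; with KL-penalty strength `tau`, the optimal weights are a
        softmax of the losses (exponential tilting). Illustrative sketch only.
        """
        weights = torch.softmax(per_example_adv_losses.detach() / tau, dim=0)
        return (weights * per_example_adv_losses).sum()

The temperature tau controls how sharply training focuses on the hardest examples.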
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Pragmatic Fairness: Developing Policies with Outcome Disparity Control [15.618754942472822]
We introduce a causal framework for designing optimal policies that satisfy fairness constraints.
We propose two different fairness constraints: a moderation breaking constraint and an equal benefit constraint.
arXiv Detail & Related papers (2023-01-28T19:25:56Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - On the Robustness of Domain Constraints [0.4194295877935867]
It is unclear if adversarial examples represent realistic inputs in the modeled domains.
In this paper, we explore how domain constraints limit adversarial capabilities.
We show how the learned constraints can be integrated into the adversarial crafting process.
arXiv Detail & Related papers (2021-05-18T15:49:55Z) - A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack
and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems confirm the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z) - Opportunities and Challenges in Deep Learning Adversarial Robustness: A
Survey [1.8782750537161614]
This paper studies strategies for implementing adversarially robust training algorithms to help guarantee the safety of machine learning systems.
We provide a taxonomy to classify adversarial attacks and defenses, formulate the Robust Optimization problem in a min-max setting, and divide it into 3 subcategories, namely: Adversarial (re)Training, Regularization Approach, and Certified Defenses.
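For reference, the min-max robust optimization problem referred to here is conventionally written as follows (standard formulation, not quoted from the survey itself):

    \min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \Big[ \max_{\|\delta\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x + \delta), y\big) \Big]

where the inner maximization corresponds to the adversarial attack and the outer minimization to adversarial (re)training.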
arXiv Detail & Related papers (2020-07-01T21:00:32Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.