A Unified Framework for Adversarial Attack and Defense in Constrained
Feature Space
- URL: http://arxiv.org/abs/2112.01156v1
- Date: Thu, 2 Dec 2021 12:05:27 GMT
- Title: A Unified Framework for Adversarial Attack and Defense in Constrained
Feature Space
- Authors: Thibault Simonetto, Salijona Dyrmishi, Salah Ghamizi, Maxime Cordy,
Yves Le Traon
- Abstract summary: We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints.
Our framework forms the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
- Score: 13.096022606256973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generation of feasible adversarial examples is necessary for properly
assessing models that work in a constrained feature space. However, it remains
challenging to enforce these constraints in attacks that were designed for
computer vision. We propose a unified framework to generate feasible
adversarial examples that satisfy given domain constraints. Our framework
supports the use cases reported in the literature and can handle both linear
and non-linear constraints. We instantiate our framework into two algorithms: a
gradient-based attack that introduces constraints in the loss function to
maximize, and a multi-objective search algorithm that aims for
misclassification, perturbation minimization, and constraint satisfaction. We
show that our approach is effective on two datasets from different domains,
with a success rate of up to 100%, where state-of-the-art attacks fail to
generate a single feasible example. In addition to adversarial retraining, we
propose to introduce engineered non-convex constraints to improve model
adversarial robustness. We demonstrate that this new defense is as effective as
adversarial retraining. Our framework forms the starting point for research on
constrained adversarial attacks and provides relevant baselines and datasets
that future research can exploit.
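To make the first algorithm concrete, the sketch below shows one common way to fold domain constraints into the loss of a gradient-based attack: the attacker maximizes the classification loss while penalizing constraint violations, so the search is steered toward feasible examples. This is a minimal, hypothetical illustration; the function name, the penalty weight `lam`, the constraint callables, and the L-infinity projection are assumptions and do not reproduce the paper's exact formulation (the multi-objective variant instead treats misclassification, perturbation size, and constraint satisfaction as separate objectives).

    import torch
    import torch.nn.functional as F

    def constrained_gradient_attack(model, x, y, constraints, eps=0.1, alpha=0.01,
                                    steps=50, lam=1.0):
        """Gradient attack with domain constraints folded into the loss (illustrative).

        `constraints` is a list of callables g(x) that return a tensor which is
        >= 0 for feasible inputs; violations are penalized with weight `lam`.
        All names and defaults here are assumptions, not the paper's API.
        """
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            ce = F.cross_entropy(model(x_adv), y)                            # push toward misclassification
            violation = sum(F.relu(-g(x_adv)).mean() for g in constraints)   # penalize infeasibility
            loss = ce - lam * violation
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()                     # ascent step on the penalized loss
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)            # project back into the L_inf budget
        return x_adv.detach()

In a tabular setting, a constraint callable could, for example, return x[:, 0] - x[:, 1] to encode the requirement that the first feature is at least as large as the second.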
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
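A rough sketch of this detection step follows, with hypothetical callables for the VLM captioner, the Text-to-Image generator, and the image encoder (the actual models, embeddings, and threshold used by MirrorCheck are not specified here):

    import torch.nn.functional as F

    def mirrorcheck_style_detect(image, caption_fn, t2i_fn, encode_fn, threshold=0.7):
        """Flag an input as adversarial if the image regenerated from the VLM's
        caption drifts away from the original in embedding space (illustrative).

        caption_fn: image -> caption produced by the target VLM (hypothetical)
        t2i_fn:     caption -> regenerated image via a Text-to-Image model (hypothetical)
        encode_fn:  image -> embedding tensor (hypothetical)
        """
        caption = caption_fn(image)
        regenerated = t2i_fn(caption)
        sim = F.cosine_similarity(encode_fn(image), encode_fn(regenerated), dim=-1)
        return bool(sim.mean() < threshold)   # low similarity -> likely adversarial

The intuition, as suggested by the summary above, is that an adversarial input drives the VLM to produce a caption whose regenerated image no longer resembles the input.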
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
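As a loose illustration of a KL-regularized reweighting (a generic form, not the paper's doubly robust estimator): with a KL penalty against the uniform distribution, the worst-case instance weights have a closed form, namely a softmax of the per-example losses.

    import torch

    def kl_reweighted_loss(per_example_adv_losses, tau=1.0):
        """Weight each example's adversarial loss by a KL-regularized worst-case
        distribution; with KL-penalty strength `tau`, the optimal weights are a
        softmax of the losses (exponential tilting). Illustrative sketch only.
        """
        weights = torch.softmax(per_example_adv_losses.detach() / tau, dim=0)
        return (weights * per_example_adv_losses).sum()

The temperature tau controls how sharply training focuses on the hardest examples.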
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Pragmatic Fairness: Developing Policies with Outcome Disparity Control [15.618754942472822]
We introduce a causal framework for designing optimal policies that satisfy fairness constraints.
We propose two different fairness constraints: a moderation breaking constraint and an equal benefit constraint.
arXiv Detail & Related papers (2023-01-28T19:25:56Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - On the Robustness of Domain Constraints [0.4194295877935867]
It is unclear if adversarial examples represent realistic inputs in the modeled domains.
In this paper, we explore how domain constraints limit adversarial capabilities.
We show how the learned constraints can be integrated into the adversarial crafting process.
arXiv Detail & Related papers (2021-05-18T15:49:55Z) - A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack
and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems confirm the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z) - Opportunities and Challenges in Deep Learning Adversarial Robustness: A
Survey [1.8782750537161614]
This paper studies strategies for implementing adversarially robust training algorithms to help guarantee the safety of machine learning systems.
We provide a taxonomy to classify adversarial attacks and defenses, formulate the Robust Optimization problem in a min-max setting, and divide it into 3 subcategories, namely: Adversarial (re)Training, Regularization Approach, and Certified Defenses.
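For reference, the min-max robust optimization problem referred to here is conventionally written as follows (standard formulation, not quoted from the survey itself):

    \min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \Big[ \max_{\|\delta\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x + \delta), y\big) \Big]

where the inner maximization corresponds to the adversarial attack and the outer minimization to adversarial (re)training.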
arXiv Detail & Related papers (2020-07-01T21:00:32Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.