Protecting Classifiers From Attacks. A Bayesian Approach
- URL: http://arxiv.org/abs/2004.08705v1
- Date: Sat, 18 Apr 2020 21:21:56 GMT
- Title: Protecting Classifiers From Attacks. A Bayesian Approach
- Authors: Victor Gallego, Roi Naveiro, Alberto Redondo, David Rios Insua,
Fabrizio Ruggeri
- Abstract summary: We provide an alternative Bayesian framework that accounts for the lack of precise knowledge about the attacker's behavior using adversarial risk analysis.
We propose a sampling procedure based on approximate Bayesian computation, in which we simulate the attacker's problem taking into account our uncertainty about their elements.
For large-scale problems, we propose an alternative, scalable approach that can be used with differentiable classifiers.
- Score: 0.9449650062296823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classification problems in security settings are usually modeled as
confrontations in which an adversary tries to fool a classifier by manipulating
the covariates of instances to obtain a benefit. Most approaches to such
problems have focused on game-theoretic ideas with strong underlying common
knowledge assumptions, which are not realistic in the security realm. We
provide an alternative Bayesian framework that accounts for the lack of precise
knowledge about the attacker's behavior using adversarial risk analysis. A key
ingredient required by our framework is the ability to sample from the
distribution of originating instances given the possibly attacked observed one.
We propose a sampling procedure based on approximate Bayesian computation, in
which we simulate the attacker's problem taking into account our uncertainty
about their elements. For large-scale problems, we propose an alternative,
scalable approach that can be used with differentiable classifiers. Within it,
we move the computational load to the training phase by simulating attacks from
an adversary and adapting the framework to obtain a classifier robustified
against attacks.
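To make the sampling idea above concrete, the following is a minimal sketch, assuming a generic approximate Bayesian computation loop: draw a candidate clean instance, draw the attacker's unknown elements from our beliefs, simulate the resulting attack, and accept the candidate when the simulated attack lands close to the observed instance. The helpers sample_clean_instance, sample_attacker_params, simulate_attack and classifier_proba are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def abc_originating_samples(x_obs, sample_clean_instance, sample_attacker_params,
                            simulate_attack, tol, n_samples=1000, max_tries=100000):
    """Sketch of ABC sampling from p(originating instance | possibly attacked x_obs)."""
    accepted = []
    for _ in range(max_tries):
        x = sample_clean_instance()             # candidate clean instance
        theta = sample_attacker_params()        # attacker's elements drawn from our beliefs
        x_attacked = simulate_attack(x, theta)  # simulate the attacker's problem
        if np.linalg.norm(np.asarray(x_attacked) - np.asarray(x_obs)) <= tol:
            accepted.append(x)                  # ABC acceptance step
            if len(accepted) >= n_samples:
                break
    return accepted

def robust_predict_proba(x_obs, classifier_proba, **abc_kwargs):
    """Average class probabilities over plausible originating instances."""
    originals = abc_originating_samples(x_obs, **abc_kwargs)
    if not originals:
        return classifier_proba(x_obs)          # fall back to the raw observation
    return np.mean([classifier_proba(x) for x in originals], axis=0)
```

The scalable variant mentioned in the abstract would instead simulate such attacks while training a differentiable classifier, so the computational load is paid once during training rather than at prediction time.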
Related papers
- Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied.
We consider a self-interested attacker who wishes to disrupt a decision-maker's conditional inference and subsequent actions by corrupting a set of evidentiary variables.
To avoid detection, the attacker also desires the attack to appear plausible, where plausibility is determined by the density of the corrupted evidence.
arXiv Detail & Related papers (2024-11-21T17:46:55Z)
- Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g., self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose BASAR, the first black-box adversarial attack approach for skeleton-based HAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z)
- Towards Fair Classification against Poisoning Attacks [52.57443558122475]
We study the poisoning scenario where the attacker can insert a small fraction of samples into training data.
We propose a general and theoretically guaranteed framework that adapts traditional defense methods to fair classification against poisoning attacks.
arXiv Detail & Related papers (2022-10-18T00:49:58Z)
- Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders [8.828396559882954]
We propose a substitution-based adversarial attack algorithm, which modifies the input sequence by selecting certain vulnerable elements and substituting them with adversarial items.
We also design an efficient adversarial defense method called Dirichlet neighborhood sampling.
In particular, we represent selected items with one-hot encodings and perform gradient ascent on the encodings to search for the worst-case linear combination of item embeddings during training (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-07-19T00:19:13Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Localized Uncertainty Attacks [9.36341602283533]
We present localized uncertainty attacks against deep learning models.
We create adversarial examples by perturbing only regions in the inputs where a classifier is uncertain.
Unlike $\ell_p$-ball or functional attacks, which perturb inputs indiscriminately, our targeted changes can be less perceptible.
arXiv Detail & Related papers (2021-06-17T03:07:22Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
However, models trained on seen types of adversarial examples generally cannot generalize well to unseen types.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features to be a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster and to leverage that distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- Adversarial Feature Selection against Evasion Attacks [17.98312950660093]
We propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks.
We focus on an efficient, wrapper-based implementation of our approach, and experimentally validate its soundness on different application examples.
arXiv Detail & Related papers (2020-05-25T15:05:51Z)
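As referenced in the entry on substitution-based profile pollution attacks, here is a minimal sketch of the gradient-ascent-on-encodings idea, assuming a PyTorch model: relax the selected item's one-hot encoding onto the probability simplex, ascend the training loss, and read off the resulting worst-case linear combination of item embeddings. The names embedding_table and loss_closure, and all hyperparameters, are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def worst_case_item_embedding(embedding_table, item_id, loss_closure, steps=5, lr=1.0):
    """Relax a one-hot item encoding and ascend the loss to find a worst-case
    linear combination of item embeddings (illustrative sketch only)."""
    n_items, _ = embedding_table.shape
    logits = torch.zeros(n_items)
    logits[item_id] = 5.0                       # start near the original one-hot encoding
    logits.requires_grad_(True)
    for _ in range(steps):
        weights = F.softmax(logits, dim=0)      # stay on the probability simplex
        mixed = weights @ embedding_table       # linear combination of item embeddings
        loss = loss_closure(mixed)              # model loss with the item replaced by `mixed`
        grad, = torch.autograd.grad(loss, logits)
        with torch.no_grad():
            logits += lr * grad                 # gradient ascent: increase the loss
    with torch.no_grad():
        return F.softmax(logits, dim=0) @ embedding_table
```

During adversarial training, the returned embedding would stand in for the original item so that the recommender learns to resist such substitutions.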