Explanation-Guided Diagnosis of Machine Learning Evasion Attacks
- URL: http://arxiv.org/abs/2106.15820v1
- Date: Wed, 30 Jun 2021 05:47:12 GMT
- Title: Explanation-Guided Diagnosis of Machine Learning Evasion Attacks
- Authors: Abderrahmen Amich, Birhanu Eshete
- Abstract summary: We introduce a novel framework that harnesses explainable ML methods to guide high-fidelity assessment of ML evasion attacks.
Our framework enables explanation-guided correlation analysis between pre-evasion perturbations and post-evasion explanations.
- Score: 3.822543555265593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine Learning (ML) models are susceptible to evasion attacks. Evasion
accuracy is typically assessed using aggregate evasion rate, and it is an open
question whether aggregate evasion rate enables feature-level diagnosis on the
effect of adversarial perturbations on evasive predictions. In this paper, we
introduce a novel framework that harnesses explainable ML methods to guide
high-fidelity assessment of ML evasion attacks. Our framework enables
explanation-guided correlation analysis between pre-evasion perturbations and
post-evasion explanations. Towards systematic assessment of ML evasion attacks,
we propose and evaluate a novel suite of model-agnostic metrics for
sample-level and dataset-level correlation analysis. Using malware and image
classifiers, we conduct comprehensive evaluations across diverse model
architectures and complementary feature representations. Our explanation-guided
correlation analysis reveals correlation gaps between adversarial samples and
the corresponding perturbations performed on them. Using a case study on
explanation-guided evasion, we show the broader usage of our methodology for
assessing robustness of ML models.
Related papers
- Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied.
We consider a self-interested attacker who wishes to disrupt a decisionmaker's conditional inference and subsequent actions by corrupting a set of evidentiary variables.
To avoid detection, the attacker also desires the attack to appear plausible wherein plausibility is determined by the density of the corrupted evidence.
arXiv Detail & Related papers (2024-11-21T17:46:55Z) - Empirical Perturbation Analysis of Linear System Solvers from a Data Poisoning Perspective [16.569765598914152]
We investigate how errors in the input data will affect the fitting error and accuracy of the solution from a linear system-solving algorithm under perturbations common in adversarial attacks.
We propose data perturbation through two distinct knowledge levels, developing a poisoning optimization and studying two methods of perturbation: Label-guided Perturbation (LP) and Unconditioning Perturbation (UP)
Under the circumstance that the data is intentionally perturbed -- as is the case with data poisoning -- we seek to understand how different kinds of solvers react to these perturbations, identifying those algorithms most impacted by different types of adversarial attacks.
arXiv Detail & Related papers (2024-10-01T17:14:05Z) - Exploiting the Data Gap: Utilizing Non-ignorable Missingness to Manipulate Model Learning [13.797822374912773]
Adversarial Missingness (AM) attacks are motivated by maliciously engineering non-ignorable missingness mechanisms.
In this work we focus on associational learning in the context of AM attacks.
We formulate the learning of the adversarial missingness mechanism as a bi-level optimization.
arXiv Detail & Related papers (2024-09-06T17:10:28Z) - Revisiting Spurious Correlation in Domain Generalization [12.745076668687748]
We build a structural causal model (SCM) to describe the causality within data generation process.
We further conduct a thorough analysis of the mechanisms underlying spurious correlation.
In this regard, we propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator.
arXiv Detail & Related papers (2024-06-17T13:22:00Z) - Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning [8.808963973962278]
We analyse the relationship between privacy vulnerability and dataset properties, such as examples per class and number of classes.
We derive per-example MIA vulnerability in terms of score distributions and statistics computed from shadow models.
arXiv Detail & Related papers (2024-02-07T14:23:01Z) - Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z) - Balancing detectability and performance of attacks on the control
channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs)
This research is motivated by the recent interest of the research community for adversarial and poisoning attacks applied to MDPs, and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z) - EG-Booster: Explanation-Guided Booster of ML Evasion Attacks [3.822543555265593]
We present a novel approach called EG-Booster that leverages techniques from explainable ML to guide adversarial example crafting.
EG-Booster is agnostic to model architecture, threat model, and supports diverse distance metrics used previously in the literature.
Our findings suggest that EG-Booster significantly improves evasion rate of state-of-the-art attacks while performing less number of perturbations.
arXiv Detail & Related papers (2021-08-31T15:36:16Z) - ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine
Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z) - SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.