Selecting Models based on the Risk of Damage Caused by Adversarial Attacks
- URL: http://arxiv.org/abs/2301.12151v1
- Date: Sat, 28 Jan 2023 10:24:38 GMT
- Authors: Jona Klemenc, Holger Trittenbach
- Abstract summary: Regulation, legal liabilities, and societal concerns challenge the adoption of AI in safety- and security-critical applications.
One of the key concerns is that adversaries can cause harm by manipulating model predictions without being detected.
We propose a method to model and statistically estimate the probability of damage arising from adversarial attacks.
- Score: 2.969705152497174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regulation, legal liabilities, and societal concerns challenge the adoption
of AI in safety- and security-critical applications. One of the key concerns is
that adversaries can cause harm by manipulating model predictions without being
detected. Regulation hence demands an assessment of the risk of damage caused
by adversaries. Yet, there is no method to translate this high-level demand
into actionable metrics that quantify the risk of damage.
In this article, we propose a method to model and statistically estimate the
probability of damage arising from adversarial attacks. We show that our
proposed estimator is statistically consistent and unbiased. In experiments, we
demonstrate that the estimation results of our method have a clear and
actionable interpretation and outperform conventional metrics. We then show how
operators can use the estimation results to reliably select the model with the
lowest risk.
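The abstract does not spell out the estimator, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each simulated attack on a model yields a binary outcome (1 if damage occurred, 0 otherwise) and that outcomes are independent and identically distributed. Under those assumptions the sample mean is an unbiased and consistent estimate of the damage probability, and a Hoeffding bound gives a rough sense of its uncertainty. All function names and the example data are hypothetical.

```python
import math


def estimate_damage_probability(outcomes):
    """Sample mean of 0/1 damage indicators (assumed i.i.d.).

    Its expectation equals the true damage probability (unbiased), and it
    converges to that probability as the number of attacks grows (consistent).
    """
    if not outcomes:
        raise ValueError("need at least one attack outcome")
    return sum(outcomes) / len(outcomes)


def hoeffding_half_width(n, delta=0.05):
    """Half-width of a two-sided (1 - delta) Hoeffding interval
    for the mean of n independent values in [0, 1]."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def select_lowest_risk(outcomes_by_model):
    """Return the model name with the smallest estimated damage probability,
    together with all estimates."""
    estimates = {
        name: estimate_damage_probability(outcomes)
        for name, outcomes in outcomes_by_model.items()
    }
    best = min(estimates, key=estimates.get)
    return best, estimates


# Hypothetical attack outcomes for two candidate models.
example = {
    "model_a": [0, 0, 1, 0],
    "model_b": [1, 1, 0, 1],
}
best_model, damage_estimates = select_lowest_risk(example)
```

With only four outcomes per model the Hoeffding interval is far too wide to separate the two candidates reliably; in practice an operator would collect enough attack outcomes that the intervals around the estimates no longer overlap before committing to a model.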
Related papers
- Data-driven decision-making under uncertainty with entropic risk measure [5.407319151576265]
The entropic risk measure is widely used in high-stakes decision making to account for tail risks associated with an uncertain loss.
To debias the empirical entropic risk estimator, we propose a strongly consistent bootstrapping procedure.
We show that cross validation methods can result in significantly higher out-of-sample risk for the insurer if the bias in validation performance is not corrected for.
arXiv Detail & Related papers (2024-09-30T04:02:52Z)
- Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework [77.45983464131977]
We focus on how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications.
Our research identifies two critical latent factors affecting RAG's confidence in its predictions.
We develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers.
arXiv Detail & Related papers (2024-09-24T14:52:14Z)
- Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z)
- Predictive Uncertainty Quantification via Risk Decompositions for Strictly Proper Scoring Rules [7.0549244915538765]
Uncertainty in predictive modeling often relies on ad hoc methods.
This paper introduces a theoretical approach to understanding uncertainty through statistical risks.
We show how to split pointwise risk into Bayes risk and excess risk.
arXiv Detail & Related papers (2024-02-16T14:40:22Z)
- On the Impact of Uncertainty and Calibration on Likelihood-Ratio Membership Inference Attacks [42.18575921329484]
We analyze the performance of the state-of-the-art likelihood ratio attack (LiRA) within an information-theoretical framework.
We derive bounds on the advantage of an MIA adversary with the aim of offering insights into the impact of uncertainty and calibration on the effectiveness of MIAs.
arXiv Detail & Related papers (2024-02-16T13:41:18Z)
- Adversarial Attacks Against Uncertainty Quantification [10.655660123083607]
This work focuses on a different adversarial scenario in which the attacker is still interested in manipulating the uncertainty estimate.
In particular, the goal is to undermine the use of machine-learning models when their outputs are consumed by a downstream module or by a human operator.
arXiv Detail & Related papers (2023-09-19T12:54:09Z)
- Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization [63.93275508300137]
We introduce a novel risk-aware Counterfactual Learning To Rank method with theoretical guarantees for safe deployment.
Our experimental results demonstrate the efficacy of our proposed method, which is effective at avoiding initial periods of bad performance when little data is available.
arXiv Detail & Related papers (2023-04-26T15:54:23Z)
- Balancing detectability and performance of attacks on the control channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs).
This research is motivated by the research community's recent interest in adversarial and poisoning attacks applied to MDPs and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is part of out-of-sample prediction error due to the lack of knowledge of the learner.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models [41.53326337725239]
We introduce a practical approach for integrating uncertainty estimation into a class of state-of-the-art neural network methods.
We show that our methods enable us to deal gracefully with situations of "no-overlap", common in high-dimensional data.
We show that correctly modeling uncertainty can keep us from giving overconfident and potentially harmful recommendations.
arXiv Detail & Related papers (2020-07-01T00:37:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.