Uncertainty, Calibration, and Membership Inference Attacks: An
Information-Theoretic Perspective
- URL: http://arxiv.org/abs/2402.10686v1
- Date: Fri, 16 Feb 2024 13:41:18 GMT
- Authors: Meiyi Zhu, Caili Guo, Chunyan Feng, Osvaldo Simeone
- Abstract summary: We analyze the performance of the state-of-the-art likelihood ratio attack (LiRA) within an information-theoretical framework.
We derive bounds on the advantage of an MIA adversary with the aim of offering insights into the impact of uncertainty and calibration on the effectiveness of MIAs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a membership inference attack (MIA), an attacker exploits the
overconfidence exhibited by typical machine learning models to determine
whether a specific data point was used to train a target model. In this paper,
we analyze the performance of the state-of-the-art likelihood ratio attack
(LiRA) within an information-theoretical framework that allows the
investigation of the impact of the aleatoric uncertainty in the true data
generation process, of the epistemic uncertainty caused by a limited training
data set, and of the calibration level of the target model. We compare three
different settings, in which the attacker receives decreasingly informative
feedback from the target model: confidence vector (CV) disclosure, in which the
output probability vector is released; true label confidence (TLC) disclosure,
in which only the probability assigned to the true label is made available by
the model; and decision set (DS) disclosure, in which an adaptive prediction
set is produced as in conformal prediction. We derive bounds on the advantage
of an MIA adversary with the aim of offering insights into the impact of
uncertainty and calibration on the effectiveness of MIAs. Simulation results
demonstrate that the derived analytical bounds accurately predict the
effectiveness of MIAs.
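As a concrete illustration of the likelihood ratio attack analyzed in the paper, the following is a minimal sketch of the standard LiRA scoring step: logit-scale the target model's true-label confidence, fit Gaussians to the confidences that shadow models trained with and without the candidate point assign to it, and take the log-likelihood ratio. Function names and the toy shadow-model statistics are illustrative, not taken from the paper.

```python
import math

def _logit(p, eps=1e-6):
    """Logit-scale a confidence, clipping away from {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def _gauss_logpdf(x, mu, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mu) ** 2 / (2.0 * sd * sd)

def lira_score(target_conf, in_confs, out_confs):
    """Log-likelihood ratio that the candidate point was IN the training set.

    in_confs / out_confs are the true-label confidences that shadow models
    trained with / without the candidate point assign to it.
    """
    t = _logit(target_conf)
    z_in = [_logit(c) for c in in_confs]
    z_out = [_logit(c) for c in out_confs]
    mu_in = sum(z_in) / len(z_in)
    mu_out = sum(z_out) / len(z_out)
    sd_in = math.sqrt(sum((z - mu_in) ** 2 for z in z_in) / len(z_in)) + 1e-9
    sd_out = math.sqrt(sum((z - mu_out) ** 2 for z in z_out) / len(z_out)) + 1e-9
    # Positive score: the observed confidence is better explained by membership.
    return _gauss_logpdf(t, mu_in, sd_in) - _gauss_logpdf(t, mu_out, sd_out)
```

This corresponds to the TLC disclosure setting, where only the probability of the true label is released; an overconfident, poorly calibrated model widens the gap between the in- and out-distributions and so raises the adversary's advantage.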
Related papers
- Towards a Game-theoretic Understanding of Explanation-based Membership Inference Attacks [8.06071340190569]
Black-box machine learning (ML) models can be exploited to carry out privacy threats such as membership inference attacks (MIAs).
Existing works have only analyzed MIA in a single "what if" interaction scenario between an adversary and the target ML model.
We propose a sound mathematical formulation to prove that an optimal attack threshold exists, which can be used to launch an MIA.
arXiv Detail & Related papers (2024-04-10T16:14:05Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers through Membership Inference Attacks [18.27174440444256]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
- Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework [8.572441599469597]
We study high-confidence off-policy evaluation in the context of infinite-horizon Markov decision processes.
The objective is to establish a confidence interval (CI) for the target policy value using only offline data pre-collected from unknown behavior policies.
We show that our algorithm is sample-efficient, error-robust, and provably convergent even in non-linear function approximation settings.
arXiv Detail & Related papers (2023-09-23T06:35:44Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is the part of out-of-sample prediction error that is due to the learner's lack of knowledge.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
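The subtraction at the core of the DEUP summary can be written down directly: epistemic uncertainty is the predicted generalization error minus the aleatoric estimate, floored at zero since an uncertainty cannot be negative. This tiny sketch assumes the two inputs are already produced by separate estimators, which DEUP learns; the function name is illustrative.

```python
def deup_epistemic(total_error_pred, aleatoric_pred):
    """Epistemic uncertainty = predicted generalization error minus the
    aleatoric (irreducible) estimate, clipped at zero."""
    return max(total_error_pred - aleatoric_pred, 0.0)
```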
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.