An Extension of Fano's Inequality for Characterizing Model
Susceptibility to Membership Inference Attacks
- URL: http://arxiv.org/abs/2009.08097v1
- Date: Thu, 17 Sep 2020 06:37:15 GMT
- Title: An Extension of Fano's Inequality for Characterizing Model
Susceptibility to Membership Inference Attacks
- Authors: Sumit Kumar Jha, Susmit Jha, Rickard Ewetz, Sunny Raj, Alvaro
Velasquez, Laura L. Pullum, Ananthram Swami
- Abstract summary: We show that the probability of success for a membership inference attack on a deep neural network can be bounded using the mutual information between its inputs and its activations.
This enables the use of mutual information to measure the susceptibility of a DNN model to membership inference attacks.
- Score: 28.366183028100057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been shown to be vulnerable to membership inference
attacks wherein the attacker aims to detect whether specific input data were
used to train the model. These attacks can potentially leak private or
proprietary data. We present a new extension of Fano's inequality and employ it
to theoretically establish that the probability of success for a membership
inference attack on a deep neural network can be bounded using the mutual
information between its inputs and its activations. This enables the use of
mutual information to measure the susceptibility of a DNN model to membership
inference attacks. In our empirical evaluation, we show that the correlation
between the mutual information and the susceptibility of the DNN model to
membership inference attacks is 0.966, 0.996, and 0.955 for CIFAR-10, SVHN and
GTSRB models, respectively.
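As background, the connection between mutual information and attack success can already be seen in the classical form of Fano's inequality. The sketch below recalls that classical form with notation chosen here for illustration (M for the membership bit, T for the attacker's observation such as activations, M-hat for the attacker's guess); it is not the paper's extended statement, whose precise form should be taken from the paper itself.

```latex
% Classical Fano's inequality for the Markov chain M -> T -> \hat{M},
% with P_e = \Pr(\hat{M} \neq M). Background only, not the paper's extension.
\begin{align}
  H(M \mid T) \;\le\; H_b(P_e) + P_e \log\bigl(|\mathcal{M}| - 1\bigr).
\end{align}
% For binary membership (|\mathcal{M}| = 2) the last term vanishes; using
% H(M \mid T) = H(M) - I(M;T) and H(M) = 1 bit for uniform membership:
\begin{align}
  H_b(P_e) \;\ge\; 1 - I(M;T)
  \quad\Longrightarrow\quad
  \Pr(\hat{M} = M) \;\le\; 1 - H_b^{-1}\!\bigl(1 - I(M;T)\bigr),
\end{align}
% where H_b is the binary entropy (in bits) and H_b^{-1} its inverse on [0, 1/2].
```

In words: the less information the attacker's observation carries, the lower the achievable inference success. To illustrate how a mutual-information susceptibility score might be computed in practice, the following is a minimal, hypothetical sketch using a crude histogram estimator on one-dimensional summaries of inputs and activations; the estimator actually used in the paper is not reproduced here.

```python
# Hypothetical sketch: a crude proxy for I(inputs; activations) via a
# histogram MI estimate over 1-D summaries. NOT the paper's estimator;
# it only illustrates how an MI-based susceptibility score could be computed.
import numpy as np
from sklearn.metrics import mutual_info_score

def mi_proxy(inputs: np.ndarray, activations: np.ndarray, bins: int = 32) -> float:
    """inputs: (n, ...) examples; activations: (n, ...) hidden activations for the same examples."""
    x = np.linalg.norm(inputs.reshape(len(inputs), -1), axis=1)             # 1-D summary of each input
    t = np.linalg.norm(activations.reshape(len(activations), -1), axis=1)   # 1-D summary of each activation
    xb = np.digitize(x, np.histogram_bin_edges(x, bins=bins))               # discretize for histogram MI
    tb = np.digitize(t, np.histogram_bin_edges(t, bins=bins))
    return mutual_info_score(xb, tb)                                        # MI estimate in nats
```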
Related papers
- FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks [62.897993591443594]
FullCert is the first end-to-end certifier with sound, deterministic bounds.
We experimentally demonstrate FullCert's feasibility on two datasets.
arXiv Detail & Related papers (2024-06-17T13:23:52Z)
- GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation [4.332441337407564]
We explore a connection between the susceptibility to membership inference attacks and the vulnerability to distillation-based functionality stealing attacks.
We propose GLiRA, a distillation-guided approach to membership inference attacks against black-box neural networks.
We evaluate the proposed method across multiple image classification datasets and models and demonstrate that likelihood ratio attacks, when guided by knowledge distillation, outperform the current state-of-the-art membership inference attacks in the black-box setting.
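To make the likelihood-ratio idea concrete, here is a minimal, hypothetical sketch of a generic likelihood-ratio membership test using shadow-model confidences; it is not GLiRA's distillation-guided procedure, only an illustration of the underlying test that such attacks build on.

```python
# Hypothetical sketch of a generic likelihood-ratio membership test:
# fit Gaussians to shadow-model confidences for the "member" and "non-member"
# worlds, then compare likelihoods of the victim model's confidence on the
# target example. Not GLiRA's actual distillation-guided procedure.
import numpy as np
from scipy.stats import norm

def likelihood_ratio(conf_target: float,
                     conf_in_shadows: np.ndarray,
                     conf_out_shadows: np.ndarray) -> float:
    """Returns a ratio > 1 when the target example looks more like a training member."""
    mu_in, sd_in = conf_in_shadows.mean(), conf_in_shadows.std() + 1e-8
    mu_out, sd_out = conf_out_shadows.mean(), conf_out_shadows.std() + 1e-8
    return norm.pdf(conf_target, mu_in, sd_in) / (norm.pdf(conf_target, mu_out, sd_out) + 1e-12)
```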
arXiv Detail & Related papers (2024-05-13T08:52:04Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
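As an illustration of frequency-domain aggregation, here is a minimal, hypothetical sketch: each client's flattened update is transformed with a DCT, low-frequency coefficients are aggregated robustly, and the result is transformed back. The coordinate-wise median stands in for FreqFed's actual filtering of malicious updates, which the summary above does not detail.

```python
# Hypothetical sketch of frequency-domain aggregation of client updates.
# The median over low-frequency DCT coefficients is a simplified stand-in
# for FreqFed's actual filtering mechanism.
import numpy as np
from scipy.fft import dct, idct

def aggregate_updates(updates: np.ndarray, keep: int = 1000) -> np.ndarray:
    """updates: (num_clients, num_params) flattened model updates."""
    coeffs = dct(updates, axis=1, norm="ortho")          # per-client frequency representation
    agg = np.zeros(updates.shape[1])
    agg[:keep] = np.median(coeffs[:, :keep], axis=0)     # robust aggregation of low frequencies
    return idct(agg, norm="ortho")                       # back to parameter space
```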
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
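The following is a minimal, hypothetical sketch of a generic hybrid discriminative-generative training step (classification loss plus autoencoder reconstruction loss); it conveys the general shape of autoencoder-based training but is not the paper's nuisance-extended information-bottleneck objective or its encoder designs.

```python
# Hypothetical sketch: shared encoder with a classifier head (discriminative)
# and a decoder head (generative reconstruction). Not the paper's objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
dec = nn.Linear(128, 784)
clf = nn.Linear(128, 10)
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters(), *clf.parameters()], lr=1e-3)

def train_step(x: torch.Tensor, y: torch.Tensor, beta: float = 0.1) -> float:
    z = enc(x)                                             # shared representation
    loss = F.cross_entropy(clf(z), y)                      # discriminative term
    loss = loss + beta * F.mse_loss(dec(z), x.flatten(1))  # generative (reconstruction) term
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```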
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense [27.545466364906773]
We present a new algorithm to learn a deep neural network model robust against adversarial attacks.
Our model demonstrates significantly improved robustness (up to 20%) compared with adversarial training and Adv-BNN under PGD attacks.
arXiv Detail & Related papers (2022-12-05T03:26:08Z)
- Purifier: Defending Data Inference Attacks via Transforming Confidence Scores [27.330482508047428]
We propose a method, namely PURIFIER, to defend against membership inference attacks.
Experiments show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency.
PURIFIER is also effective in defending adversarial model inversion attacks and attribute inference attacks.
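As a point of comparison, here is a minimal, hypothetical sketch of a simple confidence-masking baseline that coarsens published confidence scores; PURIFIER itself learns a transformation of the confidence vectors, so this is only a stand-in that illustrates the defense interface.

```python
# Hypothetical confidence-masking baseline (not the PURIFIER network):
# coarsen softmax outputs before release so they leak less membership signal.
import numpy as np

def coarsen_confidences(probs: np.ndarray, decimals: int = 1) -> np.ndarray:
    """probs: (n, k) softmax outputs. Round and renormalize before publishing."""
    rounded = np.clip(np.round(probs, decimals=decimals), 1e-6, None)  # avoid all-zero rows
    return rounded / rounded.sum(axis=1, keepdims=True)
```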
arXiv Detail & Related papers (2022-12-01T16:09:50Z)
- Generative Models with Information-Theoretic Protection Against Membership Inference Attacks [6.840474688871695]
Deep generative models, such as Generative Adversarial Networks (GANs), synthesize diverse high-fidelity data samples.
GANs may disclose private information from the data they are trained on, making them susceptible to adversarial attacks.
We propose an information theoretically motivated regularization term that prevents the generative model from overfitting to training data and encourages generalizability.
arXiv Detail & Related papers (2022-05-31T19:29:55Z)
- Formalizing and Estimating Distribution Inference Risks [11.650381752104298]
We propose a formal and general definition of property inference attacks.
Our results show that inexpensive attacks are as effective as expensive meta-classifier attacks.
We extend the state-of-the-art property inference attack to work on convolutional neural networks.
arXiv Detail & Related papers (2021-09-13T14:54:39Z)
- Adversarial Robustness through the Lens of Causality [105.51753064807014]
The adversarial vulnerability of deep neural networks has attracted significant attention in machine learning.
We propose to incorporate causality into mitigating adversarial vulnerability.
Our method can be seen as the first attempt to leverage causality for mitigating adversarial vulnerability.
arXiv Detail & Related papers (2021-06-11T06:55:02Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
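To illustrate the label-only, repeated-query idea described in the entry above, here is a minimal, hypothetical sketch that scores membership by the stability of the predicted label under input noise; it is in the spirit of the sampling attack but is not the authors' exact procedure.

```python
# Hypothetical label-only membership heuristic via repeated perturbed queries:
# training points tend to sit farther from the decision boundary, so their
# predicted label is more stable under small input noise.
import numpy as np

def label_stability_score(predict_label, x: np.ndarray,
                          n_queries: int = 100, sigma: float = 0.05) -> float:
    """predict_label(x) -> predicted class from the victim model (labels only)."""
    base = predict_label(x)
    hits = sum(predict_label(x + sigma * np.random.randn(*x.shape)) == base
               for _ in range(n_queries))
    return hits / n_queries   # higher score -> more likely a training member
```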