Mitigating Membership Inference Attacks by Self-Distillation Through a
Novel Ensemble Architecture
- URL: http://arxiv.org/abs/2110.08324v1
- Date: Fri, 15 Oct 2021 19:22:52 GMT
- Title: Mitigating Membership Inference Attacks by Self-Distillation Through a
Novel Ensemble Architecture
- Authors: Xinyu Tang, Saeed Mahloujifar, Liwei Song, Virat Shejwalkar, Milad
Nasr, Amir Houmansadr, Prateek Mittal
- Abstract summary: Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models.
We propose a new framework to train privacy-preserving models that induce similar behavior on member and non-member inputs.
We show that SELENA presents a superior trade-off between membership privacy and utility compared to the state of the art.
- Score: 44.2351146468898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Membership inference attacks are a key measure to evaluate privacy leakage in
machine learning (ML) models. These attacks aim to distinguish training members
from non-members by exploiting differential behavior of the models on member
and non-member inputs. The goal of this work is to train ML models that have
high membership privacy while largely preserving their utility; we therefore
aim for an empirical membership privacy guarantee as opposed to the provable
privacy guarantees provided by techniques like differential privacy, as such
techniques are shown to deteriorate model utility. Specifically, we propose a
new framework to train privacy-preserving models that induces similar behavior
on member and non-member inputs to mitigate membership inference attacks. Our
framework, called SELENA, has two major components. The first component and the
core of our defense is a novel ensemble architecture for training. This
architecture, which we call Split-AI, splits the training data into random
subsets, and trains a model on each subset of the data. We use an adaptive
inference strategy at test time: our ensemble architecture aggregates the
outputs of only those models that did not contain the input sample in their
training data. We prove that our Split-AI architecture defends against a large
family of membership inference attacks; however, it is susceptible to new
adaptive attacks. Therefore, we use a second component in our framework called
Self-Distillation to protect against such stronger attacks. The
Self-Distillation component (self-)distills the training dataset through our
Split-AI ensemble, without using any external public datasets. Through
extensive experiments on major benchmark datasets we show that SELENA presents
a superior trade-off between membership privacy and utility compared to the
state of the art.
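As a concrete illustration of the two components described in the abstract, the following Python sketch shows one way Split-AI training, adaptive inference, and Self-Distillation could fit together. The number of sub-models, the subset sizes, the averaging rule, the scikit-learn classifiers, and all function names are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of SELENA's two components as described in the abstract.
# Assumptions (not the authors' implementation): K sub-models, 50% random subsets,
# plain averaging for aggregation, and small scikit-learn MLPs as the model class.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def train_split_ai(X, y, K=5, subset_frac=0.5):
    """Split-AI: train K sub-models, each on a random subset of the training data.
    Also record which sub-model saw which training point (K x n boolean mask)."""
    n = len(X)
    models, seen = [], np.zeros((K, n), dtype=bool)
    for k in range(K):
        idx = rng.choice(n, size=int(subset_frac * n), replace=False)
        seen[k, idx] = True
        m = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)
        m.fit(X[idx], y[idx])   # assumes every random subset still contains all classes
        models.append(m)
    return models, seen

def adaptive_inference(models, seen, X_query, train_index=None):
    """Adaptive inference: for a training point, average only the sub-models that
    did NOT see it.  For outside queries (train_index=None) this sketch simply
    averages all sub-models; the paper's exact non-member handling may differ."""
    probs = np.stack([m.predict_proba(X_query) for m in models])   # (K, n_query, C)
    if train_index is None:
        return probs.mean(axis=0)
    out = np.empty(probs.shape[1:])
    for i, j in enumerate(train_index):
        out[i] = probs[~seen[:, j], i].mean(axis=0)   # only "non-member" sub-models
    return out

def self_distill(models, seen, X_train):
    """Self-Distillation: relabel the training set with Split-AI's adaptive outputs
    (no external public data) and train a single final model on those labels."""
    soft = adaptive_inference(models, seen, X_train, train_index=np.arange(len(X_train)))
    student = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)
    # scikit-learn classifiers take hard labels, so this sketch distills the argmax
    # of the soft outputs; a framework with soft targets would distill the full
    # soft predictions instead.
    student.fit(X_train, soft.argmax(axis=1))
    return student
```

In this sketch only the self-distilled student would be released; its training labels come from sub-models that never saw the corresponding points, which is the behavior the abstract describes as the source of membership privacy.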
Related papers
- Client-specific Property Inference against Secure Aggregation in
Federated Learning [52.8564467292226]
Federated learning has become a widely used paradigm for collaboratively training a common model among different participants.
Many attacks have shown that it is still possible to infer sensitive information about participant data, such as membership or properties, or even to reconstruct that data outright.
We show that simple linear models can effectively capture client-specific properties from the aggregated model updates alone.
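The claim that simple linear models suffice can be made concrete with a toy, fully simulated example: a logistic-regression "property classifier" is fit on aggregated update vectors labeled by whether the target client's data exhibited the property in that round. Everything below (the synthetic updates, the signal strength, the split) is a hypothetical illustration of the general idea, not the paper's attack.

```python
# Toy, fully simulated illustration: fit a linear model on aggregated FL updates
# to predict a client-specific property.  The synthetic data, signal strength,
# and train/test split are assumptions for demonstration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, rounds = 512, 200

# Label per round: did the target client's local data exhibit the property?
has_property = rng.integers(0, 2, size=rounds)
signal = rng.normal(size=d)                      # direction correlated with the property
updates = rng.normal(size=(rounds, d)) + 0.3 * has_property[:, None] * signal

attacker = LogisticRegression(max_iter=1000).fit(updates[:150], has_property[:150])
print("toy property-inference accuracy:", attacker.score(updates[150:], has_property[150:]))
```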
arXiv Detail & Related papers (2023-03-07T14:11:01Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
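As a rough illustration of "online knowledge distillation using a contrastive loss", the snippet below implements a generic InfoNCE-style objective between a client's local representations and representations shared by peers for the same samples; the paper's actual loss, temperature, and negative-sampling scheme may well differ.

```python
# Generic InfoNCE-style contrastive distillation between a client's local
# representations and shared peer representations of the SAME batch of samples.
# A sketch of the general idea only; not the paper's exact objective.
import torch
import torch.nn.functional as F

def contrastive_distillation_loss(local_repr, shared_repr, temperature=0.1):
    """local_repr, shared_repr: (batch, dim).  Positive pair = same sample's
    local/shared representation; negatives = shared representations of the
    other samples in the batch."""
    local_repr = F.normalize(local_repr, dim=1)
    shared_repr = F.normalize(shared_repr, dim=1)
    logits = local_repr @ shared_repr.t() / temperature               # (batch, batch)
    targets = torch.arange(local_repr.size(0), device=logits.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)

# Usage (hypothetical names): loss = contrastive_distillation_loss(encoder(x), peer_reprs)
```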
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
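One simple way to picture a "relaxed loss with a more achievable learning target" is a cross-entropy that is pulled toward a non-zero floor alpha instead of toward zero, so members' training losses stay away from the near-zero values membership inference attacks exploit. The sketch below is that simplified reading; RelaxLoss's actual training procedure is more involved, and alpha is an assumed hyperparameter.

```python
# Simplified illustration of a relaxed, non-zero loss target (alpha).  Not the
# paper's exact RelaxLoss procedure; alpha and this particular form are assumptions.
import torch
import torch.nn.functional as F

def relaxed_loss(logits, targets, alpha=0.5):
    per_sample_ce = F.cross_entropy(logits, targets, reduction="none")
    # Samples above the target are optimized as usual; samples already below the
    # target get the opposite gradient, so the loss is not driven all the way to 0.
    return (per_sample_ce - alpha).abs().mean()
```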
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Lessons Learned: Defending Against Property Inference Attacks [0.0]
This work investigates and evaluates multiple defense strategies against property inference attacks (PIAs).
PIAs aim to extract statistical properties of a model's underlying training data, e.g., to reveal the ratio of men and women in a medical training data set.
Experiments show that the property unlearning defense is not able to generalize, i.e., to protect against a whole class of PIAs.
arXiv Detail & Related papers (2022-05-18T09:38:37Z)
- Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets [53.866927712193416]
We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak private details belonging to other parties.
Our attacks are effective across membership inference, attribute inference, and data extraction.
Our results cast doubts on the relevance of cryptographic privacy guarantees in multiparty protocols for machine learning.
arXiv Detail & Related papers (2022-03-31T18:06:28Z)
- Adversarial Representation Sharing: A Quantitative and Secure
Collaborative Learning Framework [3.759936323189418]
We find that representation learning has unique advantages in collaborative learning due to its lower communication overhead and task independence.
We present ARS, a collaborative learning framework wherein users share representations of data to train models.
We demonstrate that our mechanism is effective against model inversion attacks, and achieves a balance between privacy and utility.
arXiv Detail & Related papers (2022-03-27T13:29:15Z)
- Privacy Analysis of Deep Learning in the Wild: Membership Inference
Attacks against Transfer Learning [27.494206948563885]
We present the first systematic evaluation of membership inference attacks against transfer learning models.
Experiments on four real-world image datasets show that membership inference attacks can be effective against transfer learning models.
Our results shed light on the severity of membership risks stemming from machine learning models in practice.
arXiv Detail & Related papers (2020-09-10T14:14:22Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's confidence scores.
We show that a victim model that publishes only its predicted labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
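A label-only membership score via repeated queries, in the spirit of the sampling attack described above, can be sketched as follows: perturb the input many times, query only the predicted label, and use the label-agreement rate as a pseudo-confidence. The noise model, query budget, threshold, and function names are assumptions; the paper's exact perturbation and scoring scheme may differ.

```python
# Sketch of a label-only membership score from repeated queries.  The Gaussian
# perturbation, query budget, threshold, and the predict_label interface are
# illustrative assumptions, not the paper's exact attack.
import numpy as np

def sampling_attack_score(predict_label, x, y, n_queries=100, noise_std=0.05, seed=0):
    """predict_label(batch) -> predicted labels.  Returns the fraction of noisy
    copies of x that keep label y; models are typically more stable around their
    training points, so a higher score suggests membership."""
    rng = np.random.default_rng(seed)
    noisy = x[None, :] + noise_std * rng.normal(size=(n_queries, x.size))
    return float(np.mean(predict_label(noisy) == y))

def is_member(predict_label, x, y, threshold=0.9):
    return sampling_attack_score(predict_label, x, y) >= threshold
```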
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.