Lessons Learned: Defending Against Property Inference Attacks
- URL: http://arxiv.org/abs/2205.08821v4
- Date: Mon, 9 Oct 2023 09:46:41 GMT
- Title: Lessons Learned: Defending Against Property Inference Attacks
- Authors: Joshua Stock (1), Jens Wettlaufer, Daniel Demmler (1) and Hannes Federrath (1) ((1) Universität Hamburg)
- Abstract summary: This work investigates and evaluates multiple defense strategies against property inference attacks (PIAs).
Given a trained machine learning model, PIAs aim to extract statistical properties of its underlying training data, e.g., reveal the ratio of men and women in a medical training data set.
Experiments show that the proposed defense, property unlearning, does not generalize, i.e., it cannot protect against a whole class of PIAs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work investigates and evaluates multiple defense strategies against
property inference attacks (PIAs), a privacy attack against machine learning
models. Given a trained machine learning model, PIAs aim to extract statistical
properties of its underlying training data, e.g., reveal the ratio of men and
women in a medical training data set. While for other privacy attacks like
membership inference, a lot of research on defense mechanisms has been
published, this is the first work focusing on defending against PIAs. With the
primary goal of developing a generic mitigation strategy against white-box
PIAs, we propose the novel approach property unlearning. Extensive experiments
with property unlearning show that while it is very effective when defending
target models against specific adversaries, property unlearning is not able to
generalize, i.e., protect against a whole class of PIAs. To investigate the
reasons behind this limitation, we present the results of experiments with the
explainable AI tool LIME. They show how state-of-the-art property inference
adversaries with the same objective focus on different parts of the target
model. We further elaborate on this with a follow-up experiment, in which we
use the visualization technique t-SNE to exhibit how severely statistical
training data properties are manifested in machine learning models. Based on
this, we develop the conjecture that post-training techniques like property
unlearning might not suffice to provide the desirable generic protection
against PIAs. As an alternative, we investigate the effects of simpler training
data preprocessing methods like adding Gaussian noise to images of a training
data set on the success rate of PIAs. We conclude with a discussion of the
different defense approaches, summarize the lessons learned and provide
directions for future work.
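As a rough illustration of the preprocessing defense mentioned at the end of the abstract, the sketch below adds Gaussian noise to training images before the target model is trained on them. This is a minimal sketch under stated assumptions (NumPy available, pixel values scaled to [0, 1], an illustrative noise scale sigma), not the paper's exact experimental setup.

```python
# Hedged sketch: Gaussian-noise preprocessing as a defense against
# property inference attacks (PIAs). The noise scale sigma is an
# assumed, illustrative tuning parameter.
import numpy as np

def add_gaussian_noise(images: np.ndarray, sigma: float = 0.1, seed: int = 0) -> np.ndarray:
    """Perturb each training image with i.i.d. Gaussian noise.

    images: array of shape (n_samples, height, width[, channels]) with
            pixel values in [0, 1].
    sigma:  noise standard deviation; larger values blur the statistical
            property an adversary tries to infer, but also cost utility.
    """
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(loc=0.0, scale=sigma, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixels in the valid range

# Usage: train the target model on add_gaussian_noise(x_train) instead of
# x_train, then compare model accuracy and PIA success rate across sigmas.
```

The trade-off the abstract alludes to shows up directly in sigma: stronger noise tends to lower the PIA success rate, but it also degrades the target model's accuracy.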
Related papers
- SA-Attack: Improving Adversarial Transferability of Vision-Language
Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense MESAS is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Robust Transferable Feature Extractors: Learning to Defend Pre-Trained
Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z) - Property inference attack; Graph neural networks; Privacy attacks and
defense; Trustworthy machine learning [5.598383724295497]
Machine learning models are vulnerable to privacy attacks that leak information about the training data.
In this work, we focus on a particular type of privacy attack named the property inference attack (PIA).
We consider Graph Neural Networks (GNNs) as the target model, and the distribution of particular groups of nodes and links in the training graph as the target property.
arXiv Detail & Related papers (2022-09-02T14:59:37Z) - A Unified Evaluation of Textual Backdoor Learning: Frameworks and
Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z) - I Know What You Trained Last Summer: A Survey on Stealing Machine
Learning Models and Defences [0.1031296820074812]
We study model stealing attacks, assessing their performance and exploring corresponding defence techniques in different settings.
We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence based on the goal and available resources.
arXiv Detail & Related papers (2022-06-16T21:16:41Z) - Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z) - ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine
Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on ML-Doctor, a modular, re-usable software tool that enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z) - Improving Robustness to Model Inversion Attacks via Mutual Information
Regularization [12.079281416410227]
This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack that aims at inferring information about the training data distribution, given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks.
arXiv Detail & Related papers (2020-09-11T06:02:44Z) - Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the scores of the victim model.
We show that a victim model that only publishes labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
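To make the label-only setting of the last entry concrete, here is a hedged sketch of a sampling-style membership score: the victim model is queried repeatedly on noisy copies of an input, and the stability of the predicted label serves as the membership signal. The function name, the Gaussian query perturbation, and the decision threshold are illustrative assumptions rather than details taken from the cited paper.

```python
# Hedged sketch of a label-only, sampling-style membership inference score.
# `predict` is assumed to return hard labels only (no confidence scores).
import numpy as np

def sampling_attack_score(predict, x: np.ndarray, n_queries: int = 100,
                          sigma: float = 0.05, seed: int = 0) -> float:
    """Fraction of perturbed queries on which the victim keeps its label.

    Training members tend to be classified more consistently under small
    input perturbations, so a higher score suggests membership.
    """
    rng = np.random.default_rng(seed)
    base_label = predict(x[np.newaxis])[0]       # label of the clean input
    perturbed = x + rng.normal(0.0, sigma, size=(n_queries,) + x.shape)
    labels = np.asarray(predict(perturbed))      # one label per noisy copy
    return float(np.mean(labels == base_label))

# Usage: flag x as a training member if the score exceeds a threshold tau
# calibrated on shadow data. Defenses such as gradient perturbation during
# training (differential privacy) or output perturbation at prediction time
# aim to shrink the score gap between members and non-members.
```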