Are Your Sensitive Attributes Private? Novel Model Inversion Attribute
Inference Attacks on Classification Models
- URL: http://arxiv.org/abs/2201.09370v1
- Date: Sun, 23 Jan 2022 21:27:20 GMT
- Title: Are Your Sensitive Attributes Private? Novel Model Inversion Attribute
Inference Attacks on Classification Models
- Authors: Shagufta Mehnaz, Sayanton V. Dibbo, Ehsanul Kabir, Ninghui Li, Elisa
Bertino
- Abstract summary: We focus on model inversion attacks where the adversary knows non-sensitive attributes about records in the training data.
We devise a novel confidence score-based model inversion attribute inference attack that significantly outperforms the state-of-the-art.
We also extend our attacks to the scenario where some of the other (non-sensitive) attributes of a target record are unknown to the adversary.
- Score: 22.569705869469814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Increasing use of machine learning (ML) technologies in privacy-sensitive
domains such as medical diagnoses, lifestyle predictions, and business
decisions highlights the need to better understand if these ML technologies are
introducing leakage of sensitive and proprietary training data. In this paper,
we focus on model inversion attacks where the adversary knows non-sensitive
attributes about records in the training data and aims to infer the value of a
sensitive attribute unknown to the adversary, using only black-box access to
the target classification model. We first devise a novel confidence score-based
model inversion attribute inference attack that significantly outperforms the
state-of-the-art. We then introduce a label-only model inversion attack that
relies only on the model's predicted labels but still matches our confidence
score-based attack in terms of attack effectiveness. We also extend our attacks
to the scenario where some of the other (non-sensitive) attributes of a target
record are unknown to the adversary. We evaluate our attacks on two types of
machine learning models, decision tree and deep neural network, trained on
three real datasets. Moreover, we empirically demonstrate the disparate
vulnerability of model inversion attacks, i.e., specific groups in the training
dataset (grouped by gender, race, etc.) could be more vulnerable to model
inversion attacks.
Related papers
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - When Machine Learning Models Leak: An Exploration of Synthetic Training Data [0.0]
We investigate an attack on a machine learning model that predicts whether a person or household will relocate in the next two years.
The attack assumes that the attacker can query the model to obtain predictions and that the marginal distribution of the data on which the model was trained is publicly available.
We explore how replacing the original data with synthetic data when training the model impacts how successfully the attacker can infer sensitive attributes.
arXiv Detail & Related papers (2023-10-12T23:47:22Z) - Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z) - Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
In this work, we ask whether it is possible to infer data-agnostic victim model (VM) information from data-specific adversarial instances.
We collect a dataset of adversarial attacks across 7 attack types generated from 135 victim models.
We show that a simple, supervised model parsing network (MPN) is able to infer VM attributes from unseen adversarial attacks.
arXiv Detail & Related papers (2023-03-13T21:21:49Z) - Inferring Sensitive Attributes from Model Explanations [0.685316573653194]
dependency of explanations on input raises privacy concerns for sensitive user data.
We design the first attribute inference attack against model explanations in two threat models.
We show that an adversary can successfully infer the value of sensitive attributes from explanations in both the threat models accurately.
arXiv Detail & Related papers (2022-08-21T21:31:19Z) - Property Inference Attacks on Convolutional Neural Networks: Influence
and Implications of Target Model's Complexity [1.2891210250935143]
Property Inference Attacks aim to infer from a given model properties about the training dataset seemingly unrelated to the model's primary goal.
This paper investigates the influence of the target model's complexity on the accuracy of this type of attack.
Our findings reveal that the risk of a privacy breach is present independently of the target model's complexity.
arXiv Detail & Related papers (2021-04-27T09:19:36Z) - Black-box Model Inversion Attribute Inference Attacks on Classification
Models [32.757792981935815]
We focus on one kind of model inversion attacks, where the adversary knows non-sensitive attributes about instances in the training data.
We devise two novel model inversion attribute inference attacks -- confidence modeling-based attack and confidence score-based attack.
We evaluate our attacks on two types of machine learning models, decision tree and deep neural network, trained with two real datasets.
arXiv Detail & Related papers (2020-12-07T01:14:19Z) - How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z) - Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z) - Learning to Attack: Towards Textual Adversarial Attacking in Real-world
Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z) - Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce sampling attack, a novel membership inference technique that unlike other standard membership adversaries is able to work under severe restriction of no access to scores of the victim model.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.