Related papers: Black-box Model Inversion Attribute Inference Attacks on Classification Models

Black-box Model Inversion Attribute Inference Attacks on Classification Models

URL: http://arxiv.org/abs/2012.03404v1
Date: Mon, 7 Dec 2020 01:14:19 GMT
Title: Black-box Model Inversion Attribute Inference Attacks on Classification Models
Authors: Shagufta Mehnaz, Ninghui Li, Elisa Bertino
Abstract summary: We focus on one kind of model inversion attacks, where the adversary knows non-sensitive attributes about instances in the training data. We devise two novel model inversion attribute inference attacks -- confidence modeling-based attack and confidence score-based attack. We evaluate our attacks on two types of machine learning models, decision tree and deep neural network, trained with two real datasets.
Score: 32.757792981935815
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Increasing use of ML technologies in privacy-sensitive domains such as medical diagnoses, lifestyle predictions, and business decisions highlights the need to better understand if these ML technologies are introducing leakages of sensitive and proprietary training data. In this paper, we focus on one kind of model inversion attacks, where the adversary knows non-sensitive attributes about instances in the training data and aims to infer the value of a sensitive attribute unknown to the adversary, using oracle access to the target classification model. We devise two novel model inversion attribute inference attacks -- confidence modeling-based attack and confidence score-based attack, and also extend our attack to the case where some of the other (non-sensitive) attributes are unknown to the adversary. Furthermore, while previous work uses accuracy as the metric to evaluate the effectiveness of attribute inference attacks, we find that accuracy is not informative when the sensitive attribute distribution is unbalanced. We identify two metrics that are better for evaluating attribute inference attacks, namely G-mean and Matthews correlation coefficient (MCC). We evaluate our attacks on two types of machine learning models, decision tree and deep neural network, trained with two real datasets. Experimental results show that our newly proposed attacks significantly outperform the state-of-the-art attacks. Moreover, we empirically show that specific groups in the training dataset (grouped by attributes, e.g., gender, race) could be more vulnerable to model inversion attacks. We also demonstrate that our attacks' performances are not impacted significantly when some of the other (non-sensitive) attributes are also unknown to the adversary.

Related papers

An Efficient Gradient-Based Inference Attack for Federated Learning [0.0]
Federated learning is a machine learning setting that reduces direct data exposure, improving the privacy guarantees of machine learning models.<n>We present a new gradient-based membership inference attack for federated learning scenarios.<n>Our method uses the shadow technique to learn round-wise gradient patterns of the training records, requiring no access to the private dataset.
arXiv Detail & Related papers (2025-12-17T07:10:04Z)
Disparate Privacy Vulnerability: Targeted Attribute Inference Attacks and Defenses [1.740992908651449]
A potential threat arises from an adversary querying trained models using the public, non-sensitive attributes of entities in the training data. We develop a novel inference attack called the disparity inference attack, which targets the identification of high-risk groups within the dataset. We are also the first to introduce a novel and effective disparity mitigation technique that simultaneously preserves model performance and prevents any risk of targeted attacks.
arXiv Detail & Related papers (2025-04-05T02:58:37Z)
When Machine Learning Models Leak: An Exploration of Synthetic Training Data [0.0]
We investigate an attack on a machine learning model that predicts whether a person or household will relocate in the next two years. The attack assumes that the attacker can query the model to obtain predictions and that the marginal distribution of the data on which the model was trained is publicly available. We explore how replacing the original data with synthetic data when training the model impacts how successfully the attacker can infer sensitive attributes.
arXiv Detail & Related papers (2023-10-12T23:47:22Z)
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources. FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks. We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously. MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models. This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
Purifier: Defending Data Inference Attacks via Transforming Confidence Scores [27.330482508047428]
We propose a method, namely PURIFIER, to defend against membership inference attacks. Experiments show that PURIFIER helps defend membership inference attacks with high effectiveness and efficiency. PURIFIER is also effective in defending adversarial model inversion attacks and attribute inference attacks.
arXiv Detail & Related papers (2022-12-01T16:09:50Z)
Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample. We use metric learning to frame adversarial regularization as an optimal transport problem. Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
Dikaios: Privacy Auditing of Algorithmic Fairness via Attribute Inference Attacks [0.5801044612920815]
We propose Dikaios, a privacy auditing tool for fairness algorithms for model builders. We show that our attribute inference attacks with adaptive prediction threshold significantly outperform prior attacks.
arXiv Detail & Related papers (2022-02-04T17:19:59Z)
Are Your Sensitive Attributes Private? Novel Model Inversion Attribute Inference Attacks on Classification Models [22.569705869469814]
We focus on model inversion attacks where the adversary knows non-sensitive attributes about records in the training data. We devise a novel confidence score-based model inversion attribute inference attack that significantly outperforms the state-of-the-art. We also extend our attacks to the scenario where some of the other (non-sensitive) attributes of a target record are unknown to the adversary.
arXiv Detail & Related papers (2022-01-23T21:27:20Z)
Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set. We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance. Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc. We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing. Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality. We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers. Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions. We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples. We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.