Reducing Risk of Model Inversion Using Privacy-Guided Training
- URL: http://arxiv.org/abs/2006.15877v1
- Date: Mon, 29 Jun 2020 09:02:16 GMT
- Title: Reducing Risk of Model Inversion Using Privacy-Guided Training
- Authors: Abigail Goldsteen, Gilad Ezov, Ariel Farkash
- Abstract summary: Several recent attacks have been able to infer sensitive information from trained models.
We present a solution for countering model inversion attacks in tree-based models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models often pose a threat to the privacy of individuals
whose data is part of the training set. Several recent attacks have been able
to infer sensitive information from trained models, including model inversion
or attribute inference attacks. These attacks are able to reveal the values of
certain sensitive features of individuals who participated in training the
model. It has also been shown that several factors can contribute to an
increased risk of model inversion, including feature influence. We observe that
not all features necessarily share the same level of privacy or sensitivity. In
many cases, certain features used to train a model are considered especially
sensitive and therefore propitious candidates for inversion. We present a
solution for countering model inversion attacks in tree-based models, by
reducing the influence of sensitive features in these models. This is an avenue
that has not yet been thoroughly investigated, with only very nascent previous
attempts at using this as a countermeasure against attribute inference. Our
work shows that, in many cases, it is possible to train a model in different
ways, resulting in different influence levels of the various features, without
necessarily harming the model's accuracy. We are able to utilize this fact to
train models in a manner that reduces the model's reliance on the most
sensitive features, while increasing the importance of less sensitive features.
Our evaluation confirms that training models in this manner reduces the risk of
inference for those features, as demonstrated through several black-box and
white-box attacks.
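The abstract describes training tree-based models in different ways so that influence shifts away from sensitive features without sacrificing accuracy. Below is a minimal sketch of how such privacy-guided model selection could look, using scikit-learn's impurity-based feature_importances_ as a stand-in for the paper's feature-influence measure; the dataset, the sensitive-feature index, and the hyperparameter grid are illustrative assumptions, not the authors' actual procedure.
```python
# A minimal sketch of privacy-guided model selection for tree-based models,
# assuming scikit-learn and using impurity-based feature_importances_ as a
# proxy for feature influence. The dataset, sensitive-feature index, and
# hyperparameter grid are illustrative, not the paper's actual procedure.
from itertools import product

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

SENSITIVE_IDX = 0          # hypothetical: treat feature 0 as the sensitive one
ACCURACY_TOLERANCE = 0.02  # accept up to 2 points of accuracy loss

# Train the same model family in different ways; each setting yields a
# different distribution of feature influence.
candidates = []
for max_depth, max_features in product([4, 8, None], ["sqrt", 0.5, None]):
    model = RandomForestClassifier(
        n_estimators=100, max_depth=max_depth,
        max_features=max_features, random_state=0,
    ).fit(X_train, y_train)
    candidates.append(
        (model, model.score(X_test, y_test),
         model.feature_importances_[SENSITIVE_IDX])
    )

# Among models close to the best accuracy, keep the one that relies least
# on the sensitive feature.
best_acc = max(acc for _, acc, _ in candidates)
chosen = min(
    (c for c in candidates if c[1] >= best_acc - ACCURACY_TOLERANCE),
    key=lambda c: c[2],
)
print(f"accuracy={chosen[1]:.3f}, sensitive-feature importance={chosen[2]:.3f}")
```
The paper's own influence measure and the way training is varied are specific to its tree-based construction; the sketch only illustrates the selection criterion suggested by the abstract: among comparably accurate models, prefer the one that depends least on the sensitive features.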
Related papers
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks can subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
- Security and Privacy Challenges in Deep Learning Models [0.0]
Deep learning models can be subjected to various attacks that compromise model security and data privacy.
Model Extraction Attacks, Model Inversion attacks, and Adversarial attacks are discussed.
Data poisoning attacks add harmful data to the training set, disrupting the learning process and reducing the reliability of the deep learning model.
arXiv Detail & Related papers (2023-11-23T00:26:14Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Are Your Sensitive Attributes Private? Novel Model Inversion Attribute Inference Attacks on Classification Models [22.569705869469814]
We focus on model inversion attacks where the adversary knows non-sensitive attributes about records in the training data.
We devise a novel confidence score-based model inversion attribute inference attack that significantly outperforms the state-of-the-art; a minimal sketch of this style of attack appears after this list.
We also extend our attacks to the scenario where some of the other (non-sensitive) attributes of a target record are unknown to the adversary.
arXiv Detail & Related papers (2022-01-23T21:27:20Z)
- Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z)
- Adversarial Learning with Cost-Sensitive Classes [7.6596177815175475]
In adversarial learning, it is sometimes necessary to improve the performance of particular classes or to give them special protection from attacks.
This paper proposes a framework combining cost-sensitive classification and adversarial learning together to train a model that can distinguish between protected and unprotected classes.
arXiv Detail & Related papers (2021-01-29T03:15:40Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
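As a companion to the black-box attacks mentioned in the main abstract's evaluation, here is a minimal sketch of a confidence score-based attribute inference attack in the spirit of the "Are Your Sensitive Attributes Private?" entry above: the adversary fills in each candidate value of the sensitive attribute and keeps the one for which the model is most confident in the record's true label. The function name, the binary candidate set, and the assumption of black-box predict_proba access are illustrative, not taken from the paper.
```python
# A minimal sketch of a confidence score-based attribute inference attack.
# Assumptions (illustrative): black-box predict_proba access, a binary
# sensitive attribute, and adversary knowledge of the target record's
# other attributes and its true label (given as a class index).
import numpy as np


def infer_sensitive_value(model, known_record, true_label,
                          sensitive_idx=0, candidate_values=(0, 1)):
    """Return the candidate sensitive value whose completion of the record
    makes the model most confident in the record's true label."""
    best_value, best_conf = None, -1.0
    for value in candidate_values:
        record = np.asarray(known_record, dtype=float).copy()
        record[sensitive_idx] = value  # try this candidate sensitive value
        conf = model.predict_proba(record.reshape(1, -1))[0][true_label]
        if conf > best_conf:
            best_value, best_conf = value, conf
    return best_value
```
Reducing the influence of the sensitive feature, as in the training sketch earlier, directly weakens this kind of attack: when the model barely depends on the sensitive feature, the candidate completions yield nearly identical confidences.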
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.