Improving Robustness to Model Inversion Attacks via Mutual Information Regularization
- URL: http://arxiv.org/abs/2009.05241v2
- Date: Tue, 22 Sep 2020 04:35:57 GMT
- Title: Improving Robustness to Model Inversion Attacks via Mutual Information Regularization
- Authors: Tianhao Wang, Yuheng Zhang, Ruoxi Jia
- Abstract summary: This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack aimed at inferring information about the training data distribution given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks.
- Score: 12.079281416410227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies defense mechanisms against model inversion (MI) attacks,
a type of privacy attack aimed at inferring information about the training
data distribution given access to a target machine learning model. Existing
defense mechanisms rely on model-specific heuristics or noise injection. While
they can mitigate attacks, existing methods significantly hinder model
performance. It remains an open question how to design a defense mechanism that
is applicable to a variety of models and achieves better utility-privacy
tradeoff. In this paper, we propose the Mutual Information Regularization based
Defense (MID) against MI attacks. The key idea is to limit the information
about the model input contained in the prediction, thereby limiting the ability
of an adversary to infer the private training attributes from the model
prediction. Our defense principle is model-agnostic and we present tractable
approximations to the regularizer for linear regression, decision trees, and
neural networks, all of which prior work has successfully attacked when not
equipped with any defense. We present a formal study of MI attacks by devising
a rigorous game-based definition and quantifying the associated information
leakage. Our theoretical analysis sheds light on the inefficacy of differential privacy (DP) in
defending against MI attacks, which has been empirically observed in several
prior works. Our experiments demonstrate that MID leads to state-of-the-art
performance for a variety of MI attacks, target models and datasets.
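As a concrete illustration of the defense principle described above (limiting how much information about the model input survives into the prediction), here is a minimal PyTorch sketch of an information-bottleneck-style variational penalty on a stochastic intermediate code. The architecture, layer sizes, and weight `beta` are illustrative assumptions, not the paper's implementation of MID.

```python
# Hypothetical sketch: an information-bottleneck-style approximation to a
# mutual-information regularizer. Architecture, sizes, and `beta` are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MIRegularizedClassifier(nn.Module):
    def __init__(self, in_dim=784, hidden=256, bottleneck=64, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.fc_mu = nn.Linear(hidden, bottleneck)      # mean of the stochastic code
        self.fc_logvar = nn.Linear(hidden, bottleneck)  # log-variance of the stochastic code
        self.head = nn.Linear(bottleneck, n_classes)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        # KL(q(z|x) || N(0, I)) upper-bounds the information the code keeps about x.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()
        return self.head(z), kl

def loss_fn(logits, kl, targets, beta=1e-2):
    # Cross-entropy for utility plus the mutual-information penalty for privacy.
    return F.cross_entropy(logits, targets) + beta * kl
```

Larger `beta` trades prediction utility for a tighter cap on the information leaked through the model output.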
Related papers
- CALoR: Towards Comprehensive Model Inversion Defense [43.2642796582236]
Model Inversion Attacks (MIAs) aim at recovering privacy-sensitive training data from the knowledge encoded in released machine learning models.
Recent advances in the MIA field have significantly enhanced the attack performance under multiple scenarios.
We propose a robust defense mechanism integrating Confidence Adaptation and Low-Rank compression.
arXiv Detail & Related papers (2024-10-08T08:44:01Z)
- Defending against Model Inversion Attacks via Random Erasing [24.04876860999608]
We present a new method to defend against Model Inversion (MI) attacks.
Our idea is based on a novel insight into Random Erasing (RE).
We show that RE can lead to substantial degradation in MI reconstruction quality and attack accuracy.
arXiv Detail & Related papers (2024-09-02T08:37:17Z)
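As a minimal illustration of the Random Erasing idea summarized in the entry above, the sketch below applies torchvision's built-in RandomErasing transform to the target model's training images. The probability and area parameters are assumptions, not the paper's settings.

```python
# Hypothetical sketch: Random Erasing as a data-level transform in the target
# model's training pipeline. Parameter values are illustrative assumptions.
import torch
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.ToTensor(),                      # PIL image / ndarray -> float tensor
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0),
])

# Quick check on a random tensor image (RandomErasing also accepts tensors directly).
img = torch.rand(3, 32, 32)
erased = transforms.RandomErasing(p=1.0)(img)   # p=1.0 forces an erasure for the demo
```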
arXiv Detail & Related papers (2024-09-02T08:37:17Z) - Isolation and Induction: Training Robust Deep Neural Networks against
Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations to model predictions, which harms benign accuracy, we train models to produce uninformative outputs in response to stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
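A rough sketch of the stated idea of producing uninformative outputs for stealing queries: the loss below keeps cross-entropy on benign data while pushing predictions on suspected query inputs toward the uniform distribution. This is a generic illustration, not the InI training framework; the proxy query batch `x_query` and weight `lam` are assumptions.

```python
# Hypothetical sketch: near-uniform (uninformative) predictions on suspected
# stealing queries, ordinary cross-entropy on benign data. Not the InI method.
import torch
import torch.nn.functional as F

def defense_loss(model, x_benign, y_benign, x_query, lam=1.0):
    ce = F.cross_entropy(model(x_benign), y_benign)         # utility on benign data
    log_probs = F.log_softmax(model(x_query), dim=1)
    n_classes = log_probs.size(1)
    uniform = torch.full_like(log_probs, 1.0 / n_classes)   # uninformative target
    kl_to_uniform = F.kl_div(log_probs, uniform, reduction="batchmean")
    return ce + lam * kl_to_uniform                          # push query outputs toward uniform
```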
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries; it is effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
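As a generic illustration of multi-metric screening of client updates in federated learning, the sketch below drops updates whose norm or direction looks anomalous before aggregation. The metrics and thresholds are assumptions and are not the specific tests used by MESAS.

```python
# Hypothetical sketch: filter federated-learning client updates using two
# simple statistics (norm z-score, cosine similarity to the mean update).
# Metrics and thresholds are illustrative, not those of MESAS.
import torch

def filter_client_updates(updates, norm_z=2.5, cos_threshold=0.0):
    # updates: list of flattened parameter-update tensors, one per client
    stacked = torch.stack(updates)
    norms = stacked.norm(dim=1)
    mean_update = stacked.mean(dim=0)
    cosines = torch.nn.functional.cosine_similarity(stacked, mean_update.unsqueeze(0))
    norm_scores = (norms - norms.mean()) / (norms.std() + 1e-8)   # z-score of update norms
    keep = (norm_scores.abs() < norm_z) & (cosines > cos_threshold)
    kept = stacked[keep]
    return kept.mean(dim=0) if len(kept) > 0 else mean_update      # aggregate the survivors
```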
- DODEM: DOuble DEfense Mechanism Against Adversarial Attacks Towards Secure Industrial Internet of Things Analytics [8.697883716452385]
We propose a double defense mechanism to detect and mitigate adversarial attacks in I-IoT environments.
We first detect if there is an adversarial attack on a given sample using novelty detection algorithms.
If an attack is detected, adversarial retraining provides a more robust model, while standard training is applied to regular samples.
arXiv Detail & Related papers (2023-01-23T22:10:40Z)
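A minimal sketch of the double-defense routing described above, assuming an IsolationForest novelty detector and two pre-trained models (one standard, one adversarially retrained); the detector choice and contamination rate are assumptions, not the paper's configuration.

```python
# Hypothetical sketch: route flagged (novel) samples to an adversarially
# trained model, regular samples to the standard model.
import numpy as np
from sklearn.ensemble import IsolationForest

class DoubleDefense:
    def __init__(self, standard_model, robust_model, clean_features):
        self.standard_model = standard_model          # trained with standard training
        self.robust_model = robust_model              # trained with adversarial retraining
        self.detector = IsolationForest(contamination=0.05, random_state=0)
        self.detector.fit(clean_features)             # learn what benign traffic looks like

    def predict(self, x):
        x = np.atleast_2d(x)
        is_novel = self.detector.predict(x) == -1     # -1 means flagged as an outlier
        outputs = []
        for xi, flagged in zip(x, is_novel):
            model = self.robust_model if flagged else self.standard_model
            outputs.append(model(xi))
        return outputs
```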
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against membership inference attacks (MIAs).
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
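A highly simplified sketch of a relaxed training objective in the spirit of the summary above: once the batch loss reaches a target level `alpha`, the update direction is reversed so the model stops driving the training loss toward zero (and thus stops memorizing). The published RelaxLoss method includes additional mechanisms not shown here; `alpha` is an assumed hyperparameter.

```python
# Hypothetical sketch: stop minimizing once the loss reaches a relaxed target.
import torch.nn.functional as F

def relaxed_loss(logits, targets, alpha=1.0):
    loss = F.cross_entropy(logits, targets)
    if loss.item() >= alpha:
        return loss          # ordinary minimization while loss is above the target
    return -loss             # gentle gradient ascent once the target is reached
```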
- I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences [0.1031296820074812]
We study model stealing attacks, assessing their performance and exploring corresponding defence techniques in different settings.
We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence based on the goal and available resources.
arXiv Detail & Related papers (2022-06-16T21:16:41Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective on substitute training that focuses on designing the distribution of the data used in the knowledge-stealing process.
The combination of these two modules can further boost the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
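A bare-bones sketch of substitute training against a black-box target: query the target for soft labels and distill them into a local substitute. The paper's actual contribution, designing the query data distribution, is abstracted into `query_pool`, which is an assumed placeholder, as are the optimizer settings.

```python
# Hypothetical sketch: distill a black-box target's outputs into a substitute.
import torch
import torch.nn.functional as F

def train_substitute(substitute, target_query_fn, query_pool, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        for x in query_pool:                            # batches of query inputs
            with torch.no_grad():
                soft_labels = target_query_fn(x)        # black-box probability outputs
            log_probs = F.log_softmax(substitute(x), dim=1)
            loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute
```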
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
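To make the GAN-based inversion idea in the last entry concrete, here is a hedged sketch of a generic GAN-prior model-inversion loop that searches the generator's latent space for inputs the target model confidently assigns to the attacked class. The generator/classifier interfaces, step count, and learning rate are assumptions; the paper's inversion-specific GAN and its distillation from public data are not reproduced here.

```python
# Hypothetical sketch: optimize a GAN latent code so the target classifier
# assigns the generated image to the attacked class with high confidence.
import torch
import torch.nn.functional as F

def gan_inversion_attack(generator, target_model, target_class,
                         latent_dim=100, steps=500, lr=0.05):
    z = torch.randn(1, latent_dim, requires_grad=True)     # latent code to optimize
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        image = generator(z)                               # stay on the GAN's image manifold
        logits = target_model(image)
        loss = F.cross_entropy(logits, torch.tensor([target_class]))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(z).detach()                           # reconstructed representative input
```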
This list is automatically generated from the titles and abstracts of the papers on this site.