Bilateral Dependency Optimization: Defending Against Model-inversion
Attacks
- URL: http://arxiv.org/abs/2206.05483v1
- Date: Sat, 11 Jun 2022 10:07:03 GMT
- Title: Bilateral Dependency Optimization: Defending Against Model-inversion
Attacks
- Authors: Xiong Peng, Feng Liu, Jingfen Zhang, Long Lan, Junjie Ye, Tongliang
Liu, Bo Han
- Abstract summary: We propose a bilateral dependency optimization (BiDO) strategy to defend against model-inversion attacks.
BiDO achieves the state-of-the-art defense performance for a variety of datasets, classifiers, and MI attacks.
- Score: 61.78426165008083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Using only a well-trained classifier, model-inversion (MI) attacks
can recover the data used to train that classifier, leading to privacy leakage
of the training data. To defend against MI attacks, previous work adopts a
unilateral dependency optimization strategy, i.e., minimizing the dependency
between inputs (features) and outputs (labels) while training the classifier.
However, this minimization conflicts with minimizing the supervised loss, which
aims to maximize the dependency between inputs and outputs, creating an
explicit trade-off between robustness against MI attacks and utility on the
classification task. In this paper, we instead minimize the dependency between
the latent representations and the inputs while maximizing the dependency
between the latent representations and the outputs, a strategy we call
bilateral dependency optimization (BiDO). In particular, we use these
dependency constraints as a universally applicable regularizer added to
commonly used losses for deep neural networks (e.g., cross-entropy); the
regularizer can be instantiated with dependency criteria appropriate to
different tasks. To verify the efficacy of the strategy, we propose two
implementations of BiDO based on two different dependency measures: BiDO with
constrained covariance (BiDO-COCO) and BiDO with the Hilbert-Schmidt
Independence Criterion (BiDO-HSIC). Experiments show that BiDO achieves
state-of-the-art defense performance across a variety of datasets, classifiers,
and MI attacks while incurring only a minor classification-accuracy drop
compared to an undefended, well-trained classifier, opening a new avenue for
defending against MI attacks.
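The recipe in the abstract (a dependency regularizer added to the usual cross-entropy loss) can be sketched in a few lines. Below is a minimal PyTorch illustration of a BiDO-HSIC-style training objective, assuming a classifier that also returns one latent representation; the Gaussian kernels, bandwidths, and the weights lambda_x / lambda_y are illustrative assumptions rather than the paper's exact settings.

```python
# Minimal sketch of a BiDO-HSIC-style objective (not the authors' code).
# Assumptions: `model(x)` returns (logits, hidden) where `hidden` is one latent
# layer; the kernel bandwidths and lambda_x / lambda_y are illustrative only.
import torch
import torch.nn.functional as F

def gaussian_kernel(x, sigma=5.0):
    # Pairwise Gaussian (RBF) kernel matrix over flattened features.
    x = x.flatten(1)
    d2 = torch.cdist(x, x).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(Kx, Ky):
    # Biased empirical HSIC estimator: tr(Kx H Ky H) / (n - 1)^2,
    # where H = I - (1/n) * ones(n, n) is the centering matrix.
    n = Kx.size(0)
    H = torch.eye(n, device=Kx.device) - 1.0 / n
    return torch.trace(Kx @ H @ Ky @ H) / (n - 1) ** 2

def bido_hsic_loss(model, x, y, lambda_x=0.05, lambda_y=0.5):
    logits, z = model(x)
    ce = F.cross_entropy(logits, y)
    Kx, Kz = gaussian_kernel(x), gaussian_kernel(z)
    Ky = gaussian_kernel(F.one_hot(y, logits.size(1)).float(), sigma=1.0)
    # Penalize input-latent dependency, reward latent-label dependency,
    # on top of the usual supervised cross-entropy loss.
    return ce + lambda_x * hsic(Kx, Kz) - lambda_y * hsic(Kz, Ky)
```

Swapping the HSIC estimator for a constrained-covariance estimate would give the BiDO-COCO variant; the bilateral structure of the regularizer stays the same.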
Related papers
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework for defending against samples crafted by minimally perturbing a given sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model, with the added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)