Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
- URL: http://arxiv.org/abs/2211.12005v3
- Date: Wed, 12 Apr 2023 11:04:04 GMT
- Title: Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
- Authors: Sizhe Chen, Geng Yuan, Xinwen Cheng, Yifan Gong, Minghai Qin, Yanzhi
Wang, Xiaolin Huang
- Abstract summary: Self-ensemble protection (SEP) is proposed to prevent good models from being trained on released data.
SEP is verified to be a new state-of-the-art, e.g., our small perturbations reduce the accuracy of a CIFAR-10 ResNet18 from 94.56% to 14.68%, compared to 41.35% by the best-known method.
- Score: 41.45649235969172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As data becomes increasingly vital, a company would be very cautious about
releasing data, because the competitors could use it to train high-performance
models, thereby posing a tremendous threat to the company's commercial
competitiveness. To prevent good models from being trained on the data, we can add
imperceptible perturbations to it. Since such perturbations aim at hurting the
entire training process, they should reflect the vulnerability of DNN training,
rather than that of a single model. Based on this new idea, we seek perturbed
examples that are always unrecognized (never correctly classified) in training.
In this paper, we uncover them by model checkpoints' gradients, forming the
proposed self-ensemble protection (SEP), which is very effective because (1)
learning on examples ignored during normal training tends to yield DNNs
ignoring normal examples; (2) checkpoints' cross-model gradients are close to
orthogonal, meaning that they are as diverse as DNNs with different
architectures. That is, our ensemble-level performance requires only the
computation of training a single model. In extensive experiments with 9 baselines on
3 datasets and 5 architectures, SEP is verified to be a new state-of-the-art,
e.g., our small $\ell_\infty=2/255$ perturbations reduce the accuracy of a
CIFAR-10 ResNet18 from 94.56% to 14.68%, compared to 41.35% by the best-known
method. Code is available at https://github.com/Sizhe-Chen/SEP.
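The official implementation is in the repository above; as a rough illustration of the checkpoint-ensemble idea only, here is a minimal PyTorch-style sketch, where the checkpoint list, step size, number of steps, and the error-minimizing objective are assumptions rather than the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def self_ensemble_perturb(x, y, checkpoints, eps=2/255, alpha=0.5/255, steps=30):
    """Craft protective perturbations with an ensemble of checkpoints
    saved during a single training run (illustrative sketch only)."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        grad = torch.zeros_like(x)
        for model in checkpoints:
            model.eval()
            # Checkpoints' gradients are close to orthogonal, so summing
            # them approximates querying an ensemble of diverse models.
            loss = F.cross_entropy(model(x + delta), y)
            grad += torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            # Error-minimizing update: keep the examples "unrecognized"
            # by pushing the averaged loss down, within an L_inf ball.
            delta -= alpha * grad.sign()
            delta.clamp_(-eps, eps)
    return (x + delta).detach()
```

Because every checkpoint comes from one run, the "ensemble" costs no more training than a single model; only the perturbation-crafting passes are added.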
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
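A minimal sketch of such a multi-objective loss follows; the names feature_extractor and classifier_head and the cosine-based alignment term are placeholders, not MOREL's actual design:

```python
import torch.nn.functional as F

def multi_objective_loss(feature_extractor, classifier_head, x_clean, x_adv, y, lam=1.0):
    """Classification loss plus a term that pulls features of clean and
    perturbed views of the same (same-class) input together."""
    f_clean = feature_extractor(x_clean)
    f_adv = feature_extractor(x_adv)
    ce = F.cross_entropy(classifier_head(f_adv), y)
    # Alignment term: 1 - cosine similarity, averaged over the batch.
    align = 1.0 - F.cosine_similarity(f_clean, f_adv, dim=1).mean()
    return ce + lam * align
```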
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs [9.010643838773477]
We introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data.
We show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously appreciated.
We use our observations to construct a new ensemble method, based solely on the training history of a single network, which provides significant improvement without any additional cost in training time.
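A training-free ensemble built solely from the checkpoints of one run can be as simple as averaging their softmax outputs; the paper's forgetting-based selection of which checkpoints to keep is not reproduced in this sketch:

```python
import torch

@torch.no_grad()
def history_ensemble(checkpoints, x):
    """Average softmax predictions over checkpoints of one training run."""
    probs = None
    for model in checkpoints:
        model.eval()
        p = torch.softmax(model(x), dim=1)
        probs = p if probs is None else probs + p
    return probs / len(checkpoints)
```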
arXiv Detail & Related papers (2023-10-17T09:22:22Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may come from biases in data acquisition rather than from the task itself.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
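As a hedged illustration of hybrid discriminative-generative training with an autoencoder, one could combine a classification loss on the latent code with a reconstruction loss; the mutual-information (nuisance) constraint of the actual method is not implemented here:

```python
import torch.nn.functional as F

def hybrid_loss(encoder, decoder, classifier, x, y, beta=1.0):
    """Discriminative term (classify from the code) plus a generative
    term (reconstruct the input from the same code)."""
    z = encoder(x)
    ce = F.cross_entropy(classifier(z), y)
    recon = F.mse_loss(decoder(z), x)
    return ce + beta * recon
```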
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$.
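A plausible form of a branch-orthogonality regularizer is a penalty on the pairwise cosine similarity of branch features; the exact term used by BORT may differ:

```python
import torch.nn.functional as F

def branch_orthogonality_penalty(branch_features):
    """Penalize overlap between the feature spaces of different branches.
    `branch_features`: list of (batch, dim) tensors, one per branch."""
    feats = [F.normalize(f, dim=1) for f in branch_features]
    penalty = 0.0
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            # Squared cosine similarity between branch i and branch j.
            penalty = penalty + (feats[i] * feats[j]).sum(dim=1).pow(2).mean()
    return penalty
```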
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
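One simple way to make the learning target "more achievable" is to stop driving the loss below a fixed level; this stand-in only mimics the spirit of RelaxLoss, whose actual scheme also flattens the posterior:

```python
import torch.nn.functional as F

def relaxed_loss(logits, y, target=1.0):
    """Stop pushing the training loss below `target`: once it is lower,
    reverse the gradient so the model does not become overconfident on
    its training members (the signal membership inference exploits)."""
    ce = F.cross_entropy(logits, y)
    return ce if ce.item() >= target else -ce
```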
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Adversarial Unlearning: Reducing Confidence Along Adversarial Directions [88.46039795134993]
We propose a complementary regularization strategy that reduces confidence on self-generated examples.
The method, which we call RCAD, aims to reduce confidence on out-of-distribution examples lying along directions adversarially chosen to increase training loss.
Despite its simplicity, we find on many classification benchmarks that RCAD can be added to existing techniques to increase test accuracy by 1-3% in absolute value.
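A sketch of an RCAD-style regularizer: step along the direction that increases training loss, then maximize predictive entropy on the resulting examples (step size and weighting are placeholders):

```python
import torch
import torch.nn.functional as F

def rcad_style_regularizer(model, x, y, step=0.03):
    """Take a large step along the direction that increases training loss,
    then maximize predictive entropy (reduce confidence) on the result."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    x_adv = (x + step * grad.sign()).detach()
    probs = torch.softmax(model(x_adv), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
    return -entropy  # add (with a small weight) to the usual training loss
```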
arXiv Detail & Related papers (2022-06-03T02:26:24Z)
- One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks [28.502489028888608]
Unlearnable examples (ULEs) aim to protect data from unauthorized usage for training DNNs.
Under adversarial training, the unlearnability of error-minimizing noise severely degrades.
We propose a novel model-free method, named One-Pixel Shortcut, which perturbs only a single pixel of each image and makes the dataset unlearnable.
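Applying such a shortcut is trivial once class-wise pixel positions and values have been chosen; the model-free search for them, which is the paper's contribution, is not shown in this sketch:

```python
import torch

def apply_one_pixel_shortcut(images, labels, pixel_map):
    """Overwrite one class-dependent pixel per image.
    `pixel_map[c] = (row, col, value)` is assumed to come from the
    paper's model-free search."""
    out = images.clone()
    for c, (row, col, value) in pixel_map.items():
        mask = labels == c
        out[mask, :, row, col] = value
    return out
```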
arXiv Detail & Related papers (2022-05-24T15:17:52Z)
- DeepObliviate: A Powerful Charm for Erasing Data Residual Memory in Deep Neural Networks [7.687838702806964]
We propose an approach, dubbed as DeepObliviate, to implement machine unlearning efficiently.
Our approach improves the original training process by storing intermediate models on the hard disk.
Compared to the method of retraining from scratch, our approach can achieve 99.0%, 95.0%, 91.9%, 96.7%, 74.1% accuracy rates and 66.7$\times$, 75.0$\times$, 33.3$\times$, 29.4$\times$, 13.7$\times$ speedups.
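The bookkeeping behind this kind of unlearning can be sketched as block-wise training with on-disk checkpoints, so that deleting a sample only requires re-training from the checkpoint preceding its block; the helper names below are placeholders, not DeepObliviate's API:

```python
import os
import torch

def train_with_block_checkpoints(model, optimizer, blocks, ckpt_dir, train_one_block):
    """Train block by block and store the model after each block, so a
    later deletion only requires re-training from the checkpoint that
    precedes the block containing the deleted data."""
    os.makedirs(ckpt_dir, exist_ok=True)
    for i, block in enumerate(blocks):
        train_one_block(model, optimizer, block)  # placeholder routine
        torch.save(model.state_dict(), os.path.join(ckpt_dir, f"block_{i}.pt"))
```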
arXiv Detail & Related papers (2021-05-13T12:02:04Z)
- To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data.
We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses.
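One ingredient of such a framework could be class- or group-wise reweighting of the robust loss; FRL also adapts per-group perturbation margins, which this sketch omits:

```python
import torch.nn.functional as F

def group_reweighted_loss(logits, y, group_weights):
    """Up-weight the loss of groups (here: classes) whose robust accuracy
    lags behind. `group_weights` is a 1-D tensor indexed by label."""
    per_example = F.cross_entropy(logits, y, reduction="none")
    return (group_weights[y] * per_example).mean()
```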
arXiv Detail & Related papers (2020-10-13T02:21:54Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
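The generic optimization behind GAN-based model inversion looks roughly like the following; the inversion-specific GAN and the public-data distillation that the paper contributes are not reproduced here, and the latent size and optimizer settings are placeholders:

```python
import torch
import torch.nn.functional as F

def invert_class(generator, target_model, target_class, latent_dim=128, steps=200, lr=0.05):
    """Search the generator's latent space for an input that the target
    model classifies as `target_class` with high confidence."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    label = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(target_model(generator(z)), label)
        loss.backward()
        opt.step()
    return generator(z).detach()
```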
arXiv Detail & Related papers (2020-10-08T16:20:48Z)