Disrupting Model Training with Adversarial Shortcuts
- URL: http://arxiv.org/abs/2106.06654v1
- Date: Sat, 12 Jun 2021 01:04:41 GMT
- Title: Disrupting Model Training with Adversarial Shortcuts
- Authors: Ivan Evtimov and Ian Covert and Aditya Kusupati and Tadayoshi Kohno
- Abstract summary: We present a proof-of-concept approach for the image classification setting.
We propose methods based on the notion of adversarial shortcuts, which encourage models to rely on non-robust signals rather than semantic features.
- Score: 12.31803688544684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When data is publicly released for human consumption, it is unclear how to
prevent its unauthorized usage for machine learning purposes. Successful model
training may be preventable with carefully designed dataset modifications, and
we present a proof-of-concept approach for the image classification setting. We
propose methods based on the notion of adversarial shortcuts, which encourage
models to rely on non-robust signals rather than semantic features, and our
experiments demonstrate that these measures successfully prevent deep learning
models from achieving high accuracy on real, unmodified data examples.
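The abstract describes adversarial shortcuts only at a high level. As a rough illustration, one simple form such a shortcut could take is a small class-correlated pixel pattern blended into every released training image; the sketch below is an assumption about what this might look like, not the authors' exact construction, and the function name, patch placement, and blending weight are all illustrative.

```python
import numpy as np

def add_adversarial_shortcut(images, labels, num_classes,
                             patch_size=4, intensity=0.1, seed=0):
    """Stamp a class-correlated pixel pattern ("shortcut") onto each image.

    Illustrative assumption: every class gets one fixed random patch placed
    in the top-left corner. A model can minimize its training loss by keying
    on the patch instead of the semantic content of the image.

    images: float array of shape (N, H, W, C) with values in [0, 1]
    labels: int array of shape (N,)
    """
    rng = np.random.default_rng(seed)
    # One fixed random pattern per class.
    patterns = rng.uniform(0.0, 1.0,
                           size=(num_classes, patch_size, patch_size, images.shape[-1]))
    modified = images.copy()
    for i, y in enumerate(labels):
        # Blend the class pattern into the corner of the image.
        modified[i, :patch_size, :patch_size, :] = (
            (1.0 - intensity) * modified[i, :patch_size, :patch_size, :]
            + intensity * patterns[y]
        )
    return np.clip(modified, 0.0, 1.0)
```

A model trained on the modified images can drive its training loss down by relying on the corner patch alone (a non-robust signal), so it never learns the semantic features needed to classify real, unmodified examples, which is the failure mode the abstract reports.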
Related papers
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Nonlinear Transformations Against Unlearnable Datasets [4.876873339297269]
Automated scraping stands out as a common method for collecting data for deep learning models without the authorization of data owners.
Recent studies have begun to tackle the privacy concerns associated with this data collection method.
The data generated by those approaches, called "unlearnable" examples, prevent deep learning models from "learning" useful features from them.
arXiv Detail & Related papers (2024-06-05T03:00:47Z) - Segue: Side-information Guided Generative Unlearnable Examples for
Facial Privacy Protection in Real World [64.4289385463226]
We propose Segue: Side-information guided generative unlearnable examples.
To improve transferability, we introduce side information such as true labels and pseudo labels.
It can resist JPEG compression, adversarial training, and some standard data augmentations.
arXiv Detail & Related papers (2023-10-24T06:22:37Z) - Flew Over Learning Trap: Learn Unlearnable Samples by Progressive Staged
Training [28.17601195439716]
Unlearnable-example techniques generate unlearnable samples by adding imperceptible perturbations to data before it is publicly released (a simplified sketch of this kind of loss-minimizing perturbation appears after this list).
We perform an in-depth analysis and observe that models can learn both the image features and the perturbation features of unlearnable samples at an early stage of training.
We propose Progressive Staged Training to effectively prevent models from overfitting to the perturbation features.
arXiv Detail & Related papers (2023-06-03T09:36:16Z) - Re-thinking Data Availablity Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z) - Enhancing Multiple Reliability Measures via Nuisance-extended
Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can stem from biases in data acquisition rather than from meaningful features.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Learning to Unlearn: Instance-wise Unlearning for Pre-trained
Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete the information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z) - Monitoring Model Deterioration with Explainable Uncertainty Estimation
via Non-parametric Bootstrap [0.0]
Monitoring machine learning models once they are deployed is challenging.
It is even more challenging to decide when to retrain models in real-world scenarios where labeled data is out of reach.
In this work, we use non-parametric bootstrapped uncertainty estimates and SHAP values to provide explainable uncertainty estimation.
arXiv Detail & Related papers (2022-01-27T17:23:04Z) - Explain, Edit, and Understand: Rethinking User Study Design for
Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z) - Data-Free Adversarial Perturbations for Practical Black-Box Attack [25.44755251319056]
We present a data-free method for crafting adversarial perturbations that can fool a target model without any knowledge about the training data distribution.
Our method empirically shows that current deep learning models are still at risk even when the attackers do not have access to training data.
arXiv Detail & Related papers (2020-03-03T02:22:12Z)
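Several of the entries above concern "unlearnable" examples built from error-minimizing noise. The sketch below (referenced from the Progressive Staged Training entry) is only a rough, single-level illustration under assumed inputs: the published methods alternate this perturbation step with model updates in a bi-level optimization, and the model, the L_inf budget `eps`, and the step sizes here are illustrative assumptions rather than any specific paper's settings.

```python
import torch
import torch.nn.functional as F

def error_minimizing_noise(model, x, y, eps=8/255, steps=20, step_size=2/255):
    """Craft perturbations that *minimize* the training loss w.r.t. the input
    (the opposite sign of a standard adversarial attack), so the perturbed
    samples carry little remaining gradient signal for a model to learn from.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Descend the loss w.r.t. the input, then project onto the L_inf ball.
            delta -= step_size * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach().clamp(0, 1)
```

Training a classifier on the perturbed images paired with their original labels extracts little usable signal, since the loss on each sample has already been pushed toward its minimum; this is the effect that makes such released data "unlearnable."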
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.