What Can We Learn from Unlearnable Datasets?
- URL: http://arxiv.org/abs/2305.19254v3
- Date: Tue, 7 Nov 2023 21:52:05 GMT
- Title: What Can We Learn from Unlearnable Datasets?
- Authors: Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein
- Abstract summary: Unlearnable datasets have the potential to protect data privacy by preventing deep neural networks from generalizing.
It is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization.
In contrast, we find that networks actually can learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured.
- Score: 107.12337511216228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In an era of widespread web scraping, unlearnable dataset methods have the
potential to protect data privacy by preventing deep neural networks from
generalizing. But in addition to a number of practical limitations that make
their use unlikely, we present several findings that call into question their
ability to safeguard data. First, it is widely believed that neural networks
trained on unlearnable datasets only learn shortcuts, simpler rules that are
not useful for generalization. In contrast, we find that networks actually can
learn useful features that can be reweighted for high test performance,
suggesting that image protection is not assured. Unlearnable datasets are also
believed to induce learning shortcuts through linear separability of added
perturbations. We provide a counterexample, demonstrating that linear
separability of perturbations is not a necessary condition. To emphasize why
linearly separable perturbations should not be relied upon, we propose an
orthogonal projection attack which allows learning from unlearnable datasets
published in ICML 2021 and ICLR 2023. Our proposed attack is significantly less
complex than recently proposed techniques.
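The feature-reweighting finding suggests a simple diagnostic: freeze the network trained on unlearnable data and fit only a fresh linear head on its penultimate-layer features; high probe accuracy on clean test data would contradict the shortcut-only view. A minimal sketch of that recipe, with stand-in arrays in place of real extracted features (all names, shapes, and data here are our own assumptions, not the paper's code):

```python
# Hypothetical linear-probe recipe: "reweighting" features learned on an
# unlearnable dataset means fitting a new linear classifier on frozen
# penultimate-layer features. Random arrays stand in for real features,
# so the printed accuracy is meaningless here; the recipe is the point.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(1000, 512))    # features of training images
train_labels = rng.integers(0, 10, size=1000)
test_feats = rng.normal(size=(200, 512))      # features of clean test images
test_labels = rng.integers(0, 10, size=200)

# Reweighting = a fresh linear head on frozen features.
probe = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)
print("linear-probe test accuracy:", probe.score(test_feats, test_labels))
```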
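The orthogonal projection attack itself is not spelled out in this listing, but the general idea can be sketched under our own assumptions about names and shapes: fit a linear classifier on the flattened unlearnable images (near-perfect training accuracy of that fit would itself flag a linearly separable shortcut), then project the learned class directions out of every image before normal training:

```python
# Hypothetical sketch of an orthogonal-projection style attack, not the
# paper's exact procedure. Random arrays stand in for unlearnable images.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3 * 32 * 32))   # flattened unlearnable images
y = rng.integers(0, 10, size=1000)         # their labels

# 1) Fit a linear classifier; near-1.0 training accuracy would indicate
#    a linearly separable perturbation shortcut in the data.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("linear training accuracy:", clf.score(X, y))
W = clf.coef_                              # class directions, shape (10, 3072)

# 2) Orthonormalize the class directions and project them out of each image.
Q, _ = np.linalg.qr(W.T)                   # columns span the class directions
X_clean = X - (X @ Q) @ Q.T                # orthogonal-complement projection
# X_clean can now be used to train a network as if the shortcut were absent.
```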
Related papers
- Unlearnable 3D Point Clouds: Class-wise Transformation Is All You Need [24.18942067770636]
Unlearnable strategies have been proposed to prevent unauthorized users from training on 2D image data.
We propose the first integral unlearnable framework for 3D point clouds, comprising two processes.
Both theoretical and empirical results demonstrate the effectiveness of our proposed unlearnable framework.
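Neither process is described in this summary; purely as a toy illustration of what a class-wise transformation could look like (our own construction, not the paper's framework), each class can be tagged with its own fixed orthogonal transform:

```python
# Toy class-wise transformation (an assumption, not the paper's method):
# every point cloud is transformed by a matrix chosen per class, so the label
# becomes encoded in a simple geometric cue rather than in object shape.
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(rng):
    # QR of a Gaussian matrix yields a random 3x3 orthogonal transform.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.diag(r))

num_classes = 10
class_transforms = [random_orthogonal(rng) for _ in range(num_classes)]

clouds = rng.normal(size=(100, 1024, 3))        # 100 clouds of 1024 points
labels = rng.integers(0, num_classes, size=100)

# Apply each cloud's class-specific transform.
unlearnable = np.stack([c @ class_transforms[y].T for c, y in zip(clouds, labels)])
```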
arXiv Detail & Related papers (2024-10-04T17:49:32Z)
- Nonlinear Transformations Against Unlearnable Datasets [4.876873339297269]
Automated scraping stands out as a common method for collecting training data for deep learning models without the authorization of data owners.
Recent studies have begun to tackle the privacy concerns associated with this data collection method.
The data generated by these approaches, called "unlearnable" examples, cannot be "learned" by deep learning models.
arXiv Detail & Related papers (2024-06-05T03:00:47Z)
- Ungeneralizable Examples [70.76487163068109]
Current approaches to creating unlearnable data involve incorporating small, specially designed noise.
We extend the concept of unlearnable data to conditional data learnability and introduce UnGeneralizable Examples (UGEs).
UGEs exhibit learnability for authorized users while maintaining unlearnability for potential hackers.
arXiv Detail & Related papers (2024-04-22T09:29:14Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay [52.251188477192336]
Few-shot class-incremental learning (FSCIL) aims to enable a deep learning system to incrementally learn new classes from limited data.
We show through empirical results that adopting data replay is surprisingly favorable.
We propose using data-free replay that can synthesize data by a generator without accessing real data.
arXiv Detail & Related papers (2022-07-22T17:30:51Z)
- Training Deep Networks from Zero to Hero: avoiding pitfalls and going beyond [59.94347858883343]
This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful for datasets that are not as well-prepared as those in challenges.
arXiv Detail & Related papers (2021-09-06T21:31:42Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- Unsupervised Deep Learning by Injecting Low-Rank and Sparse Priors [5.5586788751870175]
We focus on employing sparsity-inducing priors in deep learning to encourage the network to concisely capture the nature of high-dimensional data.
We demonstrate unsupervised learning of U-Net for background subtraction using low-rank and sparse priors.
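As a generic illustration of low-rank and sparse priors (a classic robust-PCA-style alternation, not the paper's U-Net training), a matrix of vectorized frames can be split into a low-rank background and a sparse foreground:

```python
# Illustrative low-rank + sparse decomposition via alternating thresholding.
# Toy data; thresholds and iteration count are arbitrary choices.
import numpy as np

def svt(M, tau):
    # Singular value thresholding promotes a low-rank estimate.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    # Elementwise soft thresholding promotes a sparse estimate.
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 400))   # 50 vectorized video frames

L = np.zeros_like(frames)             # low-rank part: static background
S = np.zeros_like(frames)             # sparse part: moving foreground
for _ in range(25):
    L = svt(frames - S, tau=1.0)
    S = soft(frames - L, tau=0.5)
```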
arXiv Detail & Related papers (2021-06-21T08:41:02Z)
- Over-parametrized neural networks as under-determined linear systems [31.69089186688224]
We show that it is unsurprising that simple neural networks can achieve zero training loss.
We show that kernels typically associated with the ReLU activation function have fundamental flaws.
We propose new activation functions that avoid the pitfalls of ReLU in that they admit zero training loss solutions for any set of distinct data points.
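The under-determined-system view is easy to make concrete: once a model has more (random) features than training points, least squares interpolates arbitrary labels exactly. A toy sketch with our own choice of sizes and feature map:

```python
# With more columns (parameters) than rows (data points), the linear system
# is under-determined, so zero training loss is attainable for any labels.
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 50, 10, 500                 # 50 points, 10 inputs, 500 random features

X = rng.normal(size=(n, d))
y = rng.normal(size=n)                # arbitrary labels

W = rng.normal(size=(d, p))
features = np.tanh(X @ W)             # a random nonlinear feature map (our choice)

# Minimum-norm interpolating solution of the under-determined system.
coef, *_ = np.linalg.lstsq(features, y, rcond=None)
print("training loss:", np.sum((features @ coef - y) ** 2))  # ~0 up to rounding
```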
arXiv Detail & Related papers (2020-10-29T21:43:00Z)