Mixed-Privacy Forgetting in Deep Networks
- URL: http://arxiv.org/abs/2012.13431v1
- Date: Thu, 24 Dec 2020 19:34:56 GMT
- Title: Mixed-Privacy Forgetting in Deep Networks
- Authors: Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto
- Abstract summary: We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in a mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
- Score: 114.3840147070712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We show that the influence of a subset of the training samples can be removed
-- or "forgotten" -- from the weights of a network trained on large-scale image
classification tasks, and we provide strong computable bounds on the amount of
remaining information after forgetting. Inspired by real-world applications of
forgetting techniques, we introduce a novel notion of forgetting in a
mixed-privacy setting, where we know that a "core" subset of the training
samples does not need to be forgotten. While this variation of the problem is
conceptually simple, we show that working in this setting significantly
improves the accuracy and guarantees of forgetting methods applied to vision
classification tasks. Moreover, our method allows efficient removal of all
information contained in non-core data by simply setting to zero a subset of
the weights with minimal loss in performance. We achieve these results by
replacing a standard deep network with a suitable linear approximation. With
appropriate changes to the network architecture and training procedure, we show
that such a linear approximation achieves comparable performance to the original
network and that the forgetting problem becomes quadratic and can be solved
efficiently even for large models. Unlike previous forgetting methods on deep
networks, ours can achieve close to state-of-the-art accuracy on large-scale
vision tasks. In particular, we show that our method allows forgetting
without having to trade off the model accuracy.
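The abstract's claim that forgetting "becomes quadratic" once the model is linear can be illustrated with a toy sketch. This is not the paper's implementation: the scalar model, the split into `w_core` and `w_user`, the ridge penalty `lam`, and the function names are all hypothetical choices made for illustration. With a model that is linear in its weights and a squared loss, the retained-data objective is quadratic, so a single Newton step from the current weights lands on the same solution as retraining without the forgotten sample.

```python
# Toy illustration only (hypothetical names, not the paper's code).
# Scalar model f(x) = (w_core + w_user) * x, where w_core is assumed to be
# fitted on "core" data that never needs forgetting, and w_user is fitted on
# user data with a ridge penalty lam * w_user**2 / 2.

def fit_user_weight(data, w_core, lam):
    """Closed-form minimizer of the quadratic user-data loss over w_user."""
    num = sum(x * (y - w_core * x) for x, y in data)
    den = sum(x * x for x, _ in data) + lam
    return num / den

def forget(w_user, retained, w_core, lam):
    """One Newton step on the retained-data loss.

    Because the loss is quadratic in w_user, this single step reaches the
    minimizer that retraining on `retained` alone would produce.
    """
    grad = sum(x * ((w_core + w_user) * x - y) for x, y in retained) + lam * w_user
    hess = sum(x * x for x, _ in retained) + lam
    return w_user - grad / hess

w_core, lam = 1.5, 0.1
user_data = [(1.0, 2.0), (2.0, 3.5), (3.0, 7.0)]

w_all = fit_user_weight(user_data, w_core, lam)      # trained on all user data
retained = user_data[:2]                             # forget the last sample
w_forgot = forget(w_all, retained, w_core, lam)      # Newton-step forgetting
w_retrain = fit_user_weight(retained, w_core, lam)   # retrain from scratch

# Forgetting every user sample drives w_user to (numerically) zero,
# leaving only the core weight.
w_none = forget(w_all, [], w_core, lam)
```

In this toy setting `w_forgot` matches `w_retrain` up to floating-point rounding, and `w_none` is zero up to rounding, which mirrors the abstract's remark that all non-core information can be removed by zeroing a subset of the weights. A real network would need the linearization and training changes the abstract mentions, which this sketch does not attempt.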
Related papers
- KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared to the baseline.
arXiv Detail & Related papers (2023-10-16T06:19:29Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
Robustness of large-scale models toward real-world settings is still a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on complete model fine-tuning might not be a scalable option for very large-scale networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
- One-Shot Pruning for Fast-adapting Pre-trained Models on Devices [28.696989086706186]
Large-scale pre-trained models have been remarkably successful in resolving downstream tasks.
Deploying these models on low-capability devices still requires an effective approach, such as model pruning.
We present a scalable one-shot pruning method that leverages pruned knowledge of similar tasks to extract a sub-network from the pre-trained model for a new task.
arXiv Detail & Related papers (2023-07-10T06:44:47Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z)
- Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
- Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z)
- HALO: Learning to Prune Neural Networks with Shrinkage [5.283963846188862]
Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data.
Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity-inducing penalty, and (3) training a binary mask jointly with the weights of the network.
We present a novel penalty called Hierarchical Adaptive Lasso which learns to adaptively sparsify weights of a given network via trainable parameters.
arXiv Detail & Related papers (2020-08-24T04:08:48Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Differentiable Sparsification for Deep Neural Networks [0.0]
We propose a fully differentiable sparsification method for deep neural networks.
The proposed method can learn both the sparsified structure and weights of a network in an end-to-end manner.
To the best of our knowledge, this is the first fully differentiable sparsification method.
arXiv Detail & Related papers (2019-10-08T03:57:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.