Differentially Private Kernel Inducing Points using features from ScatterNets (DP-KIP-ScatterNet) for Privacy Preserving Data Distillation
- URL: http://arxiv.org/abs/2301.13389v2
- Date: Mon, 22 Apr 2024 17:13:07 GMT
- Title: Differentially Private Kernel Inducing Points using features from ScatterNets (DP-KIP-ScatterNet) for Privacy Preserving Data Distillation
- Authors: Margarita Vinaroz, Mi Jung Park
- Abstract summary: We introduce differentially private kernel inducing points (DP-KIP) for privacy-preserving data distillation.
We find that KIP using infinitely-wide convolutional neural tangent kernels (conv-NTKs) performs better than KIP using fully-connected NTKs.
We propose DP-KIP-ScatterNet, which uses the wavelet features from Scattering networks (ScatterNet) instead of those from conv-NTKs.
- Score: 5.041384008847852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data distillation aims to generate a small data set that closely mimics the performance of a given learning algorithm on the original data set. The distilled dataset is hence useful to simplify the training process thanks to its small data size. However, distilled data samples are not necessarily privacy-preserving, even if they are generally humanly indiscernible. To address this limitation, we introduce differentially private kernel inducing points (DP-KIP) for privacy-preserving data distillation. Unlike our original intention to simply apply DP-SGD to the framework of KIP, we find that KIP using infinitely-wide convolutional neural tangent kernels (conv-NTKs) performs better than KIP using fully-connected NTKs. However, KIP with conv-NTKs, due to its convolutional and pooling operations, introduces an unbearable computational complexity, requiring hundreds of V100 GPUs in parallel to train, which is impractical and, more importantly, such computational resources are inaccessible to many. To overcome this issue, we propose an alternative that does not require pre-training (to avoid a privacy loss) and can well capture complex information on images, as those features from conv-NTKs do, while the computational cost is manageable by a single V100 GPU. To this end, we propose DP-KIP-ScatterNet, which uses the wavelet features from Scattering networks (ScatterNet) instead of those from conv-NTKs, to perform DP-KIP at a reasonable computational cost. We implement DP-KIP-ScatterNet in -- computationally efficient -- JAX and test it on several popular image datasets to show its efficacy and its superior performance compared to state-of-the-art methods in image data distillation with differential privacy guarantees.
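To make the mechanism concrete, the following is a minimal sketch of the DP-KIP idea in JAX: the distilled (inducing) points are optimized with DP-SGD, i.e., per-example gradients of a kernel ridge regression loss on the real data are clipped and noised. The `scatter_features` function is a hypothetical placeholder for the paper's fixed ScatterNet wavelet features, and the hyperparameters are illustrative, not the paper's.

```python
import jax
import jax.numpy as jnp

def scatter_features(x):
    # Placeholder for ScatterNet wavelet features; the paper uses fixed
    # (untrained) Scattering-network filters, not reproduced here.
    return x.ravel()

def krr_loss_one(x_dist, y_dist, x_t, y_t, lam=1e-6):
    # KIP-style loss for a single target example: kernel ridge regression
    # fitted on the distilled set, evaluated on (x_t, y_t).
    F = jax.vmap(scatter_features)(x_dist)               # (m, d)
    K = F @ F.T                                          # kernel on distilled set
    alpha = jnp.linalg.solve(K + lam * jnp.eye(K.shape[0]), y_dist)
    pred = (scatter_features(x_t) @ F.T) @ alpha
    return jnp.sum((y_t - pred) ** 2)

def dp_kip_step(x_dist, y_dist, x_batch, y_batch, key,
                lr=0.1, clip=1.0, sigma=1.0):
    # DP-SGD on the distilled images: per-example gradients, clipped to
    # norm `clip`, summed, then perturbed with Gaussian noise.
    grads = jax.vmap(jax.grad(krr_loss_one), (None, None, 0, 0))(
        x_dist, y_dist, x_batch, y_batch)                # (B, *x_dist.shape)
    flat = grads.reshape(grads.shape[0], -1)
    norms = jnp.linalg.norm(flat, axis=1) + 1e-12
    clipped = (flat * jnp.minimum(1.0, clip / norms)[:, None]).sum(0)
    noisy = clipped + sigma * clip * jax.random.normal(key, clipped.shape)
    return x_dist - lr * noisy.reshape(x_dist.shape) / x_batch.shape[0]
```

Privacy accounting (tracking the (epsilon, delta) spent across steps) is omitted; only the clip-and-noise mechanics are shown.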
Related papers
- Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption [11.706881389387242]
Fully homomorphic encryption (FHE) is a viable approach for achieving private inference (PI).
An FHE implementation of a CNN faces significant hurdles, primarily due to FHE's substantial computational and memory overhead.
We propose a set of optimizations, which includes GPU/ASIC acceleration, an efficient activation function, and an optimized packing scheme.
arXiv Detail & Related papers (2023-10-25T10:24:35Z)
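As a toy illustration of the "efficient activation function" idea in the FHE private-inference entry above: FHE schemes evaluate only additions and multiplications, so non-polynomial activations such as ReLU are commonly replaced by low-degree polynomial fits. A generic sketch (not the paper's specific construction):

```python
import jax.numpy as jnp

# Fit a degree-2 polynomial to ReLU on [-4, 4]; a degree-2 surrogate costs
# only one ciphertext-ciphertext multiplication under CKKS-style FHE.
xs = jnp.linspace(-4.0, 4.0, 512)
A = jnp.stack([xs**2, xs, jnp.ones_like(xs)], axis=1)
coef, _, _, _ = jnp.linalg.lstsq(A, jnp.maximum(xs, 0.0))

def poly_act(x):
    # additions and multiplications only, hence FHE-evaluable
    return coef[0] * x**2 + coef[1] * x + coef[2]
```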
- Efficient Privacy-Preserving Convolutional Spiking Neural Networks with FHE [1.437446768735628]
Fully Homomorphic Encryption (FHE) is a key technology for privacy-preserving computation.
FHE has limitations in processing continuous non-polynomial functions.
We present a framework called FHE-DiCSNN for homomorphic SNNs.
FHE-DiCSNN achieves an accuracy of 97.94% on ciphertexts, with a loss of only 0.53% compared to the original network's accuracy of 98.47%.
arXiv Detail & Related papers (2023-09-16T15:37:18Z)
- CUDA: Convolution-based Unlearnable Datasets [77.70422525613084]
Large-scale training of modern deep learning models heavily relies on publicly available data on the web.
Recent works aim to make data unlearnable for deep learning models by adding small, specially designed noise.
These methods are vulnerable to adversarial training (AT) and/or are computationally heavy.
arXiv Detail & Related papers (2023-03-07T22:57:23Z)
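The gist of the CUDA entry above can be sketched as class-wise convolutional perturbations: each class gets its own random filter, so convolved images carry a class-correlated shortcut that models latch onto instead of real features. A minimal, hypothetical sketch (the filter normalization here is my assumption, not the paper's exact recipe):

```python
import jax
import jax.numpy as jnp

def classwise_perturb(images, labels, key, k=3):
    # images: (N, H, W) grayscale; labels: (N,) integer class ids.
    n_classes = int(labels.max()) + 1
    filters = jax.random.uniform(key, (n_classes, k, k))
    filters = filters / filters.sum(axis=(1, 2), keepdims=True)  # normalize
    def convolve_one(img, label):
        # convolve each image with its own class's filter
        return jax.scipy.signal.convolve2d(img, filters[label], mode='same')
    return jax.vmap(convolve_one)(images, labels)
```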
- Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation [32.83436754714798]
This work considers using the features of neural tangent kernels (NTKs), more precisely empirical NTKs (e-NTKs).
We find that, perhaps surprisingly, the expressiveness of the untrained e-NTK features is comparable to that of perceptual features taken from models pre-trained on public data.
arXiv Detail & Related papers (2023-03-03T03:00:49Z)
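Concretely, the e-NTK features mentioned in the entry above are per-example gradients of an untrained network's output with respect to its parameters; a minimal JAX sketch of that feature map (the tiny two-layer net is my own toy choice):

```python
import jax
import jax.numpy as jnp

def net(params, x):
    W1, W2 = params
    return jnp.tanh(x @ W1) @ W2          # scalar output

def entk_feature(params, x):
    # e-NTK feature map: flattened gradient of the output w.r.t. randomly
    # initialized, untrained parameters; the e-NTK kernel is the inner
    # product of two such feature vectors.
    grads = jax.grad(net)(params, x)
    return jnp.concatenate([g.ravel() for g in grads])

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = (jax.random.normal(k1, (16, 32)), jax.random.normal(k2, (32,)))
phi = entk_feature(params, jnp.ones(16))  # length 16*32 + 32
```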
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
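The RFA idea in the entry above can be sketched with random ReLU features, whose inner products approximate the NNGP kernel of a one-hidden-layer ReLU network (the width m and scaling here are generic textbook choices, not the paper's exact construction):

```python
import jax
import jax.numpy as jnp

def rfa_features(X, key, m=4096):
    # X: (N, d). Random ReLU features phi(x) = sqrt(2/m) * relu(W x),
    # so that phi(X) @ phi(Y).T approximates the NNGP kernel K(X, Y).
    W = jax.random.normal(key, (X.shape[1], m))
    return jnp.sqrt(2.0 / m) * jax.nn.relu(X @ W)

key = jax.random.PRNGKey(0)
X = jax.random.normal(jax.random.PRNGKey(1), (8, 64))
K_approx = rfa_features(X, key) @ rfa_features(X, key).T  # (8, 8)
```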
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it faces challenges when it comes to sharing sparse datasets.
In this work, we consider a variation of k-anonymity, which we call smooth-k-anonymity, and design simple large-scale algorithms that efficiently provide smooth-k-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
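For context on the entry above, a minimal sketch of plain suppression-based k-anonymity; the paper's smooth-k-anonymity relaxes exactly this all-or-nothing behavior for sparse data, so this is not their algorithm:

```python
from collections import Counter

def k_anonymize(records, quasi_ids, k=5):
    # Keep only records whose quasi-identifier combination appears at
    # least k times; everything rarer is suppressed outright.
    keys = [tuple(r[q] for q in quasi_ids) for r in records]
    counts = Counter(keys)
    return [r for r, key in zip(records, keys) if counts[key] >= k]
```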
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
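As a generic illustration of device-edge co-inference with exit points (not the paper's SAC-d policy, which learns the choice by reinforcement learning): run the network block by block and stop at the first intermediate head that is confident enough.

```python
import jax
import jax.numpy as jnp

def early_exit_infer(x, blocks, heads, tau=0.9):
    # blocks: non-empty list of feature extractors; heads: matching
    # classifiers. Exit at the first head whose top softmax probability
    # clears tau; later blocks (e.g., offloaded to the edge) are skipped.
    probs = None
    for block, head in zip(blocks, heads):
        x = block(x)
        probs = jax.nn.softmax(head(x))
        if float(probs.max()) >= tau:
            break
    return int(jnp.argmax(probs)), probs
```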
- Dynamic Probabilistic Pruning: A general framework for hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps).
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP).
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
arXiv Detail & Related papers (2021-05-26T17:01:52Z)
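A minimal sketch of the probabilistic-pruning idea in the DPP entry above; tying keep probabilities to weight magnitude is my simplification, whereas DPP itself learns the sampling distribution:

```python
import jax
import jax.numpy as jnp

def sample_prune_mask(weights, key, keep=0.5):
    # Keep each weight with probability proportional to its magnitude,
    # targeting on average a `keep` fraction of surviving weights.
    p = jnp.abs(weights) / jnp.abs(weights).sum()
    p = jnp.clip(p * keep * weights.size, 0.0, 1.0)
    mask = jax.random.uniform(key, weights.shape) < p
    return weights * mask
```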