Input Perturbation: A New Paradigm between Central and Local
Differential Privacy
- URL: http://arxiv.org/abs/2002.08570v1
- Date: Thu, 20 Feb 2020 05:20:02 GMT
- Title: Input Perturbation: A New Paradigm between Central and Local
Differential Privacy
- Authors: Yilin Kang, Yong Liu, Ben Niu, Xinyi Tong, Likun Zhang and Weiping
Wang
- Abstract summary: We study the \textit{input perturbation} method in differentially private empirical risk minimization (DP-ERM)
We achieve ($\epsilon$,$\delta$)-differential privacy on the final model, along with a degree of privacy on the original data.
Our method achieves almost the same (or even better) performance as some of the best previous central methods, while providing stronger privacy protections.
- Score: 15.943736378291154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditionally, there are two models on differential privacy: the central
model and the local model. The central model focuses on the machine learning
model and the local model focuses on the training data. In this paper, we study
the \textit{input perturbation} method in differentially private empirical risk
minimization (DP-ERM), preserving privacy of the central model. By adding noise
to the original training data and training with the `perturbed data', we
achieve ($\epsilon$,$\delta$)-differential privacy on the final model, along
with a degree of privacy on the original data. We observe an interesting
connection between the local model and the central model: perturbing the
original data perturbs the gradient and, in turn, the model parameters. This
observation means that our method builds a bridge between the local and
central models, protecting the data, the gradient and the model
simultaneously, which makes it superior to previous central methods. Detailed
theoretical analysis and experiments show that our method achieves almost the
same (or even better) performance as some of the best previous central
methods while providing stronger privacy protections, which is an attractive
result. Moreover, we extend our method to a more general case: the loss
function satisfies the Polyak-Lojasiewicz condition, which is more general
than the strong convexity assumed in most previous work.
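The core idea of the abstract, adding noise to the training data once and then running ordinary (non-private) training on the perturbed data, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name, the logistic-regression loss, and the fixed noise scale `sigma` are all assumptions; in the paper the noise would be calibrated from ($\epsilon$,$\delta$) and the data sensitivity.

```python
import numpy as np

def input_perturbation_erm(X, y, sigma, lr=0.1, epochs=200):
    """Sketch of input perturbation for DP-ERM (illustrative only):
    perturb the training inputs once with Gaussian noise, then run
    plain gradient descent on the perturbed data. The noise scale
    sigma is taken as given; a real DP guarantee requires calibrating
    it to the target (epsilon, delta) and the data sensitivity."""
    rng = np.random.default_rng(0)
    X_priv = X + rng.normal(0.0, sigma, size=X.shape)  # perturb inputs once
    n, d = X_priv.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # ordinary logistic-regression gradient, computed on perturbed data
        p = 1.0 / (1.0 + np.exp(-(X_priv @ w)))
        grad = X_priv.T @ (p - y) / n
        w -= lr * grad
    return w
```

Note how the sketch mirrors the "bridge" observation in the abstract: the noise enters only through the data, yet every gradient (and hence the final model) is computed from the perturbed inputs, so the perturbation propagates from data to gradient to parameters.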
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Gradient Inversion of Federated Diffusion Models [4.1355611383748005]
Diffusion models are becoming the de facto generative models, generating exceptionally high-resolution image data.
In this paper, we study the privacy risk of gradient inversion attacks.
We propose a triple-optimization GIDM+ that coordinates the optimization of the unknown data.
arXiv Detail & Related papers (2024-05-30T18:00:03Z) - Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning [112.69497636932955]
Federated learning aims to train models across different clients without the sharing of data for privacy considerations.
We study how data heterogeneity affects the representations of the globally aggregated models.
We propose FedDecorr, a novel method that can effectively mitigate dimensional collapse in federated learning.
arXiv Detail & Related papers (2022-10-01T09:04:17Z) - Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response [6.260747047974035]
Most differentially private (DP) algorithms assume a third party inserts noise to queries made on datasets, or a local model where the users locally perturb their data.
The recently proposed shuffle model is an intermediate framework between the central and the local paradigms.
We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central model.
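The local-randomizer building block named in this paper's title, $k$-randomized response, is simple enough to sketch. The function below is an illustrative implementation of the standard $k$-ary randomized response mechanism, not code from the paper: a user holding a value in $\{0, \dots, k-1\}$ reports it truthfully with probability $e^\epsilon / (e^\epsilon + k - 1)$ and otherwise reports a uniformly random other value, which satisfies $\epsilon$-local differential privacy before any shuffling.

```python
import math
import random

def k_randomized_response(x, k, epsilon):
    """k-ary randomized response: report the true value x in {0,...,k-1}
    with probability e^eps / (e^eps + k - 1), otherwise report one of the
    other k-1 values uniformly at random."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return x
    # sample uniformly from the k-1 values other than x
    other = random.randrange(k - 1)
    return other if other < x else other + 1
```

In the shuffle model, many such locally randomized reports are anonymized by a trusted shuffler, which is what lets the aggregate guarantee amplify beyond the per-user $\epsilon$.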
arXiv Detail & Related papers (2022-05-18T10:44:28Z) - Unsupervised Deep Learning Meets Chan-Vese Model [77.24463525356566]
We propose an unsupervised image segmentation approach that integrates the Chan-Vese (CV) model with deep neural networks.
Our basic idea is to apply a deep neural network that maps the image into a latent space to alleviate the violation of the piecewise constant assumption in image space.
arXiv Detail & Related papers (2022-04-14T13:23:57Z) - Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - The Limits of Pan Privacy and Shuffle Privacy for Learning and
Estimation [3.2942333712377083]
We show that for a variety of high-dimensional learning and estimation problems, the shuffle model and the pan-private model incur an exponential price in sample complexity relative to the central model.
Our work gives the first non-trivial lower bounds for these problems for both the pan-private model and the general multi-message shuffle model.
arXiv Detail & Related papers (2020-09-17T01:15:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.