BUDS: Balancing Utility and Differential Privacy by Shuffling
- URL: http://arxiv.org/abs/2006.04125v1
- Date: Sun, 7 Jun 2020 11:39:13 GMT
- Title: BUDS: Balancing Utility and Differential Privacy by Shuffling
- Authors: Poushali Sengupta, Sudipta Paul, Subhankar Mishra
- Abstract summary: Balancing utility and differential privacy by shuffling, or BUDS, is an approach to crowd-sourced, statistical databases.
A new algorithm is proposed using one-hot encoding and iterative shuffling with loss-estimation and risk-minimization techniques.
In empirical tests of balanced utility and privacy, BUDS produces $\epsilon = 0.02$, which is a very promising result.
- Score: 3.618133010429131
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Balancing utility and differential privacy by shuffling, or \textit{BUDS},
is an approach to crowd-sourced, statistical databases that achieves a strong
balance between privacy and utility using differential privacy theory. Here, a
novel algorithm is proposed using one-hot encoding and iterative shuffling with
loss-estimation and risk-minimization techniques to balance both utility and
privacy. In this work, after one-hot encoded data is collected from different
sources and clients, a novel attribute-shuffling step using iterative shuffling
(based on the query asked by the analyst), together with loss estimation via an
update function and risk minimization, produces a utility- and privacy-balanced
differentially private report. In empirical tests of balanced utility and
privacy, BUDS produces $\epsilon = 0.02$, which is a very promising result. Our
algorithm maintains a privacy bound of $\epsilon = \ln[t/((n_1 - 1)^S)]$ and a
loss bound of $c' \big|e^{\ln[t/((n_1 - 1)^S)]} - 1\big|$.
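The abstract describes the pipeline only in prose, so the following minimal sketch illustrates the two concrete steps it names, one-hot encoding and attribute-wise iterative shuffling. This is our illustration with invented data and names, not the authors' implementation; the loss-estimation and risk-minimization stages are omitted.

```python
import numpy as np

# Sketch only (not the authors' code): one-hot encode categorical records,
# then shuffle each attribute block with an independent row permutation so
# cross-attribute linkage is destroyed while per-attribute counts survive.

def one_hot(values, categories):
    """Encode each categorical value as a one-hot row vector."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        out[row, index[v]] = 1
    return out

def iterative_shuffle(attribute_blocks, rng):
    """Apply an independent random permutation to each attribute block."""
    return [block[rng.permutation(len(block))] for block in attribute_blocks]

rng = np.random.default_rng(0)
age = one_hot(["18-25", "26-40", "18-25", "41+"], ["18-25", "26-40", "41+"])
sex = one_hot(["F", "M", "F", "M"], ["F", "M"])
shuffled = iterative_shuffle([age, sex], rng)

# Marginal counts, which answer the analyst's aggregate queries, are intact:
assert (shuffled[0].sum(axis=0) == age.sum(axis=0)).all()
```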
Related papers
- On Computing Pairwise Statistics with Local Differential Privacy [55.81991984375959]
We study the problem of computing pairwise statistics, i.e., ones of the form $\binom{n}{2}^{-1} \sum_{i \ne j} f(x_i, x_j)$, where $x_i$ denotes the input to the $i$th user, with differential privacy (DP) in the local model.
This formulation captures important metrics such as Kendall's $\tau$ coefficient, Area Under Curve, Gini's mean difference, Gini's entropy, etc.
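As a concrete (non-private) instance of such a pairwise statistic, the sketch below computes Gini's mean difference; the paper's local-DP protocol for estimating it is not reproduced here.

```python
import itertools
from math import comb

# Gini's mean difference: the average of |x_i - x_j| over all pairs, one of
# the pairwise statistics the paper targets. Computed in the clear here; the
# paper's contribution is estimating such statistics under local DP.

def gini_mean_difference(x):
    n = len(x)
    pair_sum = sum(abs(a - b) for a, b in itertools.combinations(x, 2))
    return pair_sum / comb(n, 2)

print(gini_mean_difference([1.0, 3.0, 7.0]))  # (2 + 6 + 4) / 3 = 4.0
```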
arXiv Detail & Related papers (2024-06-24T04:06:09Z) - A Generalized Shuffle Framework for Privacy Amplification: Strengthening Privacy Guarantees and Enhancing Utility [4.7712438974100255]
We show how to shuffle in the $(\epsilon_i,\delta_i)$-PLDP setting with personalized privacy parameters.
We prove that the shuffled $(\epsilon_i,\delta_i)$-PLDP process approximately preserves $\mu$-Gaussian Differential Privacy with $\mu = \sqrt{\frac{2}{\sum_{i=1}^n \frac{1-\delta_i}{1+e^{\epsilon_i}} - \max_i \frac{1-\delta_i}{1+e^{\epsilon_i}}}}$.
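The $\mu$ above is a closed-form function of the per-user parameters, so it can be evaluated directly. A minimal sketch, assuming the reconstruction of the truncated abstract formula above is correct:

```python
import math

# mu-GDP parameter after shuffling personalized-LDP reports, per the formula
# stated in the abstract (reconstructed from a truncated rendering; treat as
# indicative). Requires at least two users so the denominator is positive.

def shuffled_gdp_mu(eps, delta):
    terms = [(1 - d) / (1 + math.exp(e)) for e, d in zip(eps, delta)]
    return math.sqrt(2 / (sum(terms) - max(terms)))

# Example: 1000 users with heterogeneous local guarantees.
eps = [0.5 + 0.001 * i for i in range(1000)]
delta = [1e-6] * 1000
print(shuffled_gdp_mu(eps, delta))  # smaller mu = stronger central guarantee
```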
arXiv Detail & Related papers (2023-12-22T02:31:46Z) - Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
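To give a feel for that scale (the paper's constants are suppressed, so this is only indicative):

```python
import math

# Indicative scale of the O(log M) threshold: epsilon may grow roughly like
# the log of the domain size before reconstruction becomes easy.
for M in (2, 256, 10**6, 2**32):
    print(f"domain size M = {M:>10}  log M = {math.log(M):6.2f}")
```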
arXiv Detail & Related papers (2022-10-24T23:50:12Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it faces challenges when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Infinitely Divisible Noise in the Low Privacy Regime [9.39772079241093]
Federated learning, in which training data is distributed among users and never shared, has emerged as a popular approach to privacy-preserving machine learning.
We present the first infinitely divisible noise distribution for real-valued data that achieves $\varepsilon$-differential privacy.
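The paper's new distribution is not reproduced here; as a classical illustration of what infinite divisibility buys in the federated setting, Laplace noise decomposes into Gamma differences, so each of $n$ users can add a small share whose sum is exactly Laplace:

```python
import numpy as np

# Classical fact (not the paper's construction): if each of n users adds
# Gamma(1/n, b) - Gamma(1/n, b) noise, the shares sum to Exp(b) - Exp(b),
# i.e., exactly Laplace(0, b), so the aggregator sees standard DP noise
# without any single party knowing the total.
rng = np.random.default_rng(1)
n, b, trials = 100, 1.0, 5000
shares = rng.gamma(1.0 / n, b, (trials, n)) - rng.gamma(1.0 / n, b, (trials, n))
totals = shares.sum(axis=1)  # one Laplace(0, b) sample per trial
print(totals.var())  # should approach Var[Laplace(0, 1)] = 2.0
```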
arXiv Detail & Related papers (2021-10-13T08:16:43Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - Output Perturbation for Differentially Private Convex Optimization with
Improved Population Loss Bounds, Runtimes and Applications to Private
Adversarial Training [12.386462516398469]
Finding efficient, easily implementable differentially private (DP) algorithms that offer strong excess risk bounds is an important problem in modern machine learning.
We provide the tightest known $(\epsilon, 0)$-DP population loss bounds and fastest runtimes under the presence of smoothness and strong convexity.
We apply our theory to two learning frameworks: tilted ERM and adversarial learning frameworks.
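For context, output perturbation itself is the simple recipe the paper refines: solve the optimization non-privately, then add noise calibrated to the minimizer's sensitivity. A hedged sketch under classical assumptions (an $L$-Lipschitz loss and $\lambda$-strongly convex objective, which squared loss satisfies only on bounded domains; all parameters illustrative):

```python
import numpy as np

def output_perturbation_ridge(X, y, lam, L, eps, rng):
    """Non-private ridge solution plus high-dimensional Laplace noise."""
    n, d = X.shape
    # Minimizer of (1/2n)||Xw - y||^2 + (lam/2)||w||^2.
    w = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    # Classical L2-sensitivity bound for strongly convex ERM.
    sensitivity = 2 * L / (n * lam)
    # Sample noise with density proportional to exp(-eps * ||z|| / sensitivity):
    # Gamma-distributed norm times a uniformly random direction.
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    noise = rng.gamma(d, sensitivity / eps) * direction
    return w + noise

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
y = X @ np.ones(5) + rng.normal(size=500)
print(output_perturbation_ridge(X, y, lam=0.1, L=1.0, eps=1.0, rng=rng))
```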
arXiv Detail & Related papers (2021-02-09T08:47:06Z) - Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy
Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
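A minimal sketch of the shuffle model being analyzed (illustrative names; the paper's tight amplification bound is not computed here): each user randomizes locally, a trusted shuffler discards the order, and the analyst debiases the aggregate.

```python
import numpy as np

# Local randomized response with parameter eps0, followed by a shuffler that
# releases only the unordered reports. Shuffling amplifies the local eps0
# into a much smaller central epsilon (the paper's analysis, omitted here).

def randomized_response(bit, eps0, rng):
    p_truth = np.exp(eps0) / (np.exp(eps0) + 1)
    return bit if rng.random() < p_truth else 1 - bit

def shuffle_mechanism(bits, eps0, rng):
    noisy = np.array([randomized_response(b, eps0, rng) for b in bits])
    rng.shuffle(noisy)  # the shuffler: only the multiset is observable
    return noisy

rng = np.random.default_rng(3)
eps0 = 1.0
reports = shuffle_mechanism([0, 1, 1, 0, 1] * 200, eps0, rng)

# Debias the mean: E[report] = (2p - 1) * bit + (1 - p), p = e^eps0/(1+e^eps0).
p = np.exp(eps0) / (np.exp(eps0) + 1)
print((reports.mean() - (1 - p)) / (2 * p - 1))  # ~ 0.6, the true mean
```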
arXiv Detail & Related papers (2020-12-23T17:07:26Z)