Related papers: On Computing Pairwise Statistics with Local Differential Privacy

On Computing Pairwise Statistics with Local Differential Privacy

URL: http://arxiv.org/abs/2406.16305v1
Date: Mon, 24 Jun 2024 04:06:09 GMT
Title: On Computing Pairwise Statistics with Local Differential Privacy
Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Adam Sealfon,
Abstract summary: We study the problem of computing pairwise statistics, i.e., ones of the form $binomn2-1 sum_i ne j f(x_i, x_j)$, where $x_i$ denotes the input to the $i$th user, with differential privacy (DP) in the local model. This formulation captures important metrics such as Kendall's $tau$ coefficient, Area Under Curve, Gini's mean difference, Gini's entropy, etc.
Score: 55.81991984375959
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of computing pairwise statistics, i.e., ones of the form $\binom{n}{2}^{-1} \sum_{i \ne j} f(x_i, x_j)$, where $x_i$ denotes the input to the $i$th user, with differential privacy (DP) in the local model. This formulation captures important metrics such as Kendall's $\tau$ coefficient, Area Under Curve, Gini's mean difference, Gini's entropy, etc. We give several novel and generic algorithms for the problem, leveraging techniques from DP algorithms for linear queries.

Related papers

Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization [65.8915778873691]
conditional distributions is a central problem in machine learning. We propose a new learning paradigm that integrates both paired and unpaired data. Our approach also connects intriguingly with inverse entropic optimal transport (OT)
arXiv Detail & Related papers (2024-10-03T16:12:59Z)
Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples. We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
arXiv Detail & Related papers (2024-05-30T18:20:35Z)
Smooth Sensitivity for Geo-Privacy [17.835910182900985]
Local model of differential privacy (LDP) is the predominant model under which the problem has been studied. Geo-Privacy (GP) stipulates that the level of distinguishability be proportional to $mathrmdist(x_i, x_i')$. We generalize the smooth sensitivity framework from Differential Privacy to Geo-Privacy, which allows us to add noise tailored to the hardness of the given instance.
arXiv Detail & Related papers (2024-05-10T08:32:07Z)
Efficiently Computing Similarities to Private Datasets [19.99000806126529]
Many methods in differentially private model training rely on computing the similarity between a query point (such as public or synthetic data) and private data. We study the following fundamental algorithmic problem: Given a similarity function $f$ and a large high-dimensional private dataset $X subset mathbbRd$, output a differentially private (DP) data structure which approximates $sum_x in X f(x,y)$ for any query $y$.
arXiv Detail & Related papers (2024-03-13T19:19:19Z)
Near-Optimal Differentially Private k-Core Decomposition [2.859324824091086]
We show that an $eps$-edge differentially private algorithm for $k$-core decomposition outputs the core numbers with no multiplicative error and $O(textlog(n)/eps)$ additive error. This improves upon previous work by a factor of 2 in the multiplicative error, while giving near-optimal additive error.
arXiv Detail & Related papers (2023-12-12T20:09:07Z)
Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions [49.69852011882769]
We show the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We present an algorithm that tackles newthis problem in its most general distribution-independent setting. This is the first newalgorithmic result for GLM regression newwith oblivious noise which can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z)
Differentially Private Clustering in Data Streams [65.78882209673885]
We present a differentially private streaming clustering framework which only requires an offline DP coreset or clustering algorithm as a blackbox. Our framework is also differentially private under the continual release setting, i.e., the union of outputs of our algorithms at every timestamp is always differentially private.
arXiv Detail & Related papers (2023-07-14T16:11:22Z)
Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee. Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z)
DP-PCA: Statistically Optimal and Differentially Private PCA [44.22319983246645]
DP-PCA is a single-pass algorithm that overcomes both limitations. For sub-Gaussian data, we provide nearly optimal statistical error rates even for $n=tilde O(d)$.
arXiv Detail & Related papers (2022-05-27T02:02:17Z)
Locally differentially private estimation of nonlinear functionals of discrete distributions [9.028773906859541]
We study the problem of estimating non-linear functionals of discrete distributions in the context of local differential privacy. Only $alpha$-locally differentially private (LDP) samples are publicly available, where the term 'local' means that each $z_i$ is produced using one individual $x_i$. We describe the behavior of the quadratic risk for estimating the power sum functional $F_gamma = sum_k=1K p_kgamma$, $gamma >0$ as a function of $K,, n
arXiv Detail & Related papers (2021-07-08T16:11:10Z)
Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $epsilon$-cover for $S$ of cardinality $M = (k/epsilon)O_d(k1/d)$. Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution. We show that the constraint of local differential privacy incurs an exponential increase in cost. Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.