Differentially Private Clustering: Tight Approximation Ratios
- URL: http://arxiv.org/abs/2008.08007v1
- Date: Tue, 18 Aug 2020 16:22:06 GMT
- Title: Differentially Private Clustering: Tight Approximation Ratios
- Authors: Badih Ghazi, Ravi Kumar, Pasin Manurangsi
- Abstract summary: We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
- Score: 57.89473217052714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the task of differentially private clustering. For several basic
clustering problems, including Euclidean DensestBall, 1-Cluster, k-means, and
k-median, we give efficient differentially private algorithms that achieve
essentially the same approximation ratios as those that can be obtained by any
non-private algorithm, while incurring only small additive errors. This
improves upon existing efficient algorithms that only achieve some large
constant approximation factors.
Our results also imply an improved algorithm for the Sample and Aggregate
privacy framework. Furthermore, we show that one of the tools used in our
1-Cluster algorithm can be employed to get a faster quantum algorithm for
ClosestPair in a moderate number of dimensions.
Related papers
- From Large to Small Datasets: Size Generalization for Clustering
Algorithm Selection [12.993073967843292]
We study a problem in a semi-supervised setting, with an unknown ground-truth clustering.
We introduce a notion of size generalization for clustering algorithm accuracy.
We use a subsample of as little as 5% of the data to identify which algorithm is best on the full dataset.
arXiv Detail & Related papers (2024-02-22T06:53:35Z) - Fast Optimal Locally Private Mean Estimation via Random Projections [58.603579803010796]
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball.
We propose a new algorithmic framework, ProjUnit, for private mean estimation.
Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.
arXiv Detail & Related papers (2023-06-07T14:07:35Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct distance matrix between data points by Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Differentially-Private Hierarchical Clustering with Provable
Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: that any $epsilon$-DP algorithm must exhibit $O(|V|2/ epsilon)$-additive error for an input dataset.
We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z) - Scalable Differentially Private Clustering via Hierarchically Separated
Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We provide implementable differentially private clustering algorithms that provide utility when the data is "easy"
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z) - Differentially Private Correlation Clustering [23.93805663525436]
Correlation clustering is a widely used technique in unsupervised machine learning.
Motivated by applications where individual privacy is a concern, we initiate the study of differentially private correlation clustering.
arXiv Detail & Related papers (2021-02-17T17:27:48Z) - Clustering of Big Data with Mixed Features [3.3504365823045044]
We develop a new clustering algorithm for large data of mixed type.
The algorithm is capable of detecting outliers and clusters of relatively lower density values.
We present experimental results to verify that our algorithm works well in practice.
arXiv Detail & Related papers (2020-11-11T19:54:38Z) - Differentially Private Clustering via Maximum Coverage [7.059472280274009]
We study the problem of clustering in metric spaces while preserving the privacy of individual data.
We present differential algorithms with constant multiplicative error and lower additive error.
arXiv Detail & Related papers (2020-08-27T22:11:18Z) - Differentially Private k-Means Clustering with Guaranteed Convergence [5.335316436366718]
Iterative clustering algorithms help us to learn the insights behind the data.
It may allow adversaries to infer the privacy of individuals with some background knowledge.
To protect individual privacy against such an inference attack, preserving differential privacy (DP) for the iterative clustering algorithms has been extensively studied.
arXiv Detail & Related papers (2020-02-03T22:53:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.