Differentially Private k-Means Clustering with Guaranteed Convergence
- URL: http://arxiv.org/abs/2002.01043v1
- Date: Mon, 3 Feb 2020 22:53:47 GMT
- Title: Differentially Private k-Means Clustering with Guaranteed Convergence
- Authors: Zhigang Lu, Hong Shen
- Abstract summary: Iterative clustering algorithms help us learn insights hidden in the data.
Unfortunately, they may also allow adversaries with background knowledge to infer private information about individuals.
To protect individuals against such inference attacks, preserving differential privacy (DP) in iterative clustering algorithms has been extensively studied.
- Score: 5.335316436366718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Iterative clustering algorithms help us learn insights hidden in the
data. Unfortunately, this may allow adversaries with some background knowledge
to infer private information about individuals. In the worst case, the
adversaries know the centroids of an arbitrary iteration and the information of
n-1 out of n items. To protect individual privacy against such an inference
attack, preserving differential privacy (DP) for iterative clustering
algorithms has been extensively studied in the interactive setting. However,
existing interactive differentially private clustering algorithms suffer from a
non-convergence problem, i.e., they may not terminate without a predefined
number of iterations. This problem severely impacts both the clustering quality
and the efficiency of a differentially private algorithm. To resolve this
problem, in this paper we propose a novel differentially private clustering
framework in the interactive setting that ensures convergence by injecting DP
noise into a selected area, thereby controlling the orientation of the
centroids' movement across iterations. We prove that, in expectation, an
algorithm under our framework converges in at most twice the number of
iterations of Lloyd's algorithm. Experimental evaluations on real-world
datasets show that our algorithm outperforms state-of-the-art interactive
differentially private clustering algorithms, offering guaranteed convergence
and better clustering quality under the same DP requirement.
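To make the interactive setting concrete, the following is a minimal sketch of a standard interactive DP Lloyd-style iteration (Laplace noise on per-cluster sums and counts each round). It illustrates the baseline scheme the abstract refers to, not the paper's selected-area noise mechanism; the budget split, the `[0, 1]^d` data assumption, and the function name `dp_kmeans` are all illustrative choices.

```python
import numpy as np

def dp_kmeans(X, k, epsilon, iters=10, seed=0):
    """Sketch of an interactive differentially private Lloyd's algorithm.

    Each iteration spends epsilon / iters of the privacy budget, split
    evenly between the noisy per-cluster sums and the noisy counts
    (a generic DPLloyd-style scheme, not the selected-area mechanism of
    the paper above). Assumes each row of X lies in [0, 1]^d, so the L1
    sensitivity of a cluster sum is at most d and of a count is 1.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    eps_iter = epsilon / iters  # per-iteration budget
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        # Assignment step: nearest centroid in Euclidean distance.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            # Laplace noise calibrated to L1 sensitivity; half the
            # per-iteration budget goes to sums, half to counts.
            noisy_sum = members.sum(axis=0) + rng.laplace(0, 2 * d / eps_iter, d)
            noisy_cnt = len(members) + rng.laplace(0, 2 / eps_iter)
            if noisy_cnt > 1:
                centroids[j] = np.clip(noisy_sum / noisy_cnt, 0.0, 1.0)
    return centroids, labels
```

Because each update is computed from noisy statistics, the centroids can drift rather than settle, which is exactly the non-convergence issue the paper addresses.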
Related papers
- CURATE: Scaling-up Differentially Private Causal Graph Discovery [8.471466670802817]
Differential Privacy (DP) has been adopted to ensure user privacy in Causal Graph Discovery (CGD)
We present CURATE, a DP-CGD framework with adaptive privacy budgeting.
We show that CURATE achieves higher utility compared to existing DP-CGD algorithms with less privacy-leakage.
arXiv Detail & Related papers (2024-09-27T18:00:38Z) - Privacy-preserving Continual Federated Clustering via Adaptive Resonance
Theory [11.190614418770558]
In the clustering domain, various algorithms with a federated learning framework (i.e., federated clustering) have been actively studied.
This paper proposes a privacy-preserving continual federated clustering algorithm.
Experimental results with synthetic and real-world datasets show that the proposed algorithm has superior clustering performance.
arXiv Detail & Related papers (2023-09-07T05:45:47Z) - k-Means SubClustering: A Differentially Private Algorithm with Improved
Clustering Quality [0.0]
Many differentially private iterative algorithms have been proposed in interactive settings to protect an individual's privacy from inference attacks.
In this work, we extend the previous work on 'Differentially Private k-Means Clustering With Convergence Guarantee' by taking it as our baseline.
The novelty of our approach is to sub-cluster the clusters and then select the centroid which has a higher probability of moving in the direction of the future centroid.
arXiv Detail & Related papers (2023-01-07T17:07:12Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm using a differential privacy technique, referred to as DP-Fed, in which partial participation of clients is also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since sensitive data are involved, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
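The alternating two-step idea in that blurb can be sketched for the familiar k-means cost (sum of squared distances to the assigned center), where the center update is a gradient step rather than an exact mean. This is a generic illustration of gradient-based distance clustering, not the cited paper's exact method; the learning rate and function name are assumptions.

```python
import numpy as np

def gradient_clustering(X, k, lr=0.1, iters=50, seed=0):
    """Alternating assignment / gradient-based center updates for the
    k-means cost sum_i ||x_i - c_{a(i)}||^2 (a generic sketch)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Step 1: assign each point to its nearest center.
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        # Step 2: move each center along the negative cost gradient.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                # d/dc_j of the cost is 2 * (|members| * c_j - sum(members)).
                grad = 2 * (len(members) * centers[j] - members.sum(axis=0))
                centers[j] -= lr * grad / len(members)  # normalized step
    return centers, labels
```

With this normalization the update is `c_j <- c_j - 2*lr*(c_j - mean)`, so for `0 < lr < 0.5` each center contracts toward its cluster mean, recovering Lloyd's fixed point in the limit.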
arXiv Detail & Related papers (2022-02-01T19:31:15Z) - A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z) - Differentially Private Clustering via Maximum Coverage [7.059472280274009]
We study the problem of clustering in metric spaces while preserving the privacy of individual data.
We present differentially private algorithms with constant multiplicative error and lower additive error.
arXiv Detail & Related papers (2020-08-27T22:11:18Z) - Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.