k-Means SubClustering: A Differentially Private Algorithm with Improved
Clustering Quality
- URL: http://arxiv.org/abs/2301.02896v1
- Date: Sat, 7 Jan 2023 17:07:12 GMT
- Title: k-Means SubClustering: A Differentially Private Algorithm with Improved
Clustering Quality
- Authors: Devvrat Joshi, Janvi Thakkar
- Abstract summary: Many differentially private iterative algorithms have been proposed in interactive settings to protect an individual's privacy from inference attacks.
In this work, we extend the previous work on 'Differentially Private k-Means Clustering With Convergence Guarantee' by taking it as our baseline.
The novelty of our approach is to sub-cluster the clusters and then select the centroid which has a higher probability of moving in the direction of the future centroid.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In today's data-driven world, the sensitivity of information has been a
significant concern. With this data and additional information on the person's
background, one can easily infer an individual's private data. Many
differentially private iterative algorithms have been proposed in interactive
settings to protect an individual's privacy from these inference attacks. The
existing approaches compute differentially private (DP) centroids by running
the iterative Lloyd's algorithm and perturbing the centroids with
various DP mechanisms. These DP mechanisms do not guarantee convergence of
differentially private iterative algorithms and degrade the quality of the
clusters. Thus, in this work, we further extend the previous work on
'Differentially Private k-Means Clustering With Convergence Guarantee' by
taking it as our baseline. The novelty of our approach is to sub-cluster the
clusters and then select the centroid which has a higher probability of moving
in the direction of the future centroid. At every Lloyd's step, the centroids
are injected with the noise using the exponential DP mechanism. The results of
the experiments indicate that our approach outperforms the current
state-of-the-art method, i.e., the baseline algorithm, in terms of clustering
quality while maintaining the same differential privacy requirements. The
clustering quality improved significantly, by 4.13 and 2.83 times over the
baseline, for the Wine and Breast_Cancer datasets, respectively.
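
The listing itself contains no code; the following is a minimal, illustrative sketch of one noisy Lloyd's step in the spirit of the abstract: each cluster is sub-clustered, sub-cluster centroids are scored by how far they move toward the cluster's future centroid, and one is selected with the exponential mechanism. The function names, scoring rule, sensitivity, and per-step budget are assumptions for illustration, not the authors' implementation.

    import numpy as np

    def sub_cluster_centroids(points, n_sub, rng):
        # Split one cluster's points into n_sub sub-clusters with a few plain
        # k-means refinement steps and return the sub-cluster centroids.
        if len(points) <= n_sub:
            return points.astype(float)
        centers = points[rng.choice(len(points), n_sub, replace=False)].astype(float)
        for _ in range(5):
            labels = np.argmin(((points[:, None] - centers) ** 2).sum(-1), axis=1)
            for j in range(n_sub):
                members = points[labels == j]
                if len(members):
                    centers[j] = members.mean(axis=0)
        return centers

    def exponential_mechanism(candidates, scores, eps, sensitivity, rng):
        # Sample one candidate with probability proportional to
        # exp(eps * score / (2 * sensitivity)).
        logits = eps * np.asarray(scores, dtype=float) / (2.0 * sensitivity)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return candidates[rng.choice(len(candidates), p=probs)]

    def dp_lloyd_step(data, centroids, eps_step, n_sub=4, sensitivity=1.0, rng=None):
        # One noisy Lloyd's step: assign points, sub-cluster each cluster, score
        # each sub-centroid by its progress toward the cluster's future centroid,
        # and pick the next centroid with the exponential mechanism.
        # NOTE: the score function, sensitivity, and per-step budget eps_step are
        # illustrative placeholders, not the paper's exact privacy accounting.
        if rng is None:
            rng = np.random.default_rng(0)
        labels = np.argmin(((data[:, None] - centroids) ** 2).sum(-1), axis=1)
        new_centroids = centroids.astype(float)
        for k in range(len(centroids)):
            members = data[labels == k]
            if len(members) == 0:
                continue
            future = members.mean(axis=0)        # non-private Lloyd update target
            candidates = sub_cluster_centroids(members, n_sub, rng)
            scores = [np.linalg.norm(centroids[k] - future) - np.linalg.norm(c - future)
                      for c in candidates]
            new_centroids[k] = exponential_mechanism(candidates, scores, eps_step,
                                                     sensitivity, rng)
        return new_centroids

In the full algorithm such a step would be iterated with a per-iteration privacy budget, following the convergence-guaranteed baseline that the paper extends.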
Related papers
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.
Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.
We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
- Differentially Private Random Block Coordinate Descent [51.62669821275571]
We propose a differentially private random coordinate descent method that selects multiple coordinates with varying probabilities in each iteration using sketch matrices.
Our algorithm generalizes both DP-CD and the classical DP-SGD (Differentially Private Stochastic Gradient Descent), while preserving the same utility guarantees.
arXiv Detail & Related papers (2024-12-22T15:06:56Z)
- Differentially Private Clustered Federated Learning [4.768272342753616]
Federated learning (FL) often incorporates differential privacy (DP) to provide rigorous data privacy guarantees.
Previous works attempted to address high structured data heterogeneity in vanilla FL settings by clustering clients (a.k.a. clustered FL).
We propose an algorithm for differentially private clustered FL, which is robust to the DP noise in the system and identifies the underlying clients' clusters correctly.
arXiv Detail & Related papers (2024-05-29T17:03:31Z)
- Fuzzy K-Means Clustering without Cluster Centroids [21.256564324236333]
Fuzzy K-Means clustering is a critical technique in unsupervised data analysis.
This paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
arXiv Detail & Related papers (2024-04-07T12:25:03Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm with differential privacy, referred to as DP-Fed, in which partial client participation and multiple local update steps are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since the data involved are often sensitive, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPP, standard uniform random selection fails both to ensure diversity and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Differentially Private Clustering via Maximum Coverage [7.059472280274009]
We study the problem of clustering in metric spaces while preserving the privacy of individual data.
We present differentially private algorithms with constant multiplicative error and lower additive error.
arXiv Detail & Related papers (2020-08-27T22:11:18Z)
- Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)
- Differentially Private k-Means Clustering with Guaranteed Convergence [5.335316436366718]
Iterative clustering algorithms help us learn the insights behind the data.
However, they may allow adversaries with some background knowledge to infer individuals' private information.
To protect individual privacy against such an inference attack, preserving differential privacy (DP) for the iterative clustering algorithms has been extensively studied.
arXiv Detail & Related papers (2020-02-03T22:53:47Z)