FastLloyd: Federated, Accurate, Secure, and Tunable $k$-Means Clustering with Differential Privacy
- URL: http://arxiv.org/abs/2405.02437v1
- Date: Fri, 3 May 2024 19:04:37 GMT
- Title: FastLloyd: Federated, Accurate, Secure, and Tunable $k$-Means Clustering with Differential Privacy
- Authors: Abdulrahman Diaa, Thomas Humphries, Florian Kerschbaum,
- Abstract summary: We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting.
Existing approaches using secure computation suffer from substantial overheads and do not offer output privacy.
By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves four orders of magnitude speed-up.
- Score: 26.927356987142407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation, suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms assume a trusted central curator and do not extend to federated settings. Naively combining the secure and DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves four orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but we improve the utility further by taking advantage of constrained clustering techniques.
Related papers
- Purifying Approximate Differential Privacy with Randomized Post-processing [17.069115079029206]
We propose a framework to convert $(varepsilon, delta)$-approximate Differential Privacy (DP) mechanisms into $(varepsilon, 0)$-pure DP mechanisms.
arXiv Detail & Related papers (2025-03-27T01:10:40Z) - Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm [19.673557166734977]
Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security.
This paper investigates federated PCA and estimation for spiked covariance matrices under distributed differential privacy constraints.
We establish minimax rates of convergence, with a key finding that the central server's optimal rate is the harmonic mean of the local clients' minimax rates.
arXiv Detail & Related papers (2024-11-23T21:57:50Z) - Secure Stateful Aggregation: A Practical Protocol with Applications in Differentially-Private Federated Learning [36.42916779389165]
DP-FTRL based approaches have already seen widespread deployment in industry.
We introduce secure stateful aggregation: a simple append-only data structure that allows for the private storage of aggregate values.
We observe that secure stateful aggregation suffices for realizing DP-FTRL-based private federated learning.
arXiv Detail & Related papers (2024-10-15T07:45:18Z) - Federated Instruction Tuning of LLMs with Domain Coverage Augmentation [87.49293964617128]
Federated Domain-specific Instruction Tuning (FedDIT) utilizes limited cross-client private data together with various strategies of instruction augmentation.
We propose FedDCA, which optimize domain coverage through greedy client center selection and retrieval-based augmentation.
For client-side computational efficiency and system scalability, FedDCA$*$, the variant of FedDCA, utilizes heterogeneous encoders with server-side feature alignment.
arXiv Detail & Related papers (2024-09-30T09:34:31Z) - Private and Federated Stochastic Convex Optimization: Efficient Strategies for Centralized Systems [8.419845742978985]
This paper addresses the challenge of preserving privacy in Federated Learning (FL) within centralized systems.
We devise methods that ensure Differential Privacy (DP) while maintaining optimal convergence rates for homogeneous and heterogeneous data distributions.
arXiv Detail & Related papers (2024-07-17T08:19:58Z) - Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation [8.660393575612169]
CorDP-DME is a novel DP-DME that spans the gap between local differential privacy (LDP) and distributed DP (SecAgg)
We provide an information-theoretic analysis of CorDP-DME, and derive theoretical guarantees for utility under any given privacy parameters.
arXiv Detail & Related papers (2024-07-03T17:22:33Z) - Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy [47.997934291881414]
Existing mean estimation schemes are usually optimized for $L_infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L$ geometry.
We introduce a novel privacy accounting method for the sparsified Gaussian mechanism that incorporates the randomness inherent in sparsification into the DP.
Unlike previous approaches, our accounting algorithm directly operates in $L$ geometry, yielding MSEs that fast converge to those of the Gaussian mechanism.
arXiv Detail & Related papers (2024-05-02T03:48:47Z) - Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z) - DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum [11.68347496182345]
Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively.
Existing FL protocols are vulnerable to attacks that aim to compromise data privacy and/or model robustness.
We focus on simultaneously achieving differential privacy (DP) and Byzantine robustness for cross-silo FL.
arXiv Detail & Related papers (2023-06-22T00:11:53Z) - Theoretically Principled Federated Learning for Balancing Privacy and
Utility [61.03993520243198]
We propose a general learning framework for the protection mechanisms that protects privacy via distorting model parameters.
It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z) - Towards the Flatter Landscape and Better Generalization in Federated
Learning under Client-level Differential Privacy [67.33715954653098]
We propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP.
Specifically, DP-FedSAM integrates Sharpness Aware of Minimization (SAM) to generate local flatness models with stability and weight robustness.
To further reduce the magnitude random noise while achieving better performance, we propose DP-FedSAM-$top_k$ by adopting the local update sparsification technique.
arXiv Detail & Related papers (2023-05-01T15:19:09Z) - Make Landscape Flatter in Differentially Private Federated Learning [69.78485792860333]
We propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP.
Specifically, DP-FedSAM integrates local flatness models with better stability and weight robustness, which results in the small norm of local updates and robustness to DP noise.
Our algorithm achieves state-of-the-art (SOTA) performance compared with existing SOTA baselines in DPFL.
arXiv Detail & Related papers (2023-03-20T16:27:36Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since involved data, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - Differentially Private Federated Bayesian Optimization with Distributed
Exploration [48.9049546219643]
We introduce differential privacy (DP) into the training of deep neural networks through a general framework for adding DP to iterative algorithms.
We show that DP-FTS-DE achieves high utility (competitive performance) with a strong privacy guarantee.
We also use real-world experiments to show that DP-FTS-DE induces a trade-off between privacy and utility.
arXiv Detail & Related papers (2021-10-27T04:11:06Z) - Federated Learning with Sparsification-Amplified Privacy and Adaptive
Optimization [27.243322019117144]
Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other.
We propose a new FL framework with sparsification-amplified privacy.
Our approach integrates random sparsification with gradient perturbation on each agent to amplify privacy guarantee.
arXiv Detail & Related papers (2020-08-01T20:22:57Z) - User-Level Privacy-Preserving Federated Learning: Analysis and
Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.