Differentially Private Federated Clustering over Non-IID Data
- URL: http://arxiv.org/abs/2301.00955v3
- Date: Mon, 30 Oct 2023 14:59:23 GMT
- Title: Differentially Private Federated Clustering over Non-IID Data
- Authors: Yiwei Li, Shuai Wang, Chong-Yung Chi, Tony Q. S. Quek
- Abstract summary: The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm using the differential privacy (DP) technique, referred to as DP-FedC, in which partial client participation and multiple local model updating steps are also considered.
Various attributes of the proposed DP-FedC are obtained through theoretical analyses of privacy protection and convergence rate, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
- Score: 59.611244450530315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the federated clustering (FedC)
problem, which aims to accurately partition unlabeled data samples distributed
over massive clients into finite clusters under the orchestration of a
parameter server, while preserving data privacy. Although it is an NP-hard
optimization problem involving real variables denoting cluster centroids and
binary variables denoting the cluster membership of each data sample, we
judiciously reformulate the FedC problem into a non-convex optimization
problem with only one convex constraint, accordingly yielding a soft
clustering solution. Then a novel FedC algorithm using the differential
privacy (DP) technique, referred to as DP-FedC, is proposed, in which partial
client participation and multiple local model updating steps are also
considered. Furthermore, various attributes of the proposed DP-FedC are
obtained through theoretical analyses of privacy protection and convergence
rate, especially for the case of non-identically and independently distributed
(non-i.i.d.) data, which serve as guidelines for the design of the proposed
DP-FedC. Finally, experimental results on two real datasets demonstrate the
efficacy of the proposed DP-FedC, its superior performance over
state-of-the-art FedC algorithms, and its consistency with all the presented
analytical results.
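To make the pipeline above concrete, here is a minimal Python sketch of one DP-FedC-style round, assuming soft k-means local updates, update clipping, and Gaussian noise; the clipping bound `clip`, noise scale `sigma`, and sampling fraction `frac` are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch of a DP federated soft-clustering round (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)

def soft_assign(X, centroids, temp=1.0):
    """Soft cluster memberships via a softmax over negative squared distances."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # n x K
    logits = -d2 / temp
    logits -= logits.max(axis=1, keepdims=True)                  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def local_update(X, centroids, steps=3):
    """Multiple local model updating steps of soft k-means on one client's data."""
    c = centroids.copy()
    for _ in range(steps):
        w = soft_assign(X, c)
        c = (w.T @ X) / (w.sum(axis=0)[:, None] + 1e-12)
    return c

def dp_fedc_round(clients, centroids, frac=0.5, clip=1.0, sigma=0.5):
    """One round: sample a client subset, clip and noise each update, average."""
    m = max(1, int(frac * len(clients)))
    chosen = rng.choice(len(clients), size=m, replace=False)  # partial participation
    updates = []
    for i in chosen:
        delta = local_update(clients[i], centroids) - centroids
        delta *= min(1.0, clip / (np.linalg.norm(delta) + 1e-12))  # bound sensitivity
        delta += rng.normal(0.0, sigma * clip, delta.shape)        # Gaussian DP noise
        updates.append(delta)
    return centroids + np.mean(updates, axis=0)

# Toy run: 4 clients with non-i.i.d. 2-D data, K = 2 clusters.
clients = [rng.normal(loc, 0.3, size=(50, 2))
           for loc in ([0, 0], [0, 1], [3, 3], [3, 4])]
centroids = rng.normal(size=(2, 2))
for _ in range(20):
    centroids = dp_fedc_round(clients, centroids)
print(np.round(centroids, 2))
```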
Related papers
- Privacy-preserving Quantification of Non-IID Degree in Federated Learning [22.194684042923406]
Federated learning (FL) offers a privacy-preserving approach to machine learning for multiple collaborators without sharing raw data.
The existence of non-independent and non-identically distributed (non-IID) datasets across different clients presents a significant challenge to FL.
This paper proposes a quantitative definition of the non-IID degree in the federated environment by employing the cumulative distribution function.
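A hedged sketch of one way such a CDF-based non-IID degree could be computed, assuming it is the average Kolmogorov-Smirnov-style gap between each client's empirical label CDF and the global one; the paper's exact definition may differ.

```python
# Illustrative non-IID degree via label CDFs (assumed definition, not the paper's).
import numpy as np

def label_cdf(labels, num_classes):
    counts = np.bincount(labels, minlength=num_classes)
    return np.cumsum(counts / max(len(labels), 1))

def non_iid_degree(client_labels, num_classes):
    global_cdf = label_cdf(np.concatenate(client_labels), num_classes)
    gaps = [np.abs(label_cdf(y, num_classes) - global_cdf).max()
            for y in client_labels]
    return float(np.mean(gaps))  # 0 = i.i.d. clients; larger = more label skew

rng = np.random.default_rng(0)
iid = [rng.integers(0, 10, 500) for _ in range(5)]
skewed = [rng.choice([2 * i, 2 * i + 1], 500) for i in range(5)]
print(non_iid_degree(iid, 10), non_iid_degree(skewed, 10))
```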
arXiv Detail & Related papers (2024-06-14T03:08:53Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless
Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed based on an analysis of these properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing research area, where a model is trained over distributed clients under the orchestration of a parameter server (PS).
In this paper, we propose a novel privacy-preserving federated primal-dual algorithm with model sparsification for non-convex and non-smooth FL problems.
Its unique properties and theoretical analyses are also presented.
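As one concrete reading of the model-sparsification idea, the sketch below keeps only the top-k largest-magnitude entries of a local update before upload; `k` and the interface are illustrative assumptions, and the paper's primal-dual iterations and privacy mechanism are not reproduced.

```python
# Illustrative top-k model sparsification of a client update (assumed scheme).
import numpy as np

def sparsify_top_k(update, k):
    """Zero out all but the k largest-magnitude coordinates of `update`."""
    flat = update.ravel()
    if k >= flat.size:
        return update
    thresh = np.partition(np.abs(flat), -k)[-k]
    return update * (np.abs(update) >= thresh)  # ties may keep slightly more than k

rng = np.random.default_rng(0)
delta = rng.normal(size=(4, 5))
sparse = sparsify_top_k(delta, k=5)
print(np.count_nonzero(sparse))  # roughly k surviving entries
```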
arXiv Detail & Related papers (2023-10-30T14:15:47Z) - Federated Two Stage Decoupling With Adaptive Personalization Layers [5.69361786082969]
Federated learning has gained significant attention due to its ability to enable distributed learning while maintaining privacy constraints.
However, it inherently experiences significant learning degradation and slow convergence when client data are heterogeneous.
It is natural to employ the concept of clustering homogeneous clients into the same group, allowing only the model weights within each group to be aggregated.
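A minimal sketch of that grouping idea, assuming clients are clustered by plain k-means over their flattened model updates and aggregation is restricted to each group; the paper's two-stage decoupling and personalization layers are not modeled.

```python
# Illustrative group-wise aggregation over clustered client updates (assumed scheme).
import numpy as np

def group_clients(updates, num_groups, iters=10, seed=0):
    """Plain k-means over flattened client updates; returns group labels."""
    rng = np.random.default_rng(seed)
    U = np.stack([u.ravel() for u in updates])
    centers = U[rng.choice(len(U), num_groups, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for g in range(num_groups):
            if np.any(labels == g):
                centers[g] = U[labels == g].mean(axis=0)
    return labels

def aggregate_per_group(updates, labels, num_groups):
    """FedAvg restricted to each group: one aggregated model per cluster."""
    return [np.mean([u for u, l in zip(updates, labels) if l == g], axis=0)
            for g in range(num_groups) if np.any(labels == g)]

rng = np.random.default_rng(1)
updates = [rng.normal(0, 1, (3, 3)) for _ in range(4)] + \
          [rng.normal(5, 1, (3, 3)) for _ in range(4)]
labels = group_clients(updates, num_groups=2)
print(labels, len(aggregate_per_group(updates, labels, 2)))
```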
arXiv Detail & Related papers (2023-08-30T07:46:32Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
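A brief sketch of the mixing step, assuming synthetic minority samples are convex combinations of randomly paired minority and majority points with a Beta-distributed weight; the paper's iterative pair selection is simplified away.

```python
# Illustrative minority/majority mixing for imbalanced data (assumed pairing rule).
import numpy as np

def mix_minority(X_min, X_maj, n_new, alpha=0.5, seed=0):
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X_min), n_new)
    j = rng.integers(0, len(X_maj), n_new)
    lam = rng.beta(alpha, alpha, size=(n_new, 1))
    lam = np.maximum(lam, 1 - lam)          # stay closer to the minority point
    return lam * X_min[i] + (1 - lam) * X_maj[j]

rng = np.random.default_rng(0)
X_min, X_maj = rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (200, 2))
synthetic = mix_minority(X_min, X_maj, n_new=50)
print(synthetic.shape)  # (50, 2) new minority-leaning samples
```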
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Personalized Graph Federated Learning with Differential Privacy [6.282767337715445]
This paper presents a personalized graph federated learning (PGFL) framework in which distributedly connected servers and their respective edge devices collaboratively learn device or cluster-specific models.
We study a variant of the PGFL implementation that utilizes differential privacy, specifically zero-concentrated differential privacy, where a noise sequence perturbs model exchanges.
Our analysis shows that the algorithm ensures local differential privacy for all clients in terms of zero-concentrated differential privacy.
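The zero-concentrated DP perturbation can be sketched with the standard Gaussian mechanism, whose noise scale sigma = sensitivity / sqrt(2 * rho) yields rho-zCDP; the PGFL server graph and personalization are not modeled here, and `rho` and the clipping bound are assumed parameters.

```python
# Gaussian mechanism satisfying rho-zCDP for a model exchange (standard calibration).
import numpy as np

def zcdp_perturb(model, sensitivity, rho, rng):
    """Add Gaussian noise with sigma = sensitivity / sqrt(2 * rho)."""
    sigma = sensitivity / np.sqrt(2.0 * rho)
    return model + rng.normal(0.0, sigma, model.shape)

rng = np.random.default_rng(0)
weights = rng.normal(size=10)
clipped = weights * min(1.0, 1.0 / np.linalg.norm(weights))  # bound L2 sensitivity
noisy = zcdp_perturb(clipped, sensitivity=1.0, rho=0.1, rng=rng)
print(np.round(noisy, 2))
```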
arXiv Detail & Related papers (2023-06-10T09:52:01Z) - CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with
Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
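One plausible reading of cluster-equality in aggregation, sketched below: average within each client cluster first, then weight the cluster means equally so that a large cluster cannot dominate; the paper's knowledge-distilled regularization is not reproduced.

```python
# Illustrative cluster-fair aggregation (assumed reading of "equality between clusters").
import numpy as np

def cluster_equal_aggregate(updates, labels):
    groups = sorted(set(labels))
    means = [np.mean([u for u, l in zip(updates, labels) if l == g], axis=0)
             for g in groups]
    return np.mean(means, axis=0)  # each cluster contributes equally

rng = np.random.default_rng(0)
updates = [rng.normal(0, 1, 4) for _ in range(9)] + [rng.normal(5, 1, 4)]
labels = [0] * 9 + [1]            # cluster-skewed: 9 clients vs 1 client
print(np.round(cluster_equal_aggregate(updates, labels), 2))
```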
arXiv Detail & Related papers (2023-02-21T02:53:37Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous
Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
The one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z) - Sample-based and Feature-based Federated Learning via Mini-batch SSCA [18.11773963976481]
This paper investigates sample-based and feature-based federated optimization.
We show that the proposed algorithms can preserve data privacy through the model aggregation mechanism.
We also show that the proposed algorithms converge to Karush-Kuhn-Tucker points of the respective federated optimization problems.
arXiv Detail & Related papers (2021-04-13T08:23:46Z) - A Unified Linear Speedup Analysis of Federated Averaging and Nesterov
FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
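For reference, a compact FedAvg sketch on a toy least-squares objective: sampled devices run several local SGD steps from the global model and the server averages the results; the objective, step sizes, and sampling fraction are toy assumptions, not the paper's experimental setup.

```python
# Minimal FedAvg loop on a toy least-squares problem.
import numpy as np

def local_sgd(w, data, steps=5, lr=0.1):
    """Local SGD steps on 0.5 * ||Xw - y||^2 / n for one device."""
    X, y = data
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg(clients, w, rounds=50, frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        m = max(1, int(frac * len(clients)))
        chosen = rng.choice(len(clients), m, replace=False)  # device sampling
        w = np.mean([local_sgd(w, clients[i]) for i in chosen], axis=0)
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for _ in range(8):
    X = rng.normal(size=(30, 2))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=30)))
print(np.round(fedavg(clients, np.zeros(2)), 2))  # approaches w_true
```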
arXiv Detail & Related papers (2020-07-11T05:59:08Z)