Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
- URL: http://arxiv.org/abs/2204.11017v1
- Date: Sat, 23 Apr 2022 08:23:00 GMT
- Title: Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
- Authors: Federico Lucchetti, Jérémie Decouchant, Maria Fernandes, Lydia Y. Chen, Marcus Völp
- Abstract summary: Federated learning allows clients to collaboratively train models on datasets that cannot be exchanged because of their size or regulations.
Previous works tried to mitigate the effects of non-IID datasets on training accuracy, focusing mainly on non-IID labels.
We propose FedGMCC, a novel framework where a central server aggregates client models that it can cluster together.
- Score: 5.265938474748481
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning allows clients to collaboratively train models on datasets
that are acquired in different locations and that cannot be exchanged because
of their size or regulations. Such collected data is increasingly
non-independent and non-identically distributed (non-IID), negatively affecting
training accuracy. Previous works tried to mitigate the effects of non-IID
datasets on training accuracy, focusing mainly on non-IID labels; however,
practical datasets often also contain non-IID features. To address both non-IID
labels and features, we propose FedGMCC, a novel framework where a central
server aggregates client models that it can cluster together. FedGMCC
clustering relies on a Monte Carlo procedure that samples the output space of
client models, infers their position in the weight space on a loss manifold and
computes their geometric connection via an affine curve parametrization.
FedGMCC aggregates connected models along their path connectivity to produce a
richer global model, incorporating knowledge of all connected client models.
FedGMCC outperforms FedAvg and FedProx in terms of convergence rate on the
EMNIST62 dataset and a genomic sequence classification dataset (by up to +63%).
Compared to CFL, FedGMCC yields improved accuracy (+4%) on the genomic dataset
in settings with highly non-IID features and incongruent labels.
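The curve-based clustering step described in the abstract can be sketched roughly as follows. This is a minimal illustration on a toy quadratic loss, not the authors' implementation; the function names (`affine_curve`, `monte_carlo_connectivity`), the barrier tolerance, and the aggregation-by-averaging step are all illustrative assumptions.

```python
import random

def affine_curve(w1, w2, t):
    """Point at parameter t on the affine curve connecting two weight vectors."""
    return [(1 - t) * a + t * b for a, b in zip(w1, w2)]

def toy_loss(w):
    """Stand-in quadratic loss; a real system evaluates the model's training loss."""
    return sum((x - 1.0) ** 2 for x in w)

def monte_carlo_connectivity(w1, w2, n_samples=100, barrier_tol=0.5, seed=0):
    """Sample the curve; treat clients as connected if no loss barrier appears,
    then aggregate along the path by averaging the sampled points."""
    rng = random.Random(seed)
    endpoint_loss = max(toy_loss(w1), toy_loss(w2))
    samples = [affine_curve(w1, w2, rng.random()) for _ in range(n_samples)]
    barrier = max(toy_loss(w) for w in samples) - endpoint_loss
    connected = barrier <= barrier_tol
    points = [w1, w2] + samples
    agg = [sum(col) / len(points) for col in zip(*points)]
    return connected, agg
```

For two client models whose weights lie in a shared low-loss region, this sketch reports them as connected and returns a path-averaged model.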
Related papers
- FedDAG: Clustered Federated Learning via Global Data and Gradient Integration for Heterogeneous Environments [11.797290397638962]
Federated Learning (FL) enables a group of clients to collaboratively train a model without sharing individual data, but its performance drops when client data are heterogeneous.
This paper introduces a clustered FL framework, FedDAG, that employs a weighted, class-wise similarity metric integrating both data and gradient information.
Experiments on diverse benchmarks and data settings show that FedDAG consistently outperforms state-of-the-art clustered FL baselines in accuracy.
arXiv Detail & Related papers (2026-02-26T21:20:19Z)
- Variational Gaussian Mixture Manifold Models for Client-Specific Federated Personalization [0.0]
VGM$2$ is a geometry-centric PFL framework that learns client-specific parametric UMAP embeddings.
Each client maintains a Dirichlet-Normal-Inverse-Gamma posterior over marker weights, means, and variances.
VGM$2$ achieves competitive or superior test F1 scores compared to strong baselines.
arXiv Detail & Related papers (2025-09-04T01:28:02Z)
- Federated Gaussian Mixture Models [0.0]
FedGenGMM is a novel one-shot federated learning approach for unsupervised learning scenarios.
It allows local GMM models, trained independently on client devices, to be aggregated through a single communication round.
It consistently achieves performance comparable to non-federated and iterative federated methods.
arXiv Detail & Related papers (2025-06-02T15:23:53Z)
- FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning.
Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning.
We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
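One simple way to realize adaptive aggregation weights derived from client vectors can be sketched as below. This is a hedged illustration, not the FedAWA algorithm itself: weighting each client by the softmax of its update's cosine similarity to the mean update is an assumed heuristic, and `adaptive_weights` is a hypothetical name.

```python
import math

def cosine(u, v):
    """Cosine similarity between two flattened update vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def adaptive_weights(client_updates):
    """Softmax over each client's similarity to the mean update, so clients
    pulling against the consensus direction receive smaller aggregation weight."""
    n = len(client_updates)
    mean = [sum(col) / n for col in zip(*client_updates)]
    sims = [cosine(u, mean) for u in client_updates]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]
```

A client whose update opposes the consensus direction is automatically down-weighted during aggregation.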
arXiv Detail & Related papers (2025-03-20T04:49:40Z)
- Knowledge-Driven Federated Graph Learning on Model Heterogeneity [47.98634086448171]
Federated graph learning (FGL) has emerged as a promising paradigm for collaborative graph representation learning.
We propose the Federated Graph Knowledge Collaboration (FedGKC) framework to address the challenge of model-centric heterogeneous FGL.
FedGKC achieves an average accuracy gain of 3.74% over baselines in MHtFGL scenarios, while maintaining excellent performance in homogeneous settings.
arXiv Detail & Related papers (2025-01-22T04:12:32Z)
- Learning Unlabeled Clients Divergence for Federated Semi-Supervised Learning via Anchor Model Aggregation [10.282711631100845]
SemiAnAgg learns unlabeled client contributions via an anchor model.
SemiAnAgg achieves new state-of-the-art results on four widely used FedSemi benchmarks.
arXiv Detail & Related papers (2024-07-14T20:50:40Z)
- FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm.
One of the major challenges in FL is the presence of uneven data distributions across client devices.
We propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z)
- Federated Contrastive Learning for Personalized Semantic Communication [55.46383524190467]
We design a federated contrastive learning framework aimed at supporting personalized semantic communication.
FedCL enables collaborative training of local semantic encoders across multiple clients and a global semantic decoder owned by the base station.
To tackle the semantic imbalance issue arising from heterogeneous datasets across distributed clients, we employ contrastive learning to train a semantic centroid generator.
arXiv Detail & Related papers (2024-06-13T14:45:35Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
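For contrast, the traditional aggregate-then-adapt server step that FedAF avoids can be sketched as a minimal FedAvg-style weighted average. This is an illustration of the conventional baseline, not of FedAF; `fedavg_aggregate` and its parameters are illustrative names.

```python
def fedavg_aggregate(client_models, client_sizes):
    """Server-side step of the aggregate-then-adapt framework: average client
    weights in proportion to each client's local dataset size."""
    total = sum(client_sizes)
    return [sum(w * s for w, s in zip(col, client_sizes)) / total
            for col in zip(*client_models)]
```

Clients then adapt from this averaged model in the next round, which is exactly the loop an aggregation-free method restructures.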
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- Joint Local Relational Augmentation and Global Nash Equilibrium for Federated Learning with Non-IID Data [36.426794300280854]
Federated learning (FL) is a distributed machine learning paradigm that requires collaboration between a server and a set of clients holding decentralized data.
We propose FedRANE, which consists of two main modules, local relational augmentation (LRA) and global Nash equilibrium (GNE) to resolve intra- and inter-client inconsistency simultaneously.
arXiv Detail & Related papers (2023-08-17T06:17:51Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
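For intuition, the core GMM operation implied here — computing per-component responsibilities for a data point — can be sketched in the one-dimensional case. This is a generic GMM illustration, not the FedGMM algorithm; the function names are illustrative.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_responsibilities(x, weights, means, variances):
    """Posterior probability that each mixture component generated x."""
    dens = [w * gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]
```

A point near one component's mean receives a responsibility close to 1 for that component, which is also what makes such models useful for flagging novel samples with uncertain assignments.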
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning [4.02923738318937]
Uneven distribution of local data across different edge devices (clients) results in slow model training and accuracy reduction in federated learning.
This work introduces a novel non-IID type encountered in real-world datasets, namely cluster-skew.
We propose FedDRL, a novel FL model that employs deep reinforcement learning to adaptively determine each client's impact factor.
arXiv Detail & Related papers (2022-08-04T04:24:16Z)
- FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose FedDC, a novel federated learning algorithm with local drift decoupling and correction.
Our FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between the local model parameter and the global model parameters.
Experimental results and analysis demonstrate that FedDC yields faster convergence and better performance on various image classification tasks.
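The auxiliary local drift variable described above can be sketched as a single simplified local step. This is not the paper's exact update rule: `feddc_local_step`, the gradient-correction form, and the learning rate are all illustrative assumptions.

```python
def feddc_local_step(local_w, global_w, drift, grad, lr=0.1):
    """One simplified local update: correct the gradient with the accumulated
    drift, then update the drift to track the local-global parameter gap."""
    new_w = [w - lr * (g + d) for w, g, d in zip(local_w, grad, drift)]
    new_drift = [d + (nw - gw) for d, nw, gw in zip(drift, new_w, global_w)]
    return new_w, new_drift
```

Because only the local training phase changes, the server-side aggregation can stay as-is, which is the lightweight-modification property the abstract highlights.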
arXiv Detail & Related papers (2022-03-22T14:06:26Z)
- Toward Understanding the Influence of Individual Clients in Federated Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over model parameters, and propose an effective and efficient method to estimate this metric.
arXiv Detail & Related papers (2020-12-20T14:34:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.