Information-Geometric Barycenters for Bayesian Federated Learning
- URL: http://arxiv.org/abs/2412.11646v2
- Date: Wed, 07 May 2025 11:54:19 GMT
- Title: Information-Geometric Barycenters for Bayesian Federated Learning
- Authors: Nour Jamoussi, Giuseppe Serra, Photios A. Stavrou, Marios Kountouris
- Abstract summary: Federated learning (FL) is used to achieve consensus through averaging locally trained models. While effective, this approach may not align well with Bayesian inference, where the model space has the structure of a distribution space. We propose BA-BFL, an algorithm that retains the convergence properties of Federated Averaging in non-convex settings.
- Score: 9.670266892454945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) is a widely used and impactful distributed optimization framework that achieves consensus through averaging locally trained models. While effective, this approach may not align well with Bayesian inference, where the model space has the structure of a distribution space. Taking an information-geometric perspective, we reinterpret FL aggregation as the problem of finding the barycenter of local posteriors using a prespecified divergence metric, minimizing the average discrepancy across clients. This perspective provides a unifying framework that generalizes many existing methods and offers crisp insights into their theoretical underpinnings. We then propose BA-BFL, an algorithm that retains the convergence properties of Federated Averaging in non-convex settings. In non-independent and identically distributed scenarios, we conduct extensive comparisons with statistical aggregation techniques, showing that BA-BFL achieves performance comparable to state-of-the-art methods while offering a geometric interpretation of the aggregation phase. Additionally, we extend our analysis to Hybrid Bayesian Deep Learning, exploring the impact of Bayesian layers on uncertainty quantification and model calibration.
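To make the barycenter view concrete, here is a minimal sketch (not the paper's BA-BFL implementation) of aggregating diagonal-Gaussian client posteriors under two standard divergences. The closed forms are classical; the client posteriors and uniform weights below are illustrative assumptions.

```python
import numpy as np

def forward_kl_barycenter(mus, sigmas, weights):
    """argmin_q sum_i w_i KL(p_i || q) over diagonal Gaussians:
    moment matching against the mixture of client posteriors."""
    mu = np.einsum("i,id->d", weights, mus)
    second = np.einsum("i,id->d", weights, sigmas**2 + mus**2)
    return mu, np.sqrt(second - mu**2)

def reverse_kl_barycenter(mus, sigmas, weights):
    """argmin_q sum_i w_i KL(q || p_i) over diagonal Gaussians:
    precision-weighted combination of client posteriors."""
    prec = np.einsum("i,id->d", weights, 1.0 / sigmas**2)
    mu = np.einsum("i,id->d", weights, mus / sigmas**2) / prec
    return mu, np.sqrt(1.0 / prec)

# Three hypothetical clients, 2-parameter model, uniform weights.
mus = np.array([[0.0, 1.0], [0.5, 0.8], [-0.2, 1.4]])
sigmas = np.array([[0.3, 0.2], [0.1, 0.5], [0.2, 0.3]])
w = np.full(3, 1.0 / 3.0)
print(forward_kl_barycenter(mus, sigmas, w))
print(reverse_kl_barycenter(mus, sigmas, w))
```

Different divergences thus induce different aggregation rules: the forward-KL barycenter mean is the plain weighted average of client means (the FedAvg rule), while the reverse-KL barycenter lets confident clients dominate, which is one way to read the paper's unifying claim.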
Related papers
- Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation [7.3170276716290354]
FedWBA is a novel personalized Bayesian federated learning (PBFL) method that enhances both local inference and global aggregation.
We provide local and global convergence guarantees for FedWBA.
Experiments show that FedWBA outperforms baselines in prediction accuracy, uncertainty calibration, and convergence rate.
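For intuition on Wasserstein barycenter aggregation, a minimal sketch under strong simplifying assumptions (diagonal-Gaussian local posteriors with commuting covariances, fixed weights); FedWBA itself handles more general posteriors.

```python
import numpy as np

def w2_barycenter_diag_gauss(mus, sigmas, weights):
    """2-Wasserstein barycenter of diagonal Gaussians N(mu_i, diag(sigma_i^2)).
    For commuting covariances the barycenter is Gaussian with the weighted
    mean of means and the weighted mean of standard deviations."""
    mu = np.einsum("i,id->d", weights, mus)
    sigma = np.einsum("i,id->d", weights, sigmas)
    return mu, sigma

mus = np.array([[0.0, 1.0], [1.0, 0.0]])
sigmas = np.array([[0.5, 0.2], [0.3, 0.4]])
print(w2_barycenter_diag_gauss(mus, sigmas, np.array([0.5, 0.5])))
```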
arXiv Detail & Related papers (2025-05-20T10:14:32Z) - Robust and Scalable Variational Bayes [2.014089835498735]
We propose a robust framework for variational Bayes (VB) that effectively handles outliers and contamination of arbitrary nature in large datasets.
Our approach divides the dataset into disjoint subsets, computes the posterior for each subset, and applies VB approximation independently to these posteriors.
This novel aggregation method yields the Variational Median Posterior (VM-Posterior) distribution.
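As a rough illustration of median-based aggregation, here is a Weiszfeld-style geometric median over subset summaries (e.g., posterior mean vectors). This is a generic robustness sketch, not the VM-Posterior construction itself, which operates in distribution space.

```python
import numpy as np

def geometric_median(points, n_iter=100, eps=1e-9):
    """Weiszfeld iteration for argmin_y sum_i ||x_i - y||_2."""
    y = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, eps)          # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            break
        y = y_new
    return y

# Subset posterior means; one subset is contaminated by outliers.
means = np.array([[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [8.0, -5.0]])
print(geometric_median(means))  # stays near (1, 2), unlike the average
```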
arXiv Detail & Related papers (2025-04-16T23:20:43Z) - Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy.
We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution.
Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z) - On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians [24.473522267391072]
We develop algorithms on the Bures-Wasserstein manifold, named the Exact Geodesic Gradient Descent and Hybrid Gradient Descent algorithms.
We establish theoretical convergence guarantees for both methods and demonstrate that the Exact Geodesic Gradient Descent algorithm attains a dimension-free convergence rate.
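For context, the balanced Gaussian Wasserstein barycenter covariance can be computed with the standard fixed-point iteration of Álvarez-Esteban et al.; the sketch below is that generic baseline, not the paper's Exact Geodesic or Hybrid Gradient Descent methods, and it does not handle the semi-unbalanced variant.

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_barycenter_cov(covs, weights, n_iter=100, tol=1e-10):
    """Fixed-point iteration for the barycenter covariance S satisfying
    S = sum_i w_i (S^{1/2} C_i S^{1/2})^{1/2} (Bures-Wasserstein sense)."""
    S = np.mean(covs, axis=0)          # initialization
    for _ in range(n_iter):
        R = np.real(sqrtm(S))          # S^{1/2}; real part trims round-off
        T = sum(w * np.real(sqrtm(R @ C @ R)) for w, C in zip(weights, covs))
        R_inv = np.linalg.inv(R)
        S_new = R_inv @ T @ T @ R_inv
        if np.linalg.norm(S_new - S) < tol:
            return S_new
        S = S_new
    return S

covs = [np.array([[1.0, 0.3], [0.3, 0.5]]),
        np.array([[0.7, -0.2], [-0.2, 1.2]])]
print(w2_barycenter_cov(covs, [0.5, 0.5]))
```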
arXiv Detail & Related papers (2024-10-10T17:01:57Z) - ScoreFusion: fusing score-based generative models via Kullback-Leibler barycenters [8.08976346461518]
We introduce ScoreFusion, a theoretically grounded method for fusing multiple pre-trained diffusion models.
Our starting point considers the family of KL barycenters of the auxiliary populations, which is proven to be an optimal parametric class in the KL sense.
By recasting the learning problem as score matching in denoising diffusion, we obtain a tractable way of computing the optimal KL barycenter weights.
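For orientation, the two standard KL barycenter problems have well-known closed-form solutions; the facts below are classical, and which variant ScoreFusion instantiates, and how the weights are learned via score matching, is specified in the paper.

```latex
% (1) "Forward" direction: the barycenter is the mixture,
\min_{q}\ \sum_i \lambda_i\,\mathrm{KL}(p_i \,\|\, q)
\;\Longrightarrow\;
q^\star = \sum_i \lambda_i\, p_i ,
% since the objective equals a constant minus \int (\sum_i \lambda_i p_i) \log q.

% (2) "Reverse" direction: the barycenter is the normalized geometric mean,
\min_{q}\ \sum_i \lambda_i\,\mathrm{KL}(q \,\|\, p_i)
\;\Longrightarrow\;
q^\star \propto \prod_i p_i^{\lambda_i},
% whose score is simply
\nabla_x \log q^\star(x) = \sum_i \lambda_i\, \nabla_x \log p_i(x).
```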
arXiv Detail & Related papers (2024-06-28T03:02:25Z) - Bridging Data Barriers among Participants: Assessing the Potential of Geoenergy through Federated Learning [2.8498944632323755]
This study introduces a novel federated learning (FL) framework based on XGBoost models.
FL models demonstrate superior accuracy and generalization capabilities compared to separate models.
This study opens new avenues for assessing unconventional reservoirs through collaborative and privacy-preserving FL techniques.
arXiv Detail & Related papers (2024-04-29T09:12:31Z) - Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models [0.9940108090221528]
Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models.
We show that simple application of the aggregation methods associated with FL schemes for deterministic models is either impossible or results in sub-optimal performance.
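As a hint of why naive weight averaging can misbehave for Bayesian models, a minimal sketch contrasting naive averaging of Gaussian variational parameters with a precision-weighted (product-of-Gaussians) fusion; both the per-parameter Gaussian form and the fusion rule are illustrative assumptions, not the paper's recommended aggregator.

```python
import numpy as np

# Per-parameter Gaussian variational posteriors from two clients.
mu = np.array([[0.8, -0.1], [0.2, 0.3]])        # means, shape (clients, params)
sigma = np.array([[0.05, 0.50], [0.40, 0.05]])  # std devs

# Naive FedAvg-style averaging treats (mu, sigma) as ordinary weights.
naive_mu, naive_sigma = mu.mean(axis=0), sigma.mean(axis=0)

# Precision-weighted fusion: confident clients (small sigma) dominate.
prec = 1.0 / sigma**2
fused_prec = prec.sum(axis=0)
fused_mu = (prec * mu).sum(axis=0) / fused_prec
fused_sigma = np.sqrt(1.0 / fused_prec)

print(naive_mu, naive_sigma)  # ignores which client is confident where
print(fused_mu, fused_sigma)  # pulled toward the low-variance client per parameter
```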
arXiv Detail & Related papers (2024-03-22T15:02:24Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed based on an analysis of these properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function.
We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods.
Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z) - Bayesian Federated Inference for regression models based on non-shared multicenter data sets from heterogeneous populations [0.0]
In a regression model, the sample size must be large enough relative to the number of possible predictors.
Pooling data from different data sets collected in different (medical) centers would alleviate this problem, but is often not feasible due to privacy regulations or logistical problems.
An alternative route is to analyze the local data in the centers separately and combine the statistical inference results with the Bayesian Federated Inference (BFI) methodology.
The aim of this approach is to compute, from the inference results of the separate centers, what would have been found had the statistical analysis been performed on the combined data.
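The core Bayesian algebra behind this kind of combination is simple when each center reports a Gaussian (e.g., Laplace) approximation: multiplying the K center posteriors and dividing out the extra K-1 copies of the shared prior approximates the pooled-data posterior. A sketch under that Gaussian assumption, not the full BFI methodology:

```python
import numpy as np

def combine_centers(mus, precs, mu0, prec0):
    """Approximate pooled posterior from K Gaussian center posteriors
    N(mu_k, prec_k^{-1}) and a shared Gaussian prior N(mu0, prec0^{-1}):
    p(theta | all data) ~ prior^{1-K} * prod_k p(theta | data_k)."""
    K = len(mus)
    prec = sum(precs) - (K - 1) * prec0
    rhs = sum(P @ m for P, m in zip(precs, mus)) - (K - 1) * prec0 @ mu0
    return np.linalg.solve(prec, rhs), prec

mu0, prec0 = np.zeros(2), np.eye(2)            # illustrative prior
mus = [np.array([0.9, 0.1]), np.array([1.1, -0.1])]
precs = [np.eye(2) * 10.0, np.eye(2) * 12.0]   # hypothetical center precisions
print(combine_centers(mus, precs, mu0, prec0))
```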
arXiv Detail & Related papers (2024-02-05T11:10:27Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
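For readers unfamiliar with ADMM-based distribution, below is the deterministic consensus-ADMM skeleton on a linear regression loss; the paper builds a sampler on top of this kind of machinery, which is not reproduced here, and the data and penalty parameter are illustrative.

```python
import numpy as np

def consensus_admm_linreg(As, bs, rho=1.0, n_iter=200):
    """Consensus ADMM for min_x sum_k 0.5*||A_k x - b_k||^2.
    Each worker solves a local ridge-like subproblem; z is the consensus."""
    K, d = len(As), As[0].shape[1]
    xs = np.zeros((K, d))
    us = np.zeros((K, d))
    z = np.zeros(d)
    for _ in range(n_iter):
        for k in range(K):              # local updates (parallel in practice)
            lhs = As[k].T @ As[k] + rho * np.eye(d)
            rhs = As[k].T @ bs[k] + rho * (z - us[k])
            xs[k] = np.linalg.solve(lhs, rhs)
        z = (xs + us).mean(axis=0)      # consensus update
        us += xs - z                    # dual updates
    return z

rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0, 0.5])
As = [rng.normal(size=(50, 3)) for _ in range(4)]
bs = [A @ x_true + 0.1 * rng.normal(size=50) for A in As]
print(consensus_admm_linreg(As, bs))    # approximately recovers x_true
```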
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
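Schematically, the idea is to turn per-client generalization-bound estimates into aggregation weights in place of sample proportions. A toy sketch where the bound values and the inverse-bound normalization are illustrative assumptions (the paper derives and estimates the actual bounds):

```python
import numpy as np

# Hypothetical per-client generalization-bound estimates (lower = better).
bounds = np.array([0.30, 0.12, 0.45, 0.20])
weights = (1.0 / bounds) / (1.0 / bounds).sum()   # favor tighter bounds

# Sample-proportion weights for comparison.
n_samples = np.array([1000, 200, 4000, 800])
prop_weights = n_samples / n_samples.sum()

# Weighted averaging of flattened client models (stand-in parameters).
client_params = np.random.default_rng(1).normal(size=(4, 5))
global_params = weights @ client_params
print(weights, prop_weights, global_params, sep="\n")
```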
arXiv Detail & Related papers (2023-11-10T08:50:28Z) - Improving Federated Aggregation with Deep Unfolding Networks [19.836640510604422]
Federated learning (FL) is negatively affected by device heterogeneity and differing statistical characteristics across participating clients.
We introduce a deep unfolding network (DUN)-based technique that learns adaptive aggregation weights to mitigate the adverse impact of this heterogeneity without bias.
The proposed method demonstrates impressive accuracy and quality-aware aggregation.
arXiv Detail & Related papers (2023-06-30T01:51:22Z) - FedHB: Hierarchical Bayesian Federated Learning [11.936836827864095]
We propose a novel hierarchical Bayesian approach to Federated Learning (FL).
Our model reasonably describes the generative process of clients' local data via hierarchical Bayesian modeling.
We show that our block-coordinate FL algorithm converges to an optimum of the objective at the rate of $O(1/\sqrt{t})$.
arXiv Detail & Related papers (2023-05-08T18:21:41Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Bayesian data-driven discovery of partial differential equations with variable coefficients [9.331440154110117]
We propose an advanced Bayesian sparse learning algorithm for PDE discovery with variable coefficients.
In the experiments, we show that the tBGL-SS method is more robust than the baseline methods in noisy environments.
arXiv Detail & Related papers (2021-02-02T11:05:34Z) - Continuous Regularized Wasserstein Barycenters [51.620781112674024]
We introduce a new dual formulation for the regularized Wasserstein barycenter problem.
We establish strong duality and use the corresponding primal-dual relationship to parametrize the barycenter implicitly using the dual potentials of regularized transport problems.
arXiv Detail & Related papers (2020-08-28T08:28:06Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - Disentangled Representation Learning with Wasserstein Total Correlation [90.44329632061076]
We introduce Wasserstein total correlation in both variational autoencoder and Wasserstein autoencoder settings to learn disentangled latent representations.
A critic is adversarially trained along with the main objective to estimate the Wasserstein total correlation term.
We show that the proposed approach has comparable performances on disentanglement with smaller sacrifices in reconstruction abilities.
arXiv Detail & Related papers (2019-12-30T05:31:28Z)