Generalization in Federated Learning: A Conditional Mutual Information Framework
- URL: http://arxiv.org/abs/2503.04091v1
- Date: Thu, 06 Mar 2025 04:57:51 GMT
- Title: Generalization in Federated Learning: A Conditional Mutual Information Framework
- Authors: Ziqiao Wang, Cheng Long, Yongyi Mao
- Abstract summary: Federated Learning (FL) is a widely adopted privacy-preserving distributed learning framework. We apply an information-theoretic analysis via the conditional mutual information (CMI) framework to study FL's two-level generalization. We derive multiple CMI-based bounds, including hypothesis-based CMI bounds, illustrating how privacy constraints in FL can imply generalization guarantees.
- Score: 45.657352088035516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a widely adopted privacy-preserving distributed learning framework, yet its generalization performance remains less explored compared to centralized learning. In FL, the generalization error consists of two components: the out-of-sample gap, which measures the gap between the empirical and true risk for participating clients, and the participation gap, which quantifies the risk difference between participating and non-participating clients. In this work, we apply an information-theoretic analysis via the conditional mutual information (CMI) framework to study FL's two-level generalization. Beyond the traditional supersample-based CMI framework, we introduce a superclient construction to accommodate the two-level generalization setting in FL. We derive multiple CMI-based bounds, including hypothesis-based CMI bounds, illustrating how privacy constraints in FL can imply generalization guarantees. Furthermore, we propose fast-rate evaluated CMI bounds that recover the best-known convergence rate for two-level FL generalization in the small empirical risk regime. For specific FL model aggregation strategies and structured loss functions, we refine our bounds to achieve improved convergence rates with respect to the number of participating clients. Empirical evaluations confirm that our evaluated CMI bounds are non-vacuous and accurately capture the generalization behavior of FL algorithms.
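To make the abstract's decomposition concrete, the identity below writes the two-level generalization error as the sum of the two gaps it describes. This is a notational sketch only: the symbols n (participating clients), m (samples per client), W (learned hypothesis), the loss, the per-client distributions, and the meta-distribution are illustrative assumptions and may differ from the paper's own notation.

```latex
% Illustrative notation only; the paper's own symbols may differ.
% n participating clients, m samples per client, hypothesis W, loss \ell.

% Empirical risk on the participating clients' data:
\[ \hat{L}(W) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{m}\sum_{j=1}^{m}\ell(W, Z_{i,j}) \]

% True risk of the participating clients:
\[ L_{\mathrm{part}}(W) = \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}_{Z\sim\mathcal{D}_i}\big[\ell(W, Z)\big] \]

% True risk over fresh, non-participating clients drawn from the meta-distribution \mathfrak{D}:
\[ L(W) = \mathbb{E}_{\mathcal{D}'\sim\mathfrak{D}}\,\mathbb{E}_{Z\sim\mathcal{D}'}\big[\ell(W, Z)\big] \]

% The two-level generalization error splits exactly into the two gaps named in the abstract:
\[ L(W) - \hat{L}(W)
   = \underbrace{\big(L_{\mathrm{part}}(W) - \hat{L}(W)\big)}_{\text{out-of-sample gap}}
   + \underbrace{\big(L(W) - L_{\mathrm{part}}(W)\big)}_{\text{participation gap}} \]
```

The last line is a simple telescoping identity; the paper's contribution is bounding each of the two terms with CMI quantities, using the superclient construction to accommodate the client-level resampling on top of the standard supersample one.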
Related papers
- Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction [22.567754688492414]
Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks.
Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly.
This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time.
arXiv Detail & Related papers (2023-10-12T19:07:58Z)
- Learner Referral for Cost-Effective Federated Learning Over Hierarchical IoT Networks [21.76836812021954]
This paper studies learner-referral-aided federated client selection (LRef-FedCS), communications resource scheduling, and local model accuracy optimization (LMAO) methods.
Our proposed LRef-FedCS approach can achieve a good balance between high global accuracy and low cost.
arXiv Detail & Related papers (2023-07-19T13:33:43Z)
- Federated Learning under Covariate Shifts with Generalization Guarantees [46.56040078380132]
We formulate a new global model training paradigm and propose Federated Importance-Weighted Empirical Risk Minimization (FTW-ERM).
We show that FTW-ERM achieves smaller generalization error than classical ERM under certain settings.
arXiv Detail & Related papers (2023-06-08T16:18:08Z)
- Feature Correlation-guided Knowledge Transfer for Federated Self-supervised Learning [19.505644178449046]
We propose a novel and general method named Federated Self-supervised Learning with Feature-correlation based Aggregation (FedFoA)
Our insight is to utilize feature correlation to align the feature mappings and calibrate the local model updates across clients during their local training process.
We prove that FedFoA is a model-agnostic training framework and can be easily compatible with state-of-the-art unsupervised FL methods.
arXiv Detail & Related papers (2022-11-14T13:59:50Z)
- On Leave-One-Out Conditional Mutual Information For Generalization [122.2734338600665]
We derive information-theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
Contrary to other CMI bounds, our loo-CMI bounds can be computed easily and can be interpreted in connection to other notions such as classical leave-one-out cross-validation.
We empirically validate the quality of the bound by evaluating its predicted generalization gap in deep learning scenarios.
arXiv Detail & Related papers (2022-07-01T17:58:29Z)
- Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew prevents current federated learning (FL) frameworks from maintaining consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z)
- Generalized Federated Learning via Sharpness Aware Minimization [22.294290071999736]
We propose a general, effective algorithm, FedSAM, based on a Sharpness Aware Minimization (SAM) local optimizer, and develop a momentum FL algorithm to bridge local and global models.
Empirically, our proposed algorithms substantially outperform existing FL studies and significantly decrease the learning deviation.
arXiv Detail & Related papers (2022-06-06T13:54:41Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization [69.07420650261649]
We introduce a novel, simple, and powerful contrastive MI estimator named FLO.
Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently.
The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
arXiv Detail & Related papers (2021-07-02T15:20:41Z)
- FedCM: Federated Learning with Client-level Momentum [18.722626360599065]
Federated Averaging with Client-level Momentum (FedCM) is proposed to tackle problems of partial participation and client heterogeneity in real-world federated learning applications.
FedCM aggregates global gradient information from previous communication rounds and modifies client gradient descent with a momentum-like term, which can effectively correct the bias and improve the stability of local SGD (a rough sketch of this update appears after this list).
arXiv Detail & Related papers (2021-06-21T06:16:19Z)
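To make the client-level momentum idea from the FedCM entry concrete, here is a minimal sketch of a single local update that blends the fresh local gradient with the previous round's aggregated global direction. This is an illustrative reading of the abstract, not the authors' exact algorithm: the function name, the mixing coefficient `alpha`, and all variable names are assumptions.

```python
import numpy as np

def client_momentum_step(w, local_grad, global_delta, lr=0.1, alpha=0.1):
    """One client-level-momentum SGD step (illustrative sketch, not FedCM's exact rule).

    w            : current local model parameters (np.ndarray)
    local_grad   : gradient of the local loss at w
    global_delta : global update direction the server aggregated in the
                   previous communication round and broadcast to clients
    alpha        : assumed weight on the fresh local gradient
    """
    # Momentum-like correction: mix the local gradient with the previous
    # round's global direction to reduce client drift under heterogeneity.
    direction = alpha * local_grad + (1.0 - alpha) * global_delta
    return w - lr * direction

# Toy usage: one step on the quadratic loss 0.5 * ||w||^2.
w = np.array([1.0, -2.0])
local_grad = w                          # gradient of 0.5*||w||^2 is w
global_delta = np.array([0.5, -0.5])    # stand-in for last round's global update
print(client_momentum_step(w, local_grad, global_delta))
```

In a full federated loop the server would broadcast `global_delta` at the start of each round and recompute it from the aggregated client updates at the end; the sketch above only shows the per-step client-side correction.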
This list is automatically generated from the titles and abstracts of the papers on this site.