TsmoBN: Interventional Generalization for Unseen Clients in Federated
Learning
- URL: http://arxiv.org/abs/2110.09974v1
- Date: Tue, 19 Oct 2021 13:46:37 GMT
- Title: TsmoBN: Interventional Generalization for Unseen Clients in Federated
Learning
- Authors: Meirui Jiang, Xiaofei Zhang, Michael Kamp, Xiaoxiao Li, Qi Dou
- Abstract summary: We form a training structural causal model (SCM) to explain the challenges of model generalization in a distributed learning paradigm.
We present a simple yet effective method using test-specific and momentum tracked batch normalization (TsmoBN) to generalize FL models to testing clients.
- Score: 23.519212374186232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalizing federated learning (FL) models to unseen clients with non-iid
data is a crucial topic, yet unsolved so far. In this work, we propose to
tackle this problem from a novel causal perspective. Specifically, we form a
training structural causal model (SCM) to explain the challenges of model
generalization in a distributed learning paradigm. Based on this, we present a
simple yet effective method using test-specific and momentum tracked batch
normalization (TsmoBN) to generalize FL models to testing clients. We give a
causal analysis by formulating another testing SCM and demonstrate that the key
factor in TsmoBN is the test-specific statistics (i.e., mean and variance) of
features. Such statistics can be seen as a surrogate variable for causal
intervention. In addition, by considering generalization bounds in FL, we show
that our TsmoBN method can reduce divergence between training and testing
feature distributions, which achieves a lower generalization gap than standard
model testing. Our extensive experimental evaluations demonstrate significant
improvements for unseen client generalization on three datasets with various
types of feature distributions and numbers of clients. It is worth noting that
our proposed approach can be flexibly applied to different state-of-the-art
federated learning algorithms and is orthogonal to existing domain
generalization methods.
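The abstract describes TsmoBN only at a high level: features are normalized with test-specific statistics (mean and variance) that are tracked with momentum. A minimal sketch of that idea follows, using only the Python standard library. The class name `MomentumTrackedNorm`, the exponential-moving-average update rule, the default momentum of 0.9, and initialization from the first test batch are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import fmean, pvariance


class MomentumTrackedNorm:
    """Normalize features with statistics tracked on test batches only.

    Hypothetical sketch: the update rule, the default momentum, and the
    initialization are assumptions, not the authors' implementation.
    """

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum  # weight kept on the previously tracked value
        self.eps = eps            # small constant for numerical stability
        self.mean = None          # tracked per-feature mean
        self.var = None           # tracked per-feature variance

    def __call__(self, batch):
        # batch: list of samples, each a list of floats (one per feature)
        columns = list(zip(*batch))
        batch_mean = [fmean(col) for col in columns]
        batch_var = [pvariance(col) for col in columns]

        if self.mean is None:
            # Assumption: tracking starts from the first test batch
            self.mean, self.var = batch_mean, batch_var
        else:
            m = self.momentum
            self.mean = [m * old + (1 - m) * new
                         for old, new in zip(self.mean, batch_mean)]
            self.var = [m * old + (1 - m) * new
                        for old, new in zip(self.var, batch_var)]

        # Normalize with the tracked test statistics, not training ones
        return [
            [(x - mu) / (var + self.eps) ** 0.5
             for x, mu, var in zip(sample, self.mean, self.var)]
            for sample in batch
        ]


# Usage: feed test batches from an unseen client one after another
norm = MomentumTrackedNorm(momentum=0.5)
first = norm([[1.0, 10.0], [3.0, 30.0]])
second = norm([[5.0, 50.0], [7.0, 70.0]])
```

In a real network this logic would sit inside each batch normalization layer in place of the running statistics collected during training, and the learned scale and shift parameters would still be applied afterwards; both are omitted here to keep the sketch short.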
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Aggregation Weighting of Federated Learning via Generalization Bound
Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
arXiv Detail & Related papers (2023-11-10T08:50:28Z)
- Every Parameter Matters: Ensuring the Convergence of Federated Learning
with Dynamic Heterogeneous Models Reduction [22.567754688492414]
Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks.
Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly.
This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time.
arXiv Detail & Related papers (2023-10-12T19:07:58Z)
- Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Exploiting Personalized Invariance for Better Out-of-distribution
Generalization in Federated Learning [13.246981646250518]
This paper presents a general dual-regularized learning framework to explore the personalized invariance, compared with the existing personalized federated learning methods.
We show that our method is superior to the existing federated learning and invariant learning methods in diverse out-of-distribution and non-IID data cases.
arXiv Detail & Related papers (2022-11-21T08:17:03Z)
- FedGen: Generalizable Federated Learning for Sequential Data [8.784435748969806]
In many real-world distributed settings, spurious correlations exist due to biases and data sampling issues.
We present a generalizable federated learning framework called FedGen, which allows clients to identify and distinguish between spurious and invariant features.
We show that FedGen results in models that achieve significantly better generalization and can outperform the accuracy of current federated learning approaches by over 24%.
arXiv Detail & Related papers (2022-11-03T15:48:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- The Conditional Entropy Bottleneck [8.797368310561058]
We characterize failures of robust generalization as failures of accuracy or related metrics on a held-out set.
We propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model.
In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB).
We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets.
arXiv Detail & Related papers (2020-02-13T07:46:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.