Internal Cross-layer Gradients for Extending Homogeneity to
Heterogeneity in Federated Learning
- URL: http://arxiv.org/abs/2308.11464v2
- Date: Mon, 26 Feb 2024 09:48:00 GMT
- Title: Internal Cross-layer Gradients for Extending Homogeneity to
Heterogeneity in Federated Learning
- Authors: Yun-Hin Chan, Rui Zhou, Running Zhao, Zhihan Jiang, Edith C.-H. Ngai
- Abstract summary: Federated learning (FL) inevitably confronts the challenge of system heterogeneity.
We propose a training scheme that can extend the capabilities of most model-homogeneous FL methods.
- Score: 11.29694276480432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) inevitably confronts the challenge of system
heterogeneity in practical scenarios. To enhance the capabilities of most
model-homogeneous FL methods in handling system heterogeneity, we propose a
training scheme that can extend their capabilities to cope with this challenge.
In this paper, we commence our study with a detailed exploration of homogeneous
and heterogeneous FL settings and discover three key observations: (1) a
positive correlation between client performance and layer similarities, (2)
higher similarities in the shallow layers in contrast to the deep layers, and
(3) smoother gradient distributions indicating higher layer
similarities. Building upon these observations, we propose InCo Aggregation
that leverages internal cross-layer gradients, a mixture of gradients from
shallow and deep layers within a server model, to augment the similarity in the
deep layers without requiring additional communication between clients.
Furthermore, our methods can be tailored to accommodate model-homogeneous FL
methods such as FedAvg, FedProx, FedNova, Scaffold, and MOON, to expand their
capabilities to handle system heterogeneity. Copious experimental results
validate the effectiveness of InCo Aggregation, spotlighting internal
cross-layer gradients as a promising avenue to enhance the performance in
heterogeneous FL.
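As an illustration of the idea described in the abstract, the sketch below blends the server-side gradient of each deep layer with the gradient of a paired shallow layer. The layer pairing, the mixing weight `alpha`, and the norm rescaling are our assumptions for illustration, not the paper's exact procedure.

```python
# A minimal sketch of internal cross-layer gradient mixing on the server.
# The (shallow, deep) layer pairing and `alpha` are illustrative assumptions.
from typing import Dict, List, Tuple
import torch

def mix_cross_layer_gradients(
    server_grads: Dict[str, torch.Tensor],        # aggregated gradients on the server, keyed by layer
    shallow_to_deep: List[Tuple[str, str]],       # hypothetical (shallow_layer, deep_layer) pairs
    alpha: float = 0.5,                           # hypothetical mixing weight
) -> Dict[str, torch.Tensor]:
    """Blend each deep layer's gradient with a paired shallow layer's gradient,
    aiming to raise deep-layer similarity without extra client communication."""
    mixed = dict(server_grads)
    for shallow_name, deep_name in shallow_to_deep:
        g_shallow = server_grads[shallow_name].flatten()
        g_deep = server_grads[deep_name].flatten()
        if g_shallow.numel() != g_deep.numel():
            continue  # skip mismatched shapes in this simplified sketch
        # Rescale the shallow gradient to the deep gradient's norm before blending,
        # so the magnitude of the deep layer's update is roughly preserved.
        scale = g_deep.norm() / (g_shallow.norm() + 1e-12)
        blended = alpha * scale * g_shallow + (1.0 - alpha) * g_deep
        mixed[deep_name] = blended.view_as(server_grads[deep_name])
    return mixed
```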
Related papers
- FedSKD: Aggregation-free Model-heterogeneous Federated Learning using Multi-dimensional Similarity Knowledge Distillation [7.944298319589845]
Federated learning (FL) enables privacy-preserving collaborative model training without direct data sharing.
Model-heterogeneous FL (MHFL) allows clients to train personalized models with heterogeneous architectures tailored to their computational resources and application-specific needs.
While peer-to-peer (P2P) FL removes server dependence, it suffers from model drift and knowledge dilution, limiting its effectiveness in heterogeneous settings.
We propose FedSKD, a novel MHFL framework that facilitates direct knowledge exchange through round-robin model circulation.
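A toy sketch of round-robin model circulation is given below; the `local_update` hook (standing in for FedSKD's similarity-based distillation) and the loop structure are placeholders, not the paper's actual interface.

```python
# Toy sketch: each client's model visits the other clients in turn and is
# updated on their local data, so knowledge spreads peer-to-peer without a
# central aggregation step. `local_update` is an abstract placeholder.
from typing import Callable, List

def circulate_round_robin(
    client_models: List[object],                  # heterogeneous models, one per client
    local_update: Callable[[object, int], None],  # updates a visiting model on client `cid`'s data
) -> None:
    n = len(client_models)
    for shift in range(1, n):          # after n-1 shifts every model has visited every client
        for cid in range(n):
            visiting_model = client_models[(cid + shift) % n]
            local_update(visiting_model, cid)
```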
arXiv Detail & Related papers (2025-03-23T05:33:10Z) - A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs [52.55025869932486]
This paper introduces a promising alternative method for training Generative Adversarial Networks (GANs) on large-scale datasets with clear theoretical guarantees.
We propose a novel Lipschitz-constrained Functional Gradient GANs learning (Li-CFG) method to stabilize the training of GANs.
We demonstrate that the neighborhood size of the latent vector can be reduced by increasing the norm of the discriminator gradient.
arXiv Detail & Related papers (2025-01-20T02:48:07Z) - Towards Layer-Wise Personalized Federated Learning: Adaptive Layer Disentanglement via Conflicting Gradients [11.269920973751244]
In personalized Federated Learning (pFL), high data heterogeneity can cause significant gradient divergence across devices.
We introduce a new approach to pFL design, namely Federated Learning with Layer-wise Aggregation via Gradient Analysis (FedLAG)
FedLAG assigns layers for personalization based on the extent of layer-wise gradient conflicts.
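A rough sketch of how layers might be flagged for personalization by gradient conflict is shown below; the conflict measure (fraction of negative pairwise inner products) and the threshold are our assumptions rather than FedLAG's exact criterion.

```python
# Sketch: layers whose client gradients mostly point in conflicting directions
# (negative inner product, i.e. cosine similarity < 0) are kept personalized.
from itertools import combinations
from typing import Dict, List
import torch

def conflicting_layers(
    client_grads: List[Dict[str, torch.Tensor]],  # per-client gradients, keyed by layer
    conflict_threshold: float = 0.5,              # hypothetical cutoff
) -> List[str]:
    """Return layer names dominated by conflicting client gradients; such layers
    would be kept local (personalized) rather than aggregated."""
    personalized = []
    for layer in client_grads[0]:
        vecs = [g[layer].flatten() for g in client_grads]
        pairs = list(combinations(range(len(vecs)), 2))
        conflicts = sum(1 for i, j in pairs if torch.dot(vecs[i], vecs[j]) < 0)
        if pairs and conflicts / len(pairs) > conflict_threshold:
            personalized.append(layer)
    return personalized
```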
arXiv Detail & Related papers (2024-10-03T14:46:19Z) - On Layer-wise Representation Similarity: Application for Multi-Exit Models with a Single Classifier [20.17288970927518]
We study the similarity of representations between the hidden layers of individual transformers.
We propose an aligned training approach to enhance the similarity between internal representations.
arXiv Detail & Related papers (2024-06-20T16:41:09Z) - Learning in PINNs: Phase transition, total diffusion, and generalization [1.8802875123957965]
We investigate the learning dynamics of fully-connected neural networks through the lens of gradient signal-to-noise ratio (SNR)
We identify a third phase termed "total diffusion".
We explore the information-induced compression phenomenon, pinpointing a significant compression of activations at the total diffusion phase.
arXiv Detail & Related papers (2024-03-27T12:10:30Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Tackling the Non-IID Issue in Heterogeneous Federated Learning by
Gradient Harmonization [11.484136481586381]
Federated learning (FL) is a privacy-preserving paradigm for collaboratively training a global model from decentralized clients.
In this work, we revisit this key challenge through the lens of gradient conflicts on the server side.
We propose FedGH, a simple yet effective method that mitigates local drifts through Gradient Harmonization.
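The sketch below shows one common way to harmonize conflicting gradients, a PCGrad-style projection followed by averaging; whether FedGH uses exactly this rule is not stated in the summary, so treat it as an assumption.

```python
# Sketch: project each client gradient away from components that conflict with
# other clients' gradients, then average the harmonized gradients on the server.
from typing import List
import torch

def harmonize(client_grads: List[torch.Tensor]) -> torch.Tensor:
    """client_grads: flattened 1-D gradients, one per client.
    Returns the mean of the conflict-projected gradients."""
    harmonized = []
    for i, g_i in enumerate(client_grads):
        g = g_i.clone()
        for j, g_j in enumerate(client_grads):
            if i == j:
                continue
            dot = torch.dot(g, g_j)
            if dot < 0:  # conflicting directions: remove the conflicting component
                g = g - dot / (g_j.norm() ** 2 + 1e-12) * g_j
        harmonized.append(g)
    return torch.stack(harmonized).mean(dim=0)
```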
arXiv Detail & Related papers (2023-09-13T03:27:21Z) - GIFD: A Generative Gradient Inversion Method with Feature Domain
Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z) - Deep Equilibrium Models Meet Federated Learning [71.57324258813675]
This study explores the problem of Federated Learning (FL) by utilizing the Deep Equilibrium (DEQ) models instead of conventional deep learning networks.
We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL.
To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning.
arXiv Detail & Related papers (2023-05-29T22:51:40Z) - WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
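An illustrative within-layer diversity penalty is sketched below; the specific data-dependent regularizer used by WLD-Reg may differ, and the function name is ours.

```python
# Sketch: discourage units in the same layer from producing highly similar
# activation patterns on a mini-batch; add the penalty (scaled) to the task loss.
import torch

def within_layer_diversity_penalty(activations: torch.Tensor) -> torch.Tensor:
    """activations: (batch, units) output of one layer on a mini-batch.
    Returns the mean squared off-diagonal cosine similarity between units."""
    # Normalize each unit's activation pattern across the batch.
    unit_patterns = torch.nn.functional.normalize(activations.t(), dim=1)  # (units, batch)
    sim = unit_patterns @ unit_patterns.t()                                # (units, units)
    off_diag = sim - torch.diag(torch.diag(sim))
    return (off_diag ** 2).mean()
```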
arXiv Detail & Related papers (2023-01-03T20:57:22Z) - GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot
Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
It is a promising solution to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z) - Efficient Split-Mix Federated Learning for On-Demand and In-Situ
Customization [107.72786199113183]
Federated learning (FL) provides a distributed learning framework for multiple participants to collaborate learning without sharing raw data.
In this paper, we propose a novel Split-Mix FL strategy for heterogeneous participants that, once training is done, provides in-situ customization of model sizes and robustness.
arXiv Detail & Related papers (2022-03-18T04:58:34Z) - Exploring Heterogeneous Characteristics of Layers in ASR Models for More
Efficient Training [1.3999481573773072]
We study the stability of these layers across runs and model sizes.
We propose that group normalization may be used without disrupting their formation.
We apply these findings to Federated Learning in order to improve the training procedure.
arXiv Detail & Related papers (2021-10-08T17:25:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.