Towards Optimal Customized Architecture for Heterogeneous Federated
Learning with Contrastive Cloud-Edge Model Decoupling
- URL: http://arxiv.org/abs/2403.02360v1
- Date: Mon, 4 Mar 2024 05:10:28 GMT
- Title: Towards Optimal Customized Architecture for Heterogeneous Federated
Learning with Contrastive Cloud-Edge Model Decoupling
- Authors: Xingyan Chen and Tian Du and Mu Wang and Tiancheng Gu and Yu Zhao and
Gang Kou and Changqiao Xu and Dapeng Oliver Wu
- Abstract summary: Federated learning, as a promising distributed learning paradigm, enables collaborative training of a global model across multiple network edge clients without the need for central data collection.
We propose a novel federated learning framework called FedCMD, a model-decoupling approach tailored to cloud-edge supported federated learning.
Our motivation is that, through an in-depth investigation of the performance of selecting different neural network layers as the personalized head, we found that rigidly assigning the last layer as the personalized head, as current studies do, is not always optimal.
- Score: 20.593232086762665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning, as a promising distributed learning paradigm, enables
collaborative training of a global model across multiple network edge clients
without the need for central data collection. However, the heterogeneity of
edge data distributions drags the model towards local minima, which can be
distant from the global optimum. Such heterogeneity often leads to slow
convergence and substantial communication overhead. To address these issues, we
propose a novel federated learning framework called FedCMD, a model-decoupling
approach tailored to cloud-edge supported federated learning that separates
deep neural networks into a body for capturing shared representations in the
cloud and a personalized head for mitigating data heterogeneity. Our motivation
is that, through an in-depth investigation of the performance of selecting
different neural network layers as the personalized head, we found that rigidly
assigning the last layer as the personalized head, as current studies do, is
not always optimal. Instead, it is
necessary to dynamically select the personalized layer that maximizes the
training performance by taking the representation difference between neighboring
layers into account. To find the optimal personalized layer, we utilize the
low-dimensional representation of each layer to contrast shifts in the feature
distribution and introduce a Wasserstein-based layer selection method aimed at
identifying the best-match layer for personalization. Additionally, we propose
a weighted global aggregation algorithm based on the selected personalized
layer for the practical deployment of FedCMD. Extensive experiments on ten
benchmarks demonstrate the efficiency and superior performance of our solution
compared with nine state-of-the-art solutions. All code and results are
available at https://github.com/elegy112138/FedCMD.
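The heart of FedCMD is this dynamic layer selection. As a rough, hypothetical sketch of the idea (not the authors' released implementation at the repository above), the snippet below treats each layer's flattened activations as a 1-D empirical distribution, scores neighboring layers by their Wasserstein distance, and picks the layer just after the largest shift; the flattening step and the exact scoring rule are assumptions:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def select_personalized_layer(layer_activations):
    """layer_activations: one array of activations per layer (any shape)."""
    # Treat each layer's flattened activations as a 1-D empirical distribution.
    flat = [a.ravel() for a in layer_activations]
    # Wasserstein distance between each pair of neighboring layers.
    shifts = [wasserstein_distance(flat[i], flat[i + 1])
              for i in range(len(flat) - 1)]
    # Pick the layer sitting just after the largest distribution shift.
    return int(np.argmax(shifts)) + 1

# Toy usage with synthetic activations for a 5-layer network.
rng = np.random.default_rng(0)
acts = [rng.normal(scale=s, size=256) for s in (1.0, 1.1, 1.2, 3.0, 3.1)]
print(select_personalized_layer(acts))  # -> 3: largest shift is between layers 2 and 3
```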
Related papers
- Exploring Selective Layer Fine-Tuning in Federated Learning [48.470385357429215]
Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data.
We study selective layer fine-tuning in FL, emphasizing a flexible approach that allows the clients to adjust their selected layers according to their local data and resources.
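To make "selective layer fine-tuning" concrete, here is a minimal hypothetical sketch of the server side: each client uploads only the layers it chose to train, and the server averages each layer over the clients that actually updated it. The function and the dict-based model format are illustrative, not the paper's API:

```python
from collections import defaultdict
import numpy as np

def aggregate_selected(global_model, client_updates):
    """global_model: {layer_name: np.ndarray};
    client_updates: list of dicts holding only the layers each client trained."""
    sums, counts = defaultdict(float), defaultdict(int)
    for update in client_updates:
        for name, weights in update.items():
            sums[name] = sums[name] + weights
            counts[name] += 1
    merged = dict(global_model)
    for name in sums:               # layers nobody selected keep the global weights
        merged[name] = sums[name] / counts[name]
    return merged

g = {"backbone": np.zeros(3), "head": np.zeros(3)}
updates = [{"head": np.ones(3)}, {"head": 3 * np.ones(3), "backbone": np.ones(3)}]
print(aggregate_selected(g, updates))  # head averaged over 2 clients, backbone over 1
```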
arXiv Detail & Related papers (2024-08-28T07:48:39Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive way to combine the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
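A minimal sketch of what layer-wise averaging means here, assuming two models with identical architectures stored as name-to-array dicts (the representation is illustrative):

```python
import numpy as np

def average_layers(model_a, model_b, layer_names, alpha=0.5):
    """Interpolate only the named layers of two same-architecture models;
    every other layer is kept from model_a."""
    mixed = dict(model_a)
    for name in layer_names:
        mixed[name] = (1 - alpha) * model_a[name] + alpha * model_b[name]
    return mixed

a = {"conv1": np.zeros(2), "fc": np.zeros(2)}
b = {"conv1": np.ones(2), "fc": 2 * np.ones(2)}
print(average_layers(a, b, ["fc"]))  # fc averaged, conv1 untouched
```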
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs [2.752817022620644]
Clustering (or community detection) on multilayer graphs poses several additional complications.
One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment.
We propose a parameter-free Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels.
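For intuition only: the sketch below combines per-layer graph Laplacians with given mixing weights. The actual model learns a nonlinear, parameter-free combination from the input labels; this linear mix merely illustrates the objects involved:

```python
import numpy as np

def combined_laplacian(adjacencies, weights):
    """adjacencies: one (n, n) adjacency matrix per graph layer;
    weights: mixing coefficients (learned in the paper, fixed here)."""
    laplacians = [np.diag(A.sum(axis=1)) - A for A in adjacencies]
    return sum(w * L for w, L in zip(weights, laplacians))

A1 = np.array([[0., 1.], [1., 0.]])   # layer 1: the two nodes are linked
A2 = np.zeros((2, 2))                 # layer 2: carries no information
print(combined_laplacian([A1, A2], [0.9, 0.1]))
```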
arXiv Detail & Related papers (2023-05-31T19:50:11Z)
- WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and jointly trained with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
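As a simplified stand-in for the proposed within-layer feedback (the exact WLD-Reg term is data-dependent and differs from this), one could penalize pairwise cosine similarity between units of the same layer:

```python
import numpy as np

def within_layer_diversity_penalty(activations):
    """activations: (batch, units) activations of a single layer.
    Returns a penalty that grows when units behave redundantly."""
    a = activations / (np.linalg.norm(activations, axis=0, keepdims=True) + 1e-8)
    sim = a.T @ a                            # (units, units) cosine similarities
    off_diag = sim - np.diag(np.diag(sim))   # ignore each unit's self-similarity
    return float(np.square(off_diag).mean())

acts = np.random.default_rng(1).normal(size=(32, 8))
print(within_layer_diversity_penalty(acts))                          # small: diverse units
print(within_layer_diversity_penalty(np.tile(acts[:, :1], (1, 8))))  # large: redundant units
```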
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
- Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment [17.735593218773758]
We propose a novel model fusion framework, named CLAFusion, to fuse neural networks with a different number of layers.
Based on the cross-layer alignment, our framework balances the number of layers of neural networks before applying layer-wise model fusion.
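A rough sketch of the balancing step, under the simplifying assumptions that matched positions are shape-compatible and that alignment is a monotone proportional-depth matching rather than the paper's learned cross-layer alignment:

```python
import numpy as np

def align_and_fuse(deep, shallow, alpha=0.5):
    """deep, shallow: lists of weight arrays with len(deep) >= len(shallow).
    Each shallow layer is matched to a deep layer at a proportional depth
    and averaged in; unmatched deep layers are kept as-is."""
    fused = [w.copy() for w in deep]
    for j, w in enumerate(shallow):
        i = round(j * (len(deep) - 1) / max(len(shallow) - 1, 1))
        fused[i] = (1 - alpha) * fused[i] + alpha * w
    return fused

deep = [np.full(2, float(k)) for k in range(4)]   # 4-layer model
shallow = [np.full(2, 10.0 * k) for k in range(2)]  # 2-layer model
print(align_and_fuse(deep, shallow))  # layers 0 and 3 fused, 1 and 2 kept
```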
arXiv Detail & Related papers (2021-10-29T05:02:23Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
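A loose sketch of one plausible quasi-global momentum update: the worker infers a global descent direction from how gossip averaging moved its model, then folds that direction into a momentum buffer. Constants and update order are assumptions, not the paper's exact recipe:

```python
import numpy as np

def qg_momentum_step(x_prev, x_avg, grad, m, lr=0.1, beta=0.9):
    """x_prev: the worker's model before this round; x_avg: its model after
    gossip averaging with neighbors; grad: local mini-batch gradient;
    m: the quasi-global momentum buffer."""
    d_global = (x_prev - x_avg) / lr        # direction implied by consensus drift
    m = beta * m + (1.0 - beta) * d_global  # smooth it into the momentum buffer
    x_next = x_avg - lr * (grad + m)        # descend along gradient + momentum
    return x_next, m

x, m = np.ones(4), np.zeros(4)
x, m = qg_momentum_step(x_prev=x, x_avg=0.9 * x, grad=0.1 * np.ones(4), m=m)
print(x, m)
```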
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- Analysis and Optimal Edge Assignment For Hierarchical Federated Learning on Non-IID Data [43.32085029569374]
Federated learning algorithms aim to leverage distributed and diverse data stored at users' devices to learn global phenomena.
In cases where the participants' data are strongly skewed (i.e., non-IID), the local models can overfit the local data, leading to a low-performing global model.
We propose a hierarchical learning system that performs Federated Gradient Descent on the user-edge layer and Federated Averaging on the edge-cloud layer.
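A toy sketch of one communication round in such a two-tier system, with edges averaging raw client gradients (Federated Gradient Descent) and the cloud averaging edge models (Federated Averaging); the client grouping and learning rate are illustrative:

```python
import numpy as np

def edge_round(edge_model, client_grads, lr=0.1):
    """Federated Gradient Descent at an edge: average raw client gradients."""
    return edge_model - lr * np.mean(client_grads, axis=0)

def cloud_round(edge_models):
    """Federated Averaging at the cloud: average the edge models."""
    return np.mean(edge_models, axis=0)

# One full user-edge-cloud round with two edges, two clients each.
model = np.zeros(4)
per_edge_grads = [[np.ones(4), 3 * np.ones(4)], [2 * np.ones(4), 2 * np.ones(4)]]
edges = [edge_round(model, g) for g in per_edge_grads]
model = cloud_round(edges)
print(model)
```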
arXiv Detail & Related papers (2020-12-10T12:18:13Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of the evaluated tasks, across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
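One way to read the side-branch objective, sketched hypothetically below: each auxiliary branch minimizes its own task loss plus a KL term that mimics the final head's predictions. The loss weight lam is an assumed knob, not the paper's setting:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def branch_loss(branch_logits, final_logits, labels, lam=0.5):
    """Cross-entropy on the side branch plus a KL term pulling the branch's
    predictions towards those of the network's final head."""
    p_b, p_f = softmax(branch_logits), softmax(final_logits)
    ce = -np.log(p_b[np.arange(len(labels)), labels] + 1e-12).mean()
    kl = (p_f * (np.log(p_f + 1e-12) - np.log(p_b + 1e-12))).sum(-1).mean()
    return ce + lam * kl

rng = np.random.default_rng(0)
print(branch_loss(rng.normal(size=(4, 10)), rng.normal(size=(4, 10)),
                  np.array([0, 1, 2, 3])))
```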
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Neural Inheritance Relation Guided One-Shot Layer Assignment Search [44.82474044430184]
We investigate the impact of different layer assignments to the network performance by building an architecture dataset of layer assignment on CIFAR-100.
We find a neural inheritance relation among the networks with different layer assignments, that is, the optimal layer assignments for deeper networks always inherit from those for shallow networks.
Inspired by this neural inheritance relation, we propose an efficient one-shot layer assignment search approach via inherited sampling.
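A toy rendering of inherited sampling: grow the total depth one layer at a time and only sample assignments that extend the best assignment found at the previous depth. The evaluate callback is a hypothetical stand-in for the paper's one-shot supernet evaluation:

```python
def inherited_search(num_stages, max_depth, evaluate):
    """Search a per-stage layer assignment by extending the previous
    depth's best assignment, rather than searching each depth from scratch."""
    best = (1,) * num_stages                       # shallowest assignment
    for _ in range(sum(best), max_depth):
        children = [tuple(b + (i == s) for i, b in enumerate(best))
                    for s in range(num_stages)]    # add one layer to one stage
        best = max(children, key=evaluate)
    return best

# Synthetic score standing in for one-shot evaluation: rewards later stages.
print(inherited_search(3, 9, evaluate=lambda a: a[0] + 2 * a[1] + 3 * a[2]))
# -> (1, 1, 7): depth accumulates where the (toy) score rewards it
```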
arXiv Detail & Related papers (2020-02-28T07:40:48Z)