Guided Model Merging for Hybrid Data Learning: Leveraging Centralized Data to Refine Decentralized Models
- URL: http://arxiv.org/abs/2503.20138v2
- Date: Thu, 30 Oct 2025 17:04:50 GMT
- Title: Guided Model Merging for Hybrid Data Learning: Leveraging Centralized Data to Refine Decentralized Models
- Authors: Junyi Zhu, Ruicong Yao, Taha Ceritli, Savas Ozkan, Matthew B. Blaschko, Eunchung Noh, Jeongwon Min, Cho Jung Min, Mete Ozay
- Abstract summary: Current network training paradigms primarily focus on either centralized or decentralized data regimes. We propose a novel framework that constructs a model atlas from decentralized models and leverages centralized data. Our method synergizes federated learning (to exploit decentralized data) and model merging (to utilize centralized data), enabling effective training under hybrid data availability.
- Score: 29.605620036963924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current network training paradigms primarily focus on either centralized or decentralized data regimes. However, in practice, data availability often exhibits a hybrid nature, where both regimes coexist. This hybrid setting presents new opportunities for model training, as the two regimes offer complementary trade-offs: decentralized data is abundant but subject to heterogeneity and communication constraints, while centralized data, though limited in volume and potentially unrepresentative, enables better curation and high-throughput access. Despite its potential, effectively combining these paradigms remains challenging, and few frameworks are tailored to hybrid data regimes. To address this, we propose a novel framework that constructs a model atlas from decentralized models and leverages centralized data to refine a global model within this structured space. The refined model is then used to reinitialize the decentralized models. Our method synergizes federated learning (to exploit decentralized data) and model merging (to utilize centralized data), enabling effective training under hybrid data availability. Theoretically, we show that our approach achieves faster convergence than methods relying solely on decentralized data, due to variance reduction in the merging process. Extensive experiments demonstrate that our framework consistently outperforms purely centralized, purely decentralized, and existing hybrid-adaptable methods. Notably, our method remains robust even when the centralized and decentralized data domains differ or when decentralized data contains noise, significantly broadening its applicability.
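The abstract describes an alternating loop: federated-style local updates on decentralized data, construction of a model atlas from the resulting client models, refinement of a global model within that atlas using the centralized data, and reinitialization of the clients with the refined model. The sketch below illustrates that loop under simplifying assumptions; the atlas construction and refinement shown here (loss-weighted merging of flattened client weight vectors) are illustrative stand-ins rather than the authors' actual procedure, and `local_update`, `refine_in_atlas`, and `centralized_loss` are hypothetical names.

```python
import numpy as np

def local_update(w, client_data, lr=0.1, steps=10):
    """Stand-in for a client's local training on its decentralized data."""
    g = np.random.randn(*w.shape) * 0.01  # placeholder gradient, not real training
    return w - lr * steps * g

def refine_in_atlas(client_models, centralized_loss, temperature=1.0):
    """Merge the client models (the 'atlas') with weights derived from their
    loss on the centralized data -- a simple heuristic stand-in for the
    paper's refinement step."""
    W = np.stack(client_models)                         # (num_clients, dim)
    losses = np.array([centralized_loss(w) for w in W])
    alpha = np.exp(-losses / temperature)               # lower loss -> larger weight
    alpha /= alpha.sum()
    return alpha @ W                                     # refined global model

def hybrid_round(global_w, client_datasets, centralized_loss):
    """One round: decentralized local updates, merge/refine on centralized
    data, then the refined model reinitializes every client next round."""
    client_models = [local_update(global_w, d) for d in client_datasets]
    return refine_in_atlas(client_models, centralized_loss)
```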
Related papers
- DFCA: Decentralized Federated Clustering Algorithm [5.898448236416388]
We introduce DFCA, a fully decentralized clustered FL algorithm that enables clients to collaboratively train cluster-specific models without central coordination. DFCA uses a sequential running average to aggregate models from neighbors as updates arrive, providing a communication-efficient alternative to batch aggregation. Our experiments on various datasets demonstrate that DFCA outperforms other decentralized algorithms and performs comparably to centralized IFCA.
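The sequential running average mentioned above can be maintained incrementally as neighbor models arrive, instead of buffering them for a batch update; a minimal sketch of that aggregation step (not DFCA's full clustered protocol) follows.

```python
import numpy as np

class RunningModelAverage:
    """Incrementally average incoming neighbor models (flattened weights)."""
    def __init__(self, dim):
        self.avg = np.zeros(dim)
        self.count = 0

    def update(self, neighbor_weights):
        self.count += 1
        # Standard running-mean update: avg += (x - avg) / n
        self.avg += (neighbor_weights - self.avg) / self.count
        return self.avg
```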
arXiv Detail & Related papers (2025-10-17T04:17:00Z) - Efficient Federated Learning with Timely Update Dissemination [54.668309196009204]
Federated Learning (FL) has emerged as a compelling methodology for the management of distributed data. We propose an efficient FL approach that capitalizes on additional downlink bandwidth resources to ensure timely update dissemination.
arXiv Detail & Related papers (2025-07-08T14:34:32Z) - Decentralized Diffusion Models [53.89995588977048]
Large-scale AI model training divides work across thousands of GPUs, then synchronizes gradients across them at each step. This incurs a significant network burden that only centralized, monolithic clusters can support. We propose Decentralized Diffusion Models, a scalable framework for distributing diffusion model training across independent clusters.
arXiv Detail & Related papers (2025-01-09T18:59:56Z) - Protocol Learning, Decentralized Frontier Risk and the No-Off Problem [56.74434512241989]
We identify a third paradigm - Protocol Learning - where models are trained across decentralized networks of incentivized participants. This approach has the potential to aggregate orders of magnitude more computational resources than any single centralized entity. It also introduces novel challenges: heterogeneous and unreliable nodes, malicious participants, the need for unextractable models to preserve incentives, and complex governance dynamics.
arXiv Detail & Related papers (2024-12-10T19:53:50Z) - FedSPD: A Soft-clustering Approach for Personalized Decentralized Federated Learning [23.140777064095833]
Federated learning is a framework for distributed clients to collaboratively train a machine learning model using local data. We propose FedSPD, an efficient personalized federated learning algorithm for the decentralized setting. We show that FedSPD learns accurate models even in low-connectivity networks.
arXiv Detail & Related papers (2024-10-24T15:48:34Z) - Federated Clustering: An Unsupervised Cluster-Wise Training for Decentralized Data Distributions [1.6385815610837167]
Federated Cluster-Wise Refinement (FedCRef) involves clients that collaboratively train models on clusters with similar data distributions.
In these groups, clients collaboratively train a shared model representing each data distribution, while continuously refining their local clusters to enhance data association accuracy.
This iterative process allows our system to identify all potential data distributions across the network and develop robust representation models for each.
arXiv Detail & Related papers (2024-08-20T09:05:44Z) - Vanishing Variance Problem in Fully Decentralized Neural-Network Systems [0.8212195887472242]
Federated learning and gossip learning are emerging methodologies designed to mitigate data privacy concerns.
Our research introduces a variance-corrected model averaging algorithm.
Our simulation results demonstrate that our approach enables gossip learning to achieve convergence efficiency comparable to that of federated learning.
arXiv Detail & Related papers (2024-04-06T12:49:20Z) - Towards More Suitable Personalization in Federated Learning via
Decentralized Partial Model Training [67.67045085186797]
Almost all existing systems have to face large communication burdens if the central FL server fails.
It personalizes the "right" components in the deep models by alternately updating the shared and personal parameters.
To further promote the shared parameters aggregation process, we propose DFed, which integrates local Sharpness Minimization.
arXiv Detail & Related papers (2023-05-24T13:52:18Z) - Outsourcing Training without Uploading Data via Efficient Collaborative
Open-Source Sampling [49.87637449243698]
Traditional outsourcing requires uploading device data to the cloud server.
We propose to leverage widely available open-source data, which is a massive dataset collected from public and heterogeneous sources.
We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training.
arXiv Detail & Related papers (2022-10-23T00:12:18Z) - Decentralized Training of Foundation Models in Heterogeneous
Environments [77.47261769795992]
Training foundation models, such as GPT-3 and PaLM, can be extremely expensive.
We present the first study of training large foundation models with model parallelism in a decentralized regime over a heterogeneous network.
arXiv Detail & Related papers (2022-06-02T20:19:51Z) - Robust and Efficient Aggregation for Distributed Learning [37.203175053625245]
Distributed learning schemes based on averaging are known to be susceptible to outliers.
A single malicious agent is able to drive an averaging-based distributed learning algorithm to an arbitrarily poor model.
This has motivated the development of robust aggregation schemes, which are based on variations of the median and trimmed mean.
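The median- and trimmed-mean-style aggregators referred to above can be computed coordinate-wise over the client updates; a minimal sketch of both (generic robust aggregation, not this paper's specific scheme) is shown below.

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median over client updates, shape (num_clients, dim)."""
    return np.median(np.stack(updates), axis=0)

def trimmed_mean(updates, trim_ratio=0.1):
    """Drop the largest and smallest trim_ratio fraction of values per
    coordinate, then average what remains."""
    W = np.sort(np.stack(updates), axis=0)       # sort each coordinate across clients
    k = int(len(updates) * trim_ratio)
    kept = W[k:len(updates) - k] if k > 0 else W
    return kept.mean(axis=0)
```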
arXiv Detail & Related papers (2022-04-01T17:17:41Z) - Asynchronous Parallel Incremental Block-Coordinate Descent for
Decentralized Machine Learning [55.198301429316125]
Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing.
For fast-increasing applications and data amounts, distributed learning is a promising emerging paradigm since it is often impractical or inefficient to share/aggregate data.
This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices.
arXiv Detail & Related papers (2022-02-07T15:04:15Z) - Data augmentation through multivariate scenario forecasting in Data
Centers using Generative Adversarial Networks [0.18416014644193063]
The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that we need massive amounts of data to feed the algorithms.
This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center.
Our research will help to optimize the energy consumed in Data Centers, although the proposed methodology can be employed in any similar time-series-like problem.
arXiv Detail & Related papers (2022-01-12T15:09:10Z) - Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z) - Decentralized federated learning of deep neural networks on non-iid data [0.6335848702857039]
We tackle the non-IID problem of learning a personalized deep learning model in a decentralized setting.
We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data detect each other and cooperate.
PENS is able to achieve higher accuracies as compared to strong baselines.
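Performance-based neighbor selection can be approximated by scoring each received neighbor model on the local data and keeping the best-scoring peers; the sketch below is a hedged illustration of that idea, and the specific selection rule is an assumption rather than PENS's exact criterion.

```python
def select_neighbors(neighbor_models, local_eval, top_k=3):
    """Score each neighbor's model on local data and keep the top_k peers.

    neighbor_models: dict mapping peer id -> model
    local_eval: callable returning the accuracy of a model on local data
    """
    scores = {peer: local_eval(model) for peer, model in neighbor_models.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]
```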
arXiv Detail & Related papers (2021-07-18T19:05:44Z) - Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.
We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks.
We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z) - Consensus Control for Decentralized Deep Learning [72.50487751271069]
Decentralized training of deep learning models enables on-device learning over networks, as well as efficient scaling to large compute clusters.
We show in theory that when the training consensus distance is lower than a critical quantity, decentralized training converges as fast as the centralized counterpart.
Our empirical insights allow the principled design of better decentralized training schemes that mitigate the performance drop.
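The consensus distance used in this line of analysis is commonly measured as the average deviation of the client models from their mean; a minimal sketch of that measurement (monitoring logic only, not the paper's full control scheme) follows.

```python
import numpy as np

def consensus_distance(client_models):
    """Average L2 distance of each client's (flattened) model from the mean model."""
    W = np.stack(client_models)        # (num_clients, dim)
    mean = W.mean(axis=0)
    return np.linalg.norm(W - mean, axis=1).mean()
```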
arXiv Detail & Related papers (2021-02-09T13:58:33Z) - Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z) - Decentralized Federated Learning via Mutual Knowledge Transfer [37.5341683644709]
We study decentralized federated learning (DFL) in Internet of Things (IoT) systems.
We propose a mutual knowledge transfer (Def-KT) algorithm where local clients fuse models by transferring their learnt knowledge to each other.
Our experiments on the MNIST, Fashion-MNIST, and CIFAR10 datasets reveal that the proposed Def-KT algorithm significantly outperforms the baseline DFL methods.
arXiv Detail & Related papers (2020-12-24T01:43:53Z) - Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
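One way to read the multi-center idea is as a k-means-style round over client models: assign each client to its nearest center model, then update each center as the mean of its assigned clients. The sketch below is that heuristic reading under simplifying assumptions; the paper's actual objective and matching procedure may differ.

```python
import numpy as np

def multi_center_round(client_models, centers):
    """Assign each client model to its nearest center, then recompute each
    center as the mean of its assigned clients (k-means-style heuristic)."""
    W = np.stack(client_models)                        # (num_clients, dim)
    C = np.stack(centers)                              # (num_centers, dim)
    dists = np.linalg.norm(W[:, None, :] - C[None, :, :], axis=2)
    assign = dists.argmin(axis=1)                      # client -> center index
    new_centers = []
    for k in range(len(centers)):
        members = W[assign == k]
        new_centers.append(members.mean(axis=0) if len(members) else C[k])
    return new_centers, assign
```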
arXiv Detail & Related papers (2020-05-03T09:14:31Z)