Data Assetization via Resources-decoupled Federated Learning
- URL: http://arxiv.org/abs/2501.14588v2
- Date: Tue, 11 Feb 2025 09:03:49 GMT
- Title: Data Assetization via Resources-decoupled Federated Learning
- Authors: Jianzhe Zhao, Feida Zhu, Lingyan He, Zixin Tang, Mingce Gao, Shiyu Yang, Guibing Guo,
- Abstract summary: Federated learning (FL) provides an effective approach to collaborative training models while preserving privacy.
We first propose a framework for resource-decoupled FL involving three parties.
Next, we propose the Quality-aware Dynamic Resources-decoupled FL algorithm (QD-RDFL)
- Score: 7.347554648348435
- License:
- Abstract: With the development of the digital economy, data is increasingly recognized as an essential resource for both work and life. However, due to privacy concerns, data owners tend to maximize the value of data through the circulation of information rather than direct data transfer. Federated learning (FL) provides an effective approach to collaborative training models while preserving privacy. However, as model parameters and training data grow, there are not only real differences in data resources between different data owners, but also mismatches between data and computing resources. These challenges lead to inadequate collaboration among data owners, compute centers, and model owners, reducing the global utility of the three parties and the effectiveness of data assetization. In this work, we first propose a framework for resource-decoupled FL involving three parties. Then, we design a Tripartite Stackelberg Model and theoretically analyze the Stackelberg-Nash equilibrium (SNE) for participants to optimize global utility. Next, we propose the Quality-aware Dynamic Resources-decoupled FL algorithm (QD-RDFL), in which we derive and solve the optimal strategies of all parties to achieve SNE using backward induction. We also design a dynamic optimization mechanism to improve the optimal strategy profile by evaluating the contribution of data quality from data owners to the global model during real training. Finally, our extensive experiments demonstrate that our method effectively encourages the linkage of the three parties involved, maximizing the global utility and value of data assets.
Related papers
- FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning [12.307490659840845]
Federated Learning (FL) combines locally optimized models from various clients into a unified global model.
FL encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model.
We introduce an innovative dual-strategy approach designed to effectively resolve these issues.
arXiv Detail & Related papers (2024-12-05T18:42:29Z) - Addressing Heterogeneity in Federated Learning: Challenges and Solutions for a Shared Production Environment [1.2499537119440245]
Federated learning (FL) has emerged as a promising approach to training machine learning models across decentralized data sources.
This paper provides a comprehensive overview of data heterogeneity in FL within the context of manufacturing.
We discuss the impact of these types of heterogeneity on model training and review current methodologies for mitigating their adverse effects.
arXiv Detail & Related papers (2024-08-18T17:49:44Z) - Curating Grounded Synthetic Data with Global Perspectives for Equitable AI [0.5120567378386615]
We introduce a novel approach to creating synthetic datasets, grounded in real-world diversity and enriched through strategic diversification.
We synthesize data using a comprehensive collection of news articles spanning 12 languages and originating from 125 countries, to ensure a breadth of linguistic and cultural representations.
Preliminary results demonstrate substantial improvements in performance on traditional NER benchmarks, by up to 7.3%.
arXiv Detail & Related papers (2024-06-10T17:59:11Z) - FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z) - Balancing Similarity and Complementarity for Federated Learning [91.65503655796603]
Federated Learning (FL) is increasingly important in mobile and IoT systems.
One key challenge in FL is managing statistical heterogeneity, such as non-i.i.d. data.
We introduce a novel framework, textttFedSaC, which balances similarity and complementarity in FL cooperation.
arXiv Detail & Related papers (2024-05-16T08:16:19Z) - Enhancing Data Quality in Federated Fine-Tuning of Foundation Models [54.757324343062734]
We propose a data quality control pipeline for federated fine-tuning of foundation models.
This pipeline computes scores reflecting the quality of training data and determines a global threshold for a unified standard.
Our experiments show that the proposed quality control pipeline facilitates the effectiveness and reliability of the model training, leading to better performance.
arXiv Detail & Related papers (2024-03-07T14:28:04Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Filling the Missing: Exploring Generative AI for Enhanced Federated
Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data.
Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z) - Evaluating and Incentivizing Diverse Data Contributions in Collaborative
Learning [89.21177894013225]
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset.
We show that the statistical criterion used to quantify the diversity of the data, as well as the choice of the federated learning algorithm used, has a significant effect on the resulting equilibrium.
We leverage this to design simple optimal federated learning mechanisms that encourage data collectors to contribute data representative of the global population.
arXiv Detail & Related papers (2023-06-08T23:38:25Z) - Quality Not Quantity: On the Interaction between Dataset Design and
Robustness of CLIP [43.7219097444333]
We introduce a testbed of six publicly available data sources to investigate how pre-training distributions induce robustness in CLIP.
We find that the performance of the pre-training data varies substantially across distribution shifts.
We find that combining multiple sources does not necessarily yield better models, but rather dilutes the robustness of the best individual data source.
arXiv Detail & Related papers (2022-08-10T18:24:23Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.