Related papers: Data Assetization via Resources-decoupled Federated Learning

Data Assetization via Resources-decoupled Federated Learning

URL: http://arxiv.org/abs/2501.14588v2
Date: Tue, 11 Feb 2025 09:03:49 GMT
Title: Data Assetization via Resources-decoupled Federated Learning
Authors: Jianzhe Zhao, Feida Zhu, Lingyan He, Zixin Tang, Mingce Gao, Shiyu Yang, Guibing Guo,
Abstract summary: Federated learning (FL) provides an effective approach to collaborative training models while preserving privacy.<n>We first propose a framework for resource-decoupled FL involving three parties.<n>Next, we propose the Quality-aware Dynamic Resources-decoupled FL algorithm (QD-RDFL)
Score: 7.347554648348435
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the development of the digital economy, data is increasingly recognized as an essential resource for both work and life. However, due to privacy concerns, data owners tend to maximize the value of data through the circulation of information rather than direct data transfer. Federated learning (FL) provides an effective approach to collaborative training models while preserving privacy. However, as model parameters and training data grow, there are not only real differences in data resources between different data owners, but also mismatches between data and computing resources. These challenges lead to inadequate collaboration among data owners, compute centers, and model owners, reducing the global utility of the three parties and the effectiveness of data assetization. In this work, we first propose a framework for resource-decoupled FL involving three parties. Then, we design a Tripartite Stackelberg Model and theoretically analyze the Stackelberg-Nash equilibrium (SNE) for participants to optimize global utility. Next, we propose the Quality-aware Dynamic Resources-decoupled FL algorithm (QD-RDFL), in which we derive and solve the optimal strategies of all parties to achieve SNE using backward induction. We also design a dynamic optimization mechanism to improve the optimal strategy profile by evaluating the contribution of data quality from data owners to the global model during real training. Finally, our extensive experiments demonstrate that our method effectively encourages the linkage of the three parties involved, maximizing the global utility and value of data assets.

Related papers

IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment [29.703775936837012]
Large Language Models (LLMs) have achieved impressive performance through Supervised Fine-tuning (SFT) on diverse instructional datasets.<n>When training on multiple capabilities simultaneously, the mixture training dataset, governed by volumes of data from different domains, is a critical factor that directly impacts the final model's performance.<n>We introduce an innovative data equilibrium framework designed to effectively optimize volumes of data from different domains within mixture SFT datasets.
arXiv Detail & Related papers (2025-05-19T06:42:44Z)
FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning [12.307490659840845]
Federated Learning (FL) combines locally optimized models from various clients into a unified global model.<n>FL encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model.<n>We introduce an innovative dual-strategy approach designed to effectively resolve these issues.
arXiv Detail & Related papers (2024-12-05T18:42:29Z)
MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services [94.61039892220037]
We propose an immersion-aware model trading framework that facilitates data provision for services while ensuring privacy through federated learning (FL) We design an incentive mechanism to incentivize metaverse users (MUs) to contribute high-value models under resource constraints. We develop a fully distributed dynamic reward algorithm based on deep reinforcement learning, without accessing any private information about MUs and other MSPs.
arXiv Detail & Related papers (2024-10-25T16:20:46Z)
Adversarial Federated Consensus Learning for Surface Defect Classification Under Data Heterogeneity in IIoT [8.48069043458347]
It's difficult to collect and centralize sufficient training data from various entities in Industrial Internet of Things (IIoT) Federated learning (FL) provides a solution by enabling collaborative global model training across clients. We propose a novel personalized FL approach, named Adversarial Federated Consensus Learning (AFedCL)
arXiv Detail & Related papers (2024-09-24T03:59:32Z)
Addressing Heterogeneity in Federated Learning: Challenges and Solutions for a Shared Production Environment [1.2499537119440245]
Federated learning (FL) has emerged as a promising approach to training machine learning models across decentralized data sources. This paper provides a comprehensive overview of data heterogeneity in FL within the context of manufacturing. We discuss the impact of these types of heterogeneity on model training and review current methodologies for mitigating their adverse effects.
arXiv Detail & Related papers (2024-08-18T17:49:44Z)
Curating Grounded Synthetic Data with Global Perspectives for Equitable AI [0.5120567378386615]
We introduce a novel approach to creating synthetic datasets, grounded in real-world diversity and enriched through strategic diversification. We synthesize data using a comprehensive collection of news articles spanning 12 languages and originating from 125 countries, to ensure a breadth of linguistic and cultural representations. Preliminary results demonstrate substantial improvements in performance on traditional NER benchmarks, by up to 7.3%.
arXiv Detail & Related papers (2024-06-10T17:59:11Z)
FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data. Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena. We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models [54.757324343062734]
We propose a data quality control pipeline for federated fine-tuning of foundation models. This pipeline computes scores reflecting the quality of training data and determines a global threshold for a unified standard. Our experiments show that the proposed quality control pipeline facilitates the effectiveness and reliability of the model training, leading to better performance.
arXiv Detail & Related papers (2024-03-07T14:28:04Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data. Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z)
Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning [89.21177894013225]
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset. We show that the statistical criterion used to quantify the diversity of the data, as well as the choice of the federated learning algorithm used, has a significant effect on the resulting equilibrium. We leverage this to design simple optimal federated learning mechanisms that encourage data collectors to contribute data representative of the global population.
arXiv Detail & Related papers (2023-06-08T23:38:25Z)
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP [43.7219097444333]
We introduce a testbed of six publicly available data sources to investigate how pre-training distributions induce robustness in CLIP. We find that the performance of the pre-training data varies substantially across distribution shifts. We find that combining multiple sources does not necessarily yield better models, but rather dilutes the robustness of the best individual data source.
arXiv Detail & Related papers (2022-08-10T18:24:23Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources. The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion. This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.