Serverless Federated AUPRC Optimization for Multi-Party Collaborative
Imbalanced Data Mining
- URL: http://arxiv.org/abs/2308.03035v1
- Date: Sun, 6 Aug 2023 06:51:32 GMT
- Title: Serverless Federated AUPRC Optimization for Multi-Party Collaborative
Imbalanced Data Mining
- Authors: Xidong Wu, Zhengmian Hu, Jian Pei, Heng Huang
- Abstract summary: The Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric for imbalanced data.
Serverless multi-party collaborative training can cut down the communication cost by avoiding the server-node bottleneck.
We propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC.
- Score: 119.89373423433804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-party collaborative training, such as distributed learning and
federated learning, is used to address big data challenges. However,
traditional multi-party collaborative training algorithms were mainly designed
for balanced data mining tasks and are intended to optimize accuracy
(e.g., cross-entropy). The data distribution in many real-world applications is
skewed, and classifiers trained to improve accuracy perform poorly on
imbalanced data tasks because the models can be significantly biased toward the
primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was
introduced as an effective metric. Although single-machine AUPRC maximization
methods have been designed, multi-party collaborative algorithms have never
been studied. The change from the single-machine to the multi-party setting
poses critical challenges.
To address this challenge, we study the serverless multi-party collaborative
AUPRC maximization problem, since serverless multi-party collaborative training
can cut down the communication cost by avoiding the server-node bottleneck. We
reformulate the problem as a conditional stochastic optimization problem in a
serverless multi-party collaborative learning setting and propose a new
ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize
the AUPRC. We then apply a variance reduction technique and propose the
ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction
(SLATE-M) algorithm to improve the convergence rate, which matches the best
theoretical convergence result achieved by single-machine online methods. To
the best of our knowledge, this is the first work to solve the multi-party
collaborative AUPRC maximization problem.
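As a concrete illustration of the recipe the abstract describes (a surrogate AUPRC objective whose mini-batch gradients are biased, optimized with serverless parameter averaging), here is a minimal sketch. It assumes a squared-hinge surrogate for average precision and a gossip step over a doubly stochastic mixing matrix; all function names and hyper-parameters are illustrative, and this is not the authors' SLATE or SLATE-M implementation.

```python
# Minimal sketch only: a squared-hinge surrogate for average precision (AP) plus
# one serverless gossip-averaging round. Illustrative of the abstract's setup,
# not the SLATE / SLATE-M algorithms.
import torch

def ap_surrogate(scores, labels, margin=1.0):
    """For each positive anchor i, approximate the precision at its rank with
    pairwise squared-hinge terms; the ratio makes mini-batch gradients biased,
    which is the conditional-stochastic-optimization structure mentioned above."""
    pos = scores[labels == 1]
    if pos.numel() == 0:
        return scores.sum() * 0.0
    ell_all = torch.clamp(margin + scores.unsqueeze(0) - pos.unsqueeze(1), min=0.0) ** 2
    ell_pos = torch.clamp(margin + pos.unsqueeze(0) - pos.unsqueeze(1), min=0.0) ** 2
    precision_like = ell_pos.sum(dim=1) / (ell_all.sum(dim=1) + 1e-8)
    return 1.0 - precision_like.mean()   # maximizing AP <=> minimizing this loss

@torch.no_grad()
def gossip_average(params_per_node, mixing_matrix):
    """One serverless communication round: every node averages its parameter
    vector with its neighbours via a doubly stochastic mixing matrix (no server)."""
    return list(mixing_matrix @ torch.stack(params_per_node))
```

Each node would take a few local gradient steps on `ap_surrogate` over its own imbalanced mini-batches and then call `gossip_average` with its neighbours, which is the serverless counterpart of sending updates to a central server.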
Related papers
- Faster Convergence on Heterogeneous Federated Edge Learning: An Adaptive Clustered Data Sharing Approach [27.86468387141422]
Federated Edge Learning (FEEL) is emerging as a pioneering distributed machine learning paradigm for 6G hyper-connectivity.
Current FEEL algorithms struggle with non-independent and non-identically distributed (non-IID) data, leading to elevated communication costs and compromised model accuracy.
We introduce a clustered data sharing framework, mitigating data heterogeneity by selectively sharing partial data from cluster heads to trusted associates.
Experiments show that the proposed framework facilitates FEEL on non-IID datasets with faster convergence rate and higher model accuracy in a limited communication environment.
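To make the data-sharing idea above concrete, the sketch below clusters clients by their label histograms and lets each cluster head share a small fraction of its samples with the other cluster members. The clustering criterion, the head-selection rule, and the share_fraction parameter are assumptions for illustration, not the paper's protocol.

```python
# Illustrative sketch of clustered data sharing (not the paper's implementation):
# cluster clients by label distribution, then share a small slice of the cluster
# head's data with the other members to reduce non-IID skew.
import numpy as np
from sklearn.cluster import KMeans

def clustered_data_sharing(client_data, num_classes, num_clusters=3,
                           share_fraction=0.05, seed=0):
    """client_data: list of (features, labels) per client; num_clusters must not
    exceed the number of clients. Returns the augmented per-client datasets."""
    rng = np.random.default_rng(seed)
    hists = np.stack([np.bincount(y, minlength=num_classes) / max(len(y), 1)
                      for _, y in client_data])
    assign = KMeans(n_clusters=num_clusters, n_init=10,
                    random_state=seed).fit_predict(hists)
    augmented = [(x.copy(), y.copy()) for x, y in client_data]
    for c in range(num_clusters):
        members = np.where(assign == c)[0]
        # assumption: the largest client in the cluster acts as the cluster head
        head = members[np.argmax([len(client_data[i][1]) for i in members])]
        x_h, y_h = client_data[head]
        idx = rng.choice(len(y_h), size=max(1, int(share_fraction * len(y_h))),
                         replace=False)
        for m in members:
            if m != head:
                x_m, y_m = augmented[m]
                augmented[m] = (np.vstack([x_m, x_h[idx]]),
                                np.concatenate([y_m, y_h[idx]]))
    return augmented
```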
arXiv Detail & Related papers (2024-06-14T07:22:39Z) - Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single-stage approach named Alignment with Integrated Human Feedback (AIHF) to jointly train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z) - Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification [50.406127962933915]
ACOWA allows an extra round of communication to achieve noticeably better approximation quality with minor runtime increases.
Results show that ACOWA obtains solutions that are more faithful to the empirical risk minimizer and attain substantially higher accuracy than other distributed algorithms.
arXiv Detail & Related papers (2024-06-03T19:43:06Z) - ESFL: Efficient Split Federated Learning over Resource-Constrained Heterogeneous Wireless Devices [22.664980594996155]
Federated learning (FL) allows multiple parties (distributed devices) to train a machine learning model without sharing raw data.
We propose an efficient split federated learning algorithm (ESFL) to take full advantage of the powerful computing capabilities at a central server.
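A generic split-learning forward pass, sketched below, shows the kind of device/server model partition the summary refers to: the device runs the first layers and ships its intermediate activations to the server, which completes the computation. The cut point and layer sizes are arbitrary here and are not ESFL's actual design.

```python
# Generic split-learning sketch: the client device runs the first layers and the
# server runs the rest. Layer sizes and the cut point are illustrative only.
import torch
import torch.nn as nn

device_part = nn.Sequential(nn.Linear(32, 64), nn.ReLU())           # on the client device
server_part = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                            nn.Linear(64, 2))                       # on the central server

def split_forward(x):
    smashed = device_part(x)      # intermediate ("smashed") activations sent to the server
    return server_part(smashed)   # server finishes the forward pass

logits = split_forward(torch.randn(8, 32))   # batch of 8 toy inputs
```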
arXiv Detail & Related papers (2024-02-24T20:50:29Z) - DePRL: Achieving Linear Convergence Speedup in Personalized
Decentralized Learning with Shared Representations [31.47686582044592]
We propose a novel personalized decentralized learning algorithm named DePRL via shared representations.
For the first time, DePRL achieves a provable linear speedup for convergence with general non-linear representations.
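The sketch below illustrates the shared-representation idea in its simplest form: clients average only their encoder (representation) parameters while keeping personalized prediction heads local. The architecture and the plain averaging step are assumptions; DePRL's actual decentralized update rule is more involved.

```python
# Minimal "shared representation + personalized head" sketch (not DePRL itself).
import torch
import torch.nn as nn

class ClientModel(nn.Module):
    def __init__(self, in_dim, rep_dim, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, rep_dim), nn.ReLU())  # shared
        self.head = nn.Linear(rep_dim, num_classes)                          # personalized

    def forward(self, x):
        return self.head(self.encoder(x))

@torch.no_grad()
def average_shared_representation(clients):
    """Average only the encoder parameters across clients; heads stay local."""
    for params in zip(*(c.encoder.parameters() for c in clients)):
        mean = torch.stack([p.data for p in params]).mean(dim=0)
        for p in params:
            p.data.copy_(mean)
```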
arXiv Detail & Related papers (2023-12-17T20:53:37Z) - Communication-Efficient Federated Non-Linear Bandit Optimization [26.23638987873429]
We propose a new algorithm, named Fed-GO-UCB, for federated bandit optimization with generic non-linear objective function.
Under some mild conditions, we rigorously prove that Fed-GO-UCB is able to achieve sub-linear rate for both cumulative regret and communication cost.
arXiv Detail & Related papers (2023-11-03T03:50:31Z) - FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific, auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
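As a rough illustration of a client-local adaptive step, the sketch below implements a per-client AMSGrad update whose effective learning rate depends on that client's own gradient history; the hyper-parameters and the FedAvg-style periodic averaging noted in the comments are assumptions, not FedLALR's exact scheduling rule.

```python
# Per-client AMSGrad step: the effective step size adapts to the client's own
# gradients. Hyper-parameters are illustrative, not FedLALR's.
import numpy as np

class LocalAMSGrad:
    def __init__(self, dim, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(dim)       # first-moment estimate
        self.v = np.zeros(dim)       # second-moment estimate
        self.v_hat = np.zeros(dim)   # running max of v (the AMSGrad correction)

    def step(self, params, grad):
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        self.v_hat = np.maximum(self.v_hat, self.v)
        # client-specific effective learning rate: lr / sqrt(v_hat)
        return params - self.lr * self.m / (np.sqrt(self.v_hat) + self.eps)

# Each client keeps its own optimizer state and runs several local steps;
# the resulting models are then averaged across clients as in FedAvg.
```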
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
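A bare-bones version of a shared backbone feeding several prediction heads, with the ensemble formed by averaging the head outputs, is sketched below; the layer sizes and the averaging rule are placeholders rather than MEMTL's architecture or training procedure.

```python
# Shared backbone + multiple prediction heads, ensembled by averaging (a generic
# sketch, not MEMTL's architecture).
import torch
import torch.nn as nn

class MultiHeadEnsemble(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_heads=3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, out_dim) for _ in range(num_heads)])

    def forward(self, x):
        z = self.backbone(x)                                    # shared features
        preds = torch.stack([head(z) for head in self.heads])   # (num_heads, batch, out_dim)
        return preds.mean(dim=0)                                # ensemble the heads
```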
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Resource-constrained Federated Edge Learning with Heterogeneous Data:
Formulation and Analysis [8.863089484787835]
We propose a distributed approximate Newton-type training scheme, namely FedOVA, to address the statistical challenges brought by heterogeneous data.
FedOVA decomposes a multi-class classification problem into more straightforward binary classification problems and then combines their respective outputs using ensemble learning.
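The one-vs-all decomposition plus ensemble combination described above can be sketched as follows; this uses scikit-learn for brevity and assumes every class is present in the local data, so it illustrates the decomposition only, not FedOVA's federated training protocol.

```python
# One-vs-all decomposition and ensemble combination (generic sketch, not FedOVA).
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_one_vs_all(X, y, num_classes):
    """Train one binary classifier per class (class k vs. the rest)."""
    return [LogisticRegression(max_iter=1000).fit(X, (y == k).astype(int))
            for k in range(num_classes)]

def predict_ensemble(binary_models, X):
    """Combine the binary outputs: pick the class whose 'one' probability is largest."""
    scores = np.stack([m.predict_proba(X)[:, 1] for m in binary_models], axis=1)
    return scores.argmax(axis=1)
```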
arXiv Detail & Related papers (2021-10-14T17:35:24Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.