Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game
- URL: http://arxiv.org/abs/2412.16079v1
- Date: Fri, 20 Dec 2024 17:23:12 GMT
- Title: Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game
- Authors: Sebastian Niehaus, Ingo Roeder, Nico Scherf
- Abstract summary: We consider distributed learning as a Stackelberg evolutionary game.
We use three medical datasets to highlight the impact of dynamic weighting on underrepresented nodes in distributed learning.
- Abstract: Decentralised learning enables the training of deep learning algorithms without centralising data sets, resulting in benefits such as improved data privacy, operational efficiency and the fostering of data ownership policies. However, significant data imbalances pose a challenge in this framework. Participants with smaller datasets in distributed learning environments often achieve poorer results than participants with larger datasets. Data imbalances are particularly pronounced in medical fields and are caused by different patient populations, technological inequalities and divergent data collection practices. In this paper, we consider distributed learning as a Stackelberg evolutionary game. We present two algorithms for setting the weights of each node's contribution to the global model in each training round: the Deterministic Stackelberg Weighting Model (DSWM) and the Adaptive Stackelberg Weighting Model (ASWM). We use three medical datasets to highlight the impact of dynamic weighting on underrepresented nodes in distributed learning. Our results show that the ASWM significantly favours underrepresented nodes by improving their performance by 2.713% in AUC. Meanwhile, nodes with larger datasets experience only a modest average performance decrease of 0.441%.
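The abstract does not specify the DSWM/ASWM update rules, so the following is only a minimal sketch of the general mechanism they operate within: FedAvg-style aggregation with round-varying node weights, plus a hypothetical adaptive rule that shifts weight toward nodes whose validation score (e.g. AUC) lags the round average. The function names and the exponential update are illustrative assumptions, not the paper's method.

```python
import numpy as np

def aggregate(node_params, weights):
    """FedAvg-style aggregation: a weighted average of per-node
    parameter vectors, normalised to a convex combination."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * p for w, p in zip(weights, node_params))

def adaptive_weights(prev_weights, node_scores, lr=0.1):
    """Hypothetical adaptive update: multiplicatively boost the weight
    of nodes whose score lags the round average (positive deficit),
    mimicking a leader rebalancing follower contributions each round."""
    scores = np.asarray(node_scores, dtype=float)
    deficit = scores.mean() - scores           # positive for lagging nodes
    w = np.asarray(prev_weights, dtype=float) * np.exp(lr * deficit)
    return w / w.sum()                         # re-normalise
```

Normalising after each update keeps the aggregate a convex combination of node models, so boosting an underrepresented node necessarily trades off against the larger nodes, matching the small accuracy decrease the abstract reports for them.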
Related papers
- Data-Efficient Pretraining with Group-Level Data Influence Modeling [49.18903821780051]
Group-Level Data Influence Modeling (Group-MATES) is a novel data-efficient pretraining method.
Group-MATES collects oracle group-level influences by locally probing the pretraining model with data sets.
It then fine-tunes a relational data influence model to approximate oracles as relationship-weighted aggregations of individual influences.
arXiv Detail & Related papers (2025-02-20T16:34:46Z)
- Data Assetization via Resources-decoupled Federated Learning [7.347554648348435]
Federated learning (FL) provides an effective approach to collaborative training models while preserving privacy.
We first propose a framework for resource-decoupled FL involving three parties.
Next, we propose the Quality-aware Dynamic Resources-decoupled FL algorithm (QD-RDFL).
arXiv Detail & Related papers (2025-01-24T15:49:04Z)
- Enhancing Performance for Highly Imbalanced Medical Data via Data Regularization in a Federated Learning Setting [6.22153888560487]
The goal of the proposed method is to enhance model performance for cardiovascular disease prediction.
The method is evaluated across four datasets for cardiovascular disease prediction, which are scattered across different clients.
arXiv Detail & Related papers (2024-05-30T19:15:38Z)
- Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z)
- Investigating Group Distributionally Robust Optimization for Deep Imbalanced Learning: A Case Study of Binary Tabular Data Classification [0.44040106718326594]
Group distributionally robust optimization (gDRO) is investigated in this study for imbalanced learning.
Experimental findings in comparison with empirical risk minimization (ERM) and classical imbalance methods reveal impressive performance of gDRO.
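For context, the core gDRO objective (a standard formulation, not necessarily this paper's exact setup) replaces ERM's overall average loss with the worst-group average loss, so minimising it targets the hardest group; a minimal illustration:

```python
import numpy as np

def gdro_loss(per_example_losses, group_ids):
    """Group DRO objective: the worst-group mean loss.
    ERM would instead minimise losses.mean() over all examples."""
    losses = np.asarray(per_example_losses, dtype=float)
    groups = np.asarray(group_ids)
    group_means = [losses[groups == g].mean() for g in np.unique(groups)]
    return max(group_means)
```

On imbalanced data the majority group dominates ERM's average, while gDRO's max keeps the minority group's loss in focus regardless of its size.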
arXiv Detail & Related papers (2023-03-04T21:20:58Z)
- Decentralized Learning with Multi-Headed Distillation [12.90857834791378]
Decentralized learning with private data is a central problem in machine learning.
We propose a novel distillation-based decentralized learning technique that allows multiple agents with private non-iid data to learn from each other.
arXiv Detail & Related papers (2022-11-28T21:01:43Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- Generic Semi-Supervised Adversarial Subject Translation for Sensor-Based Human Activity Recognition [6.2997667081978825]
This paper presents a novel generic and robust approach for semi-supervised domain adaptation in Human Activity Recognition.
It capitalizes on the advantages of the adversarial framework to tackle the shortcomings, by leveraging knowledge from annotated samples exclusively from the source subject and unlabeled ones of the target subject.
The results demonstrate the effectiveness of our proposed algorithms over state-of-the-art methods, yielding improvements of up to 13%, 4%, and 13% in high-level activity recognition metrics on the Opportunity, LISSI, and PAMAP2 datasets, respectively.
arXiv Detail & Related papers (2020-11-11T12:16:23Z)
- Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers.
Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.