On the Fairness of Swarm Learning in Skin Lesion Classification
- URL: http://arxiv.org/abs/2109.12176v1
- Date: Fri, 24 Sep 2021 20:20:24 GMT
- Title: On the Fairness of Swarm Learning in Skin Lesion Classification
- Authors: Di Fan, Yifan Wu, Xiaoxiao Li
- Abstract summary: Distributed and collaborative learning is an approach to involve training models in massive, heterogeneous, and distributed data sources.
We present an empirical study by comparing the fairness among single (node) training, SL, centralized training.
Experiments demonstrate that SL does not exacerbate the fairness problem compared to centralized training.
- Score: 22.896631007125244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: in healthcare. However, the existing AI model may be biased in its decision
making. Bias induced by the data itself, such as collecting data only from certain
subgroups, can be mitigated by including more diversified data. Distributed and
collaborative learning is an approach to training models across massive,
heterogeneous, and distributed data sources, also known as nodes. In this work,
we examine the fairness issue in Swarm Learning (SL), a recent
edge-computing based decentralized machine learning approach designed
for heterogeneous illness detection in precision medicine. SL has achieved
high performance in clinical applications, but no attempt has been made to
evaluate whether SL can improve fairness. To address this, we present an
empirical study comparing fairness among single-node training, SL, and
centralized training. Specifically, we evaluate on a large publicly available skin
lesion dataset containing samples from various subgroups. The experiments
demonstrate that SL does not exacerbate the fairness problem compared to
centralized training and improves both performance and fairness compared to
single-node training. However, biases still exist in the SL model, and the
implementation of SL is more complex than that of the other two strategies.
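To make the comparison concrete, below is a minimal sketch of the kind of subgroup fairness measurement such a study relies on: per-subgroup accuracy and the gap between the best- and worst-served subgroups, computed for each training strategy. The predictions, subgroup labels, and accuracy levels are illustrative stand-ins, not the paper's data or code.

```python
# Hypothetical sketch: comparing subgroup fairness across training strategies.
import numpy as np

def subgroup_accuracy_gap(y_true, y_pred, groups):
    """Max minus min accuracy over subgroups (smaller gap = fairer)."""
    accs = []
    for g in np.unique(groups):
        mask = groups == g
        accs.append((y_pred[mask] == y_true[mask]).mean())
    return max(accs) - min(accs)

# Toy data: labels, predictions from three hypothetical models, and a
# sensitive attribute (e.g., a skin-tone subgroup) per sample.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
groups = rng.integers(0, 3, size=1000)          # 3 subgroups
preds = {
    "single":      np.where(rng.random(1000) < 0.75, y_true, 1 - y_true),
    "swarm":       np.where(rng.random(1000) < 0.85, y_true, 1 - y_true),
    "centralized": np.where(rng.random(1000) < 0.86, y_true, 1 - y_true),
}
for name, y_pred in preds.items():
    print(name, "accuracy gap:", subgroup_accuracy_gap(y_true, y_pred, groups))
```

A smaller gap indicates a fairer model; the same routine can be restricted to positive ground-truth samples to measure an equal-opportunity-style gap instead.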
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z) - SPLAL: Similarity-based pseudo-labeling with alignment loss for semi-supervised medical image classification [11.435826510575879]
Semi-supervised learning (SSL) methods can mitigate the scarcity of labeled data by leveraging both labeled and unlabeled data.
SSL methods for medical image classification need to address two key challenges: (1) estimating reliable pseudo-labels for the images in the unlabeled dataset and (2) reducing biases caused by class imbalance.
In this paper, we propose a novel SSL approach, SPLAL, that effectively addresses these challenges.
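As a rough illustration of similarity-based pseudo-labeling (our simplification, not SPLAL's exact recipe, which also involves an alignment loss): assign each unlabeled feature the label of its most similar class prototype, and keep only confident matches to address challenge (1) above.

```python
# Minimal sketch of similarity-based pseudo-labeling; names are ours.
import torch
import torch.nn.functional as F

def pseudo_label(unlabeled_feats, labeled_feats, labels, num_classes, thresh=0.8):
    # Class prototypes: mean labeled feature per class, L2-normalized.
    protos = torch.stack([labeled_feats[labels == c].mean(0)
                          for c in range(num_classes)])
    sims = F.normalize(unlabeled_feats, dim=1) @ F.normalize(protos, dim=1).T
    conf, pseudo = sims.max(dim=1)
    keep = conf > thresh          # discard low-similarity (unreliable) labels
    return pseudo[keep], keep

feats_l = torch.randn(100, 32); y_l = torch.randint(0, 5, (100,))
feats_u = torch.randn(400, 32)
# Loose threshold here only because the demo features are random noise.
labels_u, mask = pseudo_label(feats_u, feats_l, y_l, num_classes=5, thresh=0.2)
print(f"kept {mask.sum().item()} / 400 pseudo-labels")
```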
arXiv Detail & Related papers (2023-07-10T14:53:24Z) - Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer as a solution.
The method is benchmarked on the CIFAR100/CIFAR100LT datasets and shows outstanding performance on the downstream recognition task.
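A hedged sketch of what a distribution-adjustment regularizer for a class-conditional diffusion model could look like; `model`, the penalty's exact form, and the weight `tau` are our assumptions for illustration, not CBDM's published objective.

```python
# Sketch: standard denoising loss plus a term pulling the prediction
# conditioned on the true class toward one conditioned on a uniformly
# sampled class, nudging the model toward a class-balanced prior.
import torch

def cbdm_style_loss(model, x_t, t, noise, y, num_classes, tau=1.0):
    eps_true = model(x_t, t, y)                       # conditioned on true label
    y_rand = torch.randint(0, num_classes, y.shape)   # uniformly sampled label
    eps_rand = model(x_t, t, y_rand)
    denoise = ((eps_true - noise) ** 2).mean()        # standard DDPM objective
    balance = ((eps_true - eps_rand.detach()) ** 2).mean()
    return denoise + tau * balance

dummy = lambda x_t, t, y: x_t * 0.1   # stand-in for a conditional UNet
x = torch.randn(8, 3, 8, 8); n = torch.randn_like(x)
print(cbdm_style_loss(dummy, x, torch.zeros(8), n, torch.randint(0, 10, (8,)), 10))
```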
arXiv Detail & Related papers (2023-04-30T20:00:14Z) - An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
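A minimal sketch of the idea as summarized above: top up under-represented classes in the labeled set with the most confident pseudo-labeled samples until counts match the largest class. Function and variable names are ours.

```python
# Illustrative pseudo-label supplementation for class rebalancing.
import numpy as np

def supplement_with_pseudo(labels_l, pseudo_labels, pseudo_conf, conf_thresh=0.9):
    counts = np.bincount(labels_l)
    target = counts.max()
    chosen = []
    for c in range(len(counts)):
        need = target - counts[c]
        # Confident unlabeled samples predicted as class c, most confident first.
        cand = np.where((pseudo_labels == c) & (pseudo_conf > conf_thresh))[0]
        cand = cand[np.argsort(-pseudo_conf[cand])]
        chosen.extend(cand[:need].tolist())
    return np.array(chosen, dtype=int)   # indices of unlabeled samples to add

rng = np.random.default_rng(1)
y_l = rng.choice(3, size=60, p=[0.7, 0.2, 0.1])       # imbalanced labeled set
y_u_pred = rng.integers(0, 3, size=500)
conf = rng.random(500)
idx = supplement_with_pseudo(y_l, y_u_pred, conf)
print("added", len(idx), "pseudo-labeled samples")
```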
arXiv Detail & Related papers (2022-11-20T21:18:41Z) - Adaptive Personlization in Federated Learning for Highly Non-i.i.d. Data [37.667379000751325]
Federated learning (FL) is a distributed learning method that offers medical institutions the prospect of collaborating on a global model.
In this work, we investigate an adaptive hierarchical clustering method for FL to produce intermediate semi-global models.
Our experiments demonstrate significant gains in classification accuracy over standard FL methods under heterogeneous data distributions.
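An illustrative sketch (our construction, not the paper's code) of hierarchically clustering clients by the similarity of their flattened model updates and averaging within each cluster to form semi-global models.

```python
# Cluster federated clients by update similarity; each cluster gets its
# own averaged "semi-global" model instead of one global average.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def semi_global_models(client_updates, distance_thresh=0.5):
    X = np.stack(client_updates)                   # one flattened update per client
    Z = linkage(X, method="average", metric="cosine")
    cluster_ids = fcluster(Z, t=distance_thresh, criterion="distance")
    models = {cid: X[cluster_ids == cid].mean(axis=0)
              for cid in np.unique(cluster_ids)}
    return cluster_ids, models                     # each client uses its cluster's model

rng = np.random.default_rng(2)
updates = [rng.normal(loc=i // 4, size=16) for i in range(8)]  # 2 latent groups
ids, models = semi_global_models(updates)
print("cluster assignment:", ids)
```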
arXiv Detail & Related papers (2022-07-07T17:25:04Z) - A study on the distribution of social biases in self-supervised learning visual models [1.8692254863855964]
Self-Supervised Learning (SSL) may wrongly appear to be an efficient and bias-free solution, as it does not require labelled data.
We show that there is a correlation between the type of the SSL model and the number of biases that it incorporates.
We conclude that a careful SSL model selection process can reduce the number of social biases in the deployed model.
arXiv Detail & Related papers (2022-03-03T17:03:21Z) - Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in real-world federated systems is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
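A simplified sketch of the CCVR idea as described above: estimate per-class feature statistics (a diagonal Gaussian here, for brevity, rather than a full Gaussian mixture), sample virtual representations from them, and recalibrate only the classifier head.

```python
# Classifier recalibration on virtual features (our simplification).
import torch

def calibrate_classifier(feats, labels, classifier, num_classes,
                         n_virtual=200, steps=100, lr=0.1):
    opt = torch.optim.SGD(classifier.parameters(), lr=lr)
    # Per-class Gaussian statistics over penultimate-layer features.
    stats = [(feats[labels == c].mean(0), feats[labels == c].std(0) + 1e-6)
             for c in range(num_classes)]
    for _ in range(steps):
        xs, ys = [], []
        for c, (mu, sigma) in enumerate(stats):
            xs.append(mu + sigma * torch.randn(n_virtual, mu.numel()))
            ys.append(torch.full((n_virtual,), c, dtype=torch.long))
        x, y = torch.cat(xs), torch.cat(ys)
        loss = torch.nn.functional.cross_entropy(classifier(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return classifier

feats = torch.randn(600, 16); labels = torch.randint(0, 10, (600,))
head = torch.nn.Linear(16, 10)      # classifier head to be recalibrated
calibrate_classifier(feats, labels, head, num_classes=10)
```

Because only feature statistics (not raw data) are needed, this kind of calibration fits naturally in a federated setting.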
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
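For reference, the BALD acquisition score mentioned above is the mutual information between the predicted label and the model parameters, commonly estimated with Monte-Carlo dropout; a minimal sketch follows (shapes and names are ours).

```python
# BALD score from T stochastic forward passes (MC dropout).
import torch

def bald_scores(mc_probs):
    """mc_probs: [T, N, C] class probabilities from T stochastic passes."""
    mean_p = mc_probs.mean(dim=0)                                      # [N, C]
    entropy_mean = -(mean_p * mean_p.clamp_min(1e-12).log()).sum(-1)
    mean_entropy = -(mc_probs * mc_probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    return entropy_mean - mean_entropy   # high = model disagreement, worth labeling

probs = torch.softmax(torch.randn(20, 128, 5), dim=-1)   # 20 dropout passes
scores = bald_scores(probs)
print("query indices:", scores.topk(8).indices.tolist())
```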
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - Semi-supervised Medical Image Classification with Global Latent Mixing [8.330337646455957]
Computer-aided diagnosis via deep learning relies on large-scale annotated data sets.
Semi-supervised learning mitigates this challenge by leveraging unlabeled data.
We present a novel SSL approach that trains the neural network on linear mixing of labeled and unlabeled data.
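A minimal mixup-style sketch in the spirit of training on linear mixtures of labeled and unlabeled data (our simplification; the paper mixes in a global latent space rather than pixel space):

```python
# Mix a labeled example with an unlabeled one (via its pseudo-label).
import torch

def mix_labeled_unlabeled(x_l, y_l_onehot, x_u, y_u_pseudo_onehot, alpha=0.75):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    lam = max(lam, 1 - lam)                  # keep the labeled side dominant
    x_mix = lam * x_l + (1 - lam) * x_u
    y_mix = lam * y_l_onehot + (1 - lam) * y_u_pseudo_onehot
    return x_mix, y_mix                      # train with soft-label cross-entropy

x_l = torch.randn(4, 3, 32, 32); y_l = torch.eye(5)[torch.randint(0, 5, (4,))]
x_u = torch.randn(4, 3, 32, 32); y_u = torch.eye(5)[torch.randint(0, 5, (4,))]
x_mix, y_mix = mix_labeled_unlabeled(x_l, y_l, x_u, y_u)
print(x_mix.shape, y_mix.shape)
```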
arXiv Detail & Related papers (2020-05-22T14:49:13Z) - Imbalanced Data Learning by Minority Class Augmentation using Capsule Adversarial Networks [31.073558420480964]
We propose a method to restore balance in imbalanced image datasets by coalescing two concurrent methods.
In our model, generative and discriminative networks play a novel competitive game.
The coalescing of capsule-GAN is effective at recognizing highly overlapping classes with far fewer parameters than the convolutional GAN.
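Setting the capsule architecture aside, the general augmentation loop can be sketched as sampling synthetic minority-class images from a class-conditional generator until classes are balanced; the generator here is a hypothetical stand-in, not the paper's network.

```python
# Minority-class augmentation via a (hypothetical, pretrained) generator.
import torch

def augment_minority(generator, labels, latent_dim=64):
    counts = torch.bincount(labels)
    target = counts.max()
    fakes, fake_labels = [], []
    for c, n in enumerate(counts.tolist()):
        if n < target:
            z = torch.randn(target - n, latent_dim)
            y = torch.full((target - n,), c, dtype=torch.long)
            fakes.append(generator(z, y)); fake_labels.append(y)
    return torch.cat(fakes), torch.cat(fake_labels)

# Toy stand-in generator mapping latents to 3x32x32 images.
gen = lambda z, y: torch.tanh(z @ torch.randn(64, 3 * 32 * 32)).view(-1, 3, 32, 32)
labels = torch.cat([torch.zeros(90, dtype=torch.long), torch.ones(10, dtype=torch.long)])
x_aug, y_aug = augment_minority(gen, labels)
print(x_aug.shape, y_aug.shape)   # synthetic samples for the minority class
```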
arXiv Detail & Related papers (2020-04-05T12:36:06Z)