MEDFAIR: Benchmarking Fairness for Medical Imaging
- URL: http://arxiv.org/abs/2210.01725v1
- Date: Tue, 4 Oct 2022 16:30:47 GMT
- Authors: Yongshuo Zong, Yongxin Yang, Timothy Hospedales
- Abstract summary: MEDFAIR is a framework to benchmark the fairness of machine learning models for medical imaging.
We find that the under-studied issue of model selection criterion can have a significant impact on fairness outcomes.
We make recommendations for different medical application scenarios that require different ethical principles.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A growing body of work has shown that machine learning-based medical diagnosis
systems can be biased against certain subgroups of people. This has motivated a
growing number of bias mitigation algorithms that aim to address fairness
issues in machine learning. However, it is difficult to compare their
effectiveness in medical imaging for two reasons. First, there is little
consensus on the criteria to assess fairness. Second, existing bias mitigation
algorithms are developed under different settings, e.g., datasets, model
selection strategies, backbones, and fairness metrics, making a direct
comparison and evaluation based on existing results impossible. In this work,
we introduce MEDFAIR, a framework to benchmark the fairness of machine learning
models for medical imaging. MEDFAIR covers eleven algorithms from various
categories, nine datasets from different imaging modalities, and three model
selection criteria. Through extensive experiments, we find that the
under-studied issue of model selection criterion can have a significant impact
on fairness outcomes; while in contrast, state-of-the-art bias mitigation
algorithms do not significantly improve fairness outcomes over empirical risk
minimization (ERM) in both in-distribution and out-of-distribution settings. We
evaluate fairness from various perspectives and make recommendations for
different medical application scenarios that require different ethical
principles. Our framework provides a reproducible and easy-to-use entry point
for the development and evaluation of future bias mitigation algorithms in deep
learning. Code is available at https://github.com/ys-zong/MEDFAIR.
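The subgroup evaluation at the heart of a benchmark like MEDFAIR revolves around gaps in a performance metric across sensitive groups. As an illustrative sketch (not MEDFAIR's actual API; the function names here are hypothetical), a worst-case true-positive-rate gap in the spirit of equal opportunity can be computed as:

```python
import numpy as np

def subgroup_tpr(y_true, y_pred, mask):
    """True-positive rate restricted to one subgroup."""
    pos = (y_true == 1) & mask
    if pos.sum() == 0:
        return float("nan")
    return float((y_pred[pos] == 1).mean())

def tpr_gap(y_true, y_pred, groups):
    """Worst-case TPR difference across sensitive subgroups
    (an equal-opportunity-style fairness gap: 0 means parity)."""
    tprs = [subgroup_tpr(y_true, y_pred, groups == g)
            for g in np.unique(groups)]
    return max(tprs) - min(tprs)

# Toy example: group 0 gets TPR 0.75, group 1 gets TPR 0.0.
y_true = np.array([1, 1, 1, 1, 0, 0, 1, 1])
y_pred = np.array([1, 1, 0, 1, 0, 1, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(tpr_gap(y_true, y_pred, groups))
```

The same pattern applies to any per-group metric (AUC, FPR, Brier score); the paper's point is that which gap you report, and which checkpoint you select, can matter more than the mitigation algorithm itself.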
Related papers
- On Biases in a UK Biobank-based Retinal Image Classification Model
We explore whether disparities are present in the UK Biobank fundus retinal images by training and evaluating a disease classification model on these images.
We find substantial differences despite strong overall performance of the model.
We evaluate established bias mitigation methods and find that they are largely unable to enhance fairness, highlighting the need for better bias mitigation methods tailored to the specific type of bias.
arXiv Detail & Related papers (2024-07-30T10:50:07Z)
- FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models
We introduce FairMedFM, a fairness benchmark for foundation models (FMs) research in medical imaging.
FairMedFM integrates 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes.
It explores 20 widely used FMs under various usage paradigms, including zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting, across downstream classification and segmentation tasks.
arXiv Detail & Related papers (2024-07-01T05:47:58Z)
- FairGridSearch: A Framework to Compare Fairness-Enhancing Models
This paper focuses on binary classification and proposes FairGridSearch, a novel framework for comparing fairness-enhancing models.
The study applies FairGridSearch to three popular datasets (Adult, COMPAS, and German Credit) and analyzes the impacts of metric selection, base estimator choice, and classification threshold on model fairness.
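The threshold analysis that such a framework performs can be illustrated with a minimal sweep (a sketch under assumed names, not FairGridSearch's actual API; `dp_gap` and `threshold_sweep` are hypothetical) recording accuracy and a demographic-parity gap at each candidate decision threshold:

```python
import numpy as np

def dp_gap(y_pred, groups):
    """Demographic-parity gap: spread of positive-prediction
    rates across sensitive groups."""
    rates = [float(y_pred[groups == g].mean()) for g in np.unique(groups)]
    return max(rates) - min(rates)

def threshold_sweep(scores, y_true, groups, thresholds):
    """Accuracy and fairness gap at each candidate threshold."""
    results = []
    for t in thresholds:
        y_pred = (scores >= t).astype(int)
        acc = float((y_pred == y_true).mean())
        results.append((t, acc, dp_gap(y_pred, groups)))
    return results

scores = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1])
y_true = np.array([1, 1, 0, 1, 0, 1, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
for t, acc, gap in threshold_sweep(scores, y_true, groups, [0.25, 0.5, 0.75]):
    print(t, acc, gap)
```

Sweeping the threshold this way makes the accuracy-fairness trade-off explicit instead of hiding it in the default 0.5 cutoff.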
arXiv Detail & Related papers (2024-01-04T10:29:02Z)
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
- Ambiguous Medical Image Segmentation using Diffusion Models
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification
We propose FairAdaBN, which makes batch normalization adaptive to the sensitive attribute.
We propose a new metric, named Fairness-Accuracy Trade-off Efficiency (FATE), to compute normalized fairness improvement over accuracy drop.
Experiments on two dermatological datasets show that our proposed method outperforms other methods on fairness criteria and FATE.
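The idea behind FATE, as summarized above, is to weigh fairness improvement against accuracy loss, each normalized by the baseline. A rough sketch of that trade-off follows (the exact FATE formula is defined in the FairAdaBN paper; this only mirrors the description above, and `fate_like` is a hypothetical name):

```python
def fate_like(acc_base, gap_base, acc_model, gap_model):
    """Normalized fairness improvement minus normalized accuracy drop.
    A smaller fairness gap is better, so improvement is (base - model).
    Positive values mean fairness gains outweigh the accuracy cost."""
    fairness_gain = (gap_base - gap_model) / gap_base
    accuracy_drop = (acc_base - acc_model) / acc_base
    return fairness_gain - accuracy_drop

# A mitigation model that halves the fairness gap at a small accuracy cost:
print(fate_like(acc_base=0.90, gap_base=0.10, acc_model=0.88, gap_model=0.05))
```

A single scalar like this lets methods with different accuracy-fairness operating points be ranked on one axis, which is what the FATE comparison in the paper provides.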
arXiv Detail & Related papers (2023-03-15T02:22:07Z)
- Mitigating Health Disparities in EHR via Deconfounder
We propose a novel framework, Parity Medical Deconfounder (PriMeD), to deal with the disparity issue in healthcare datasets.
PriMeD adopts a Conditional Variational Autoencoder (CVAE) to learn latent factors (substitute confounders) for observational data.
arXiv Detail & Related papers (2022-10-28T05:16:50Z)
- Estimating and Improving Fairness with Adversarial Learning
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale, publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences of its use.