Risk of Training Diagnostic Algorithms on Data with Demographic Bias
- URL: http://arxiv.org/abs/2005.10050v2
- Date: Wed, 17 Jun 2020 11:33:59 GMT
- Title: Risk of Training Diagnostic Algorithms on Data with Demographic Bias
- Authors: Samaneh Abbasi-Sureshjani, Ralf Raumanns, Britt E. J. Michels, Gerard
  Schouten, Veronika Cheplygina
- Abstract summary: We conduct a survey of the MICCAI 2018 proceedings to investigate the common practice in medical image analysis applications.
Surprisingly, we found that papers focusing on diagnosis rarely describe the demographics of the datasets used.
We show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup.
- Score: 0.5599792629509227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the critical challenges in machine learning applications is to have
fair predictions. There are numerous recent examples in various domains that
convincingly show that algorithms trained with biased datasets can easily lead
to erroneous or discriminatory conclusions. This is even more crucial in
clinical applications where the predictive algorithms are designed mainly based
on a limited or given set of medical images and demographic variables such as
age, sex and race are not taken into account. In this work, we conduct a survey
of the MICCAI 2018 proceedings to investigate the common practice in medical
image analysis applications. Surprisingly, we found that papers focusing on
diagnosis rarely describe the demographics of the datasets used, and the
diagnosis is purely based on images. In order to highlight the importance of
considering the demographics in diagnosis tasks, we used a publicly available
dataset of skin lesions. We then demonstrate that a classifier with an overall
area under the curve (AUC) of 0.83 has variable performance between 0.76 and
0.91 on subgroups based on age and sex, even though the training set was
relatively balanced. Moreover, we show that it is possible to learn unbiased
features by explicitly using demographic variables in an adversarial training
setup, which leads to balanced scores per subgroups. Finally, we discuss the
implications of these results and provide recommendations for further research.
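
The subgroup comparison described in the abstract (one overall AUC versus AUCs computed separately per demographic subgroup) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `auc` and `auc_by_subgroup`, the subgroup labels, and the toy scores are all hypothetical, chosen only to show how an acceptable overall score can hide a weak subgroup.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Compute the AUC separately for every subgroup label in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: 1 = malignant, 0 = benign; scores are classifier outputs.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.3, 0.6, 0.7, 0.55, 0.4]
# Hypothetical subgroups combining sex and an age threshold.
groups = ["F<60"] * 4 + ["M>=60"] * 4

overall = auc(labels, scores)
per_group = auc_by_subgroup(labels, scores, groups)
print(f"overall AUC: {overall:.3f}")
for group, value in per_group.items():
    print(f"{group}: {value:.3f}")
```

On this toy data the overall AUC is 0.875, while the two subgroups score 1.0 and 0.5, which is the kind of gap that an aggregate metric alone would not reveal.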
 
      
Related papers
- Debias-CLR: A Contrastive Learning Based Debiasing Method for Algorithmic Fairness in Healthcare Applications [0.17624347338410748]
 We proposed an implicit in-processing debiasing method to combat disparate treatment.
We used clinical notes of heart failure patients and used diagnostic codes, procedure reports and physiological vitals of the patients.
We found that Debias-CLR was able to reduce the Single-Category Word Embedding Association Test (SC-WEAT) effect size score when debiasing for gender and ethnicity.
 arXiv  Detail & Related papers  (2024-11-15T19:32:01Z)
- Fairness Evolution in Continual Learning for Medical Imaging [47.52603262576663]
 We study the behavior of Continual Learning (CL) strategies in medical imaging regarding classification performance.
We evaluate the Replay, Learning without Forgetting (LwF), LwF Replay, and Pseudo-Label strategies.
LwF and Pseudo-Label exhibit optimal classification performance, but when including fairness metrics in the evaluation, it is clear that Pseudo-Label is less biased.
 arXiv  Detail & Related papers  (2024-04-10T09:48:52Z)
- Demographic Bias of Expert-Level Vision-Language Foundation Models in
  Medical Imaging [13.141767097232796]
 Self-supervised vision-language foundation models can detect a broad spectrum of pathologies without relying on explicit training annotations.
It is crucial to ensure that these AI models do not mirror or amplify human biases, thereby disadvantaging historically marginalized groups such as females or Black patients.
This study investigates the algorithmic fairness of state-of-the-art vision-language foundation models in chest X-ray diagnosis across five globally-sourced datasets.
 arXiv  Detail & Related papers  (2024-02-22T18:59:53Z)
- An AI-Guided Data Centric Strategy to Detect and Mitigate Biases in
  Healthcare Datasets [32.25265709333831]
 We generate a data-centric, model-agnostic, task-agnostic approach to evaluate dataset bias by investigating the relationship between how easily different groups are learned at small sample sizes (AEquity).
We then apply a systematic analysis of AEq values across subpopulations to identify and mitigate manifestations of racial bias in two known cases in healthcare.
AEq is a novel and broadly applicable metric that can be applied to advance equity by diagnosing and remediating bias in healthcare datasets.
 arXiv  Detail & Related papers  (2023-11-06T17:08:41Z)
- Multi-task Explainable Skin Lesion Classification [54.76511683427566]
 We propose a few-shot-based approach for skin lesions that generalizes well with few labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and classification network.
 arXiv  Detail & Related papers  (2023-10-11T05:49:47Z)
- Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience [21.420302408947194]
 We develop a weighted empirical risk minimization approach that optimally combines data from a source group to make predictions on a target group.
We apply this method to multi-source data of 15,363 individuals from 20 neuroimaging studies to build ML models for diagnosis of Alzheimer's disease and estimation of brain age.
 arXiv  Detail & Related papers  (2023-08-06T18:05:39Z)
- Towards unraveling calibration biases in medical image analysis [2.4054878434935074]
 We show how several typically employed calibration metrics are systematically biased with respect to sample sizes.
This is of particular relevance to fairness studies, where data imbalance results in drastic sample size differences between demographic sub-groups.
 arXiv  Detail & Related papers  (2023-05-09T00:11:35Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and
  Metrics [80.07271410743806]
 We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
 arXiv  Detail & Related papers  (2023-03-25T09:34:05Z)
- IA-GCN: Interpretable Attention based Graph Convolutional Network for
  Disease prediction [47.999621481852266]
 We propose an interpretable graph learning-based model which interprets the clinical relevance of the input features towards the task.
In a clinical scenario, such a model can assist the clinical experts in better decision-making for diagnosis and treatment planning.
Our proposed model outperforms the compared methods, with an average accuracy increase of 3.2% for Tadpole, 1.6% for UKBB Gender, and 2% for the UKBB Age prediction task.
 arXiv  Detail & Related papers  (2021-03-29T13:04:02Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
 There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
 arXiv  Detail & Related papers  (2021-03-16T15:05:49Z)
- Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
 We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale, publicly available skin lesion dataset.
 arXiv  Detail & Related papers  (2021-03-07T03:10:32Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
  Prediction [55.94378672172967]
 We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
 arXiv  Detail & Related papers  (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     