Are generative models fair? A study of racial bias in dermatological image generation
- URL: http://arxiv.org/abs/2501.11752v2
- Date: Wed, 19 Feb 2025 15:53:04 GMT
- Title: Are generative models fair? A study of racial bias in dermatological image generation
- Authors: Miguel López-Pérez, Søren Hauberg, Aasa Feragen
- Abstract summary: We evaluate the fairness of generative models in clinical dermatology with respect to racial bias.
We utilize the Fitzpatrick17k dataset to examine how racial bias influences the representation and performance of these models.
- Score: 15.812312064457865
- Abstract: Racial bias in medicine, such as in dermatology, presents significant ethical and clinical challenges. This likely stems from the significant underrepresentation of darker skin tones in training datasets for machine learning models. While efforts to address bias in dermatology have focused on improving dataset diversity and mitigating disparities in discriminative models, the impact of racial bias on generative models remains underexplored. Generative models, such as Variational Autoencoders (VAEs), are increasingly used in healthcare applications, yet their fairness across diverse skin tones is not well understood. In this study, we evaluate the fairness of generative models in clinical dermatology with respect to racial bias. For this purpose, we first train a VAE with a perceptual loss to generate and reconstruct high-quality skin images across different skin tones. We use the Fitzpatrick17k dataset to examine how racial bias influences the representation and performance of these models. Our findings indicate that VAE performance is, as expected, influenced by representation, i.e., increased skin tone representation comes with increased performance on the given skin tone. However, we also observe that, even independently of representation, the VAE performs better for lighter skin tones. Additionally, the uncertainty estimates produced by the VAE are ineffective in assessing the model's fairness. These results highlight the need for more representative dermatological datasets, a better understanding of the sources of bias in such models, and improved uncertainty quantification mechanisms to detect and address racial bias in generative models for trustworthy healthcare technologies.
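The abstract's core analysis, comparing generative-model reconstruction quality across Fitzpatrick skin types, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical and the reconstruction errors are synthetic toy data constructed so that darker skin types (IV-VI) receive worse reconstructions, mimicking the bias pattern the paper reports.

```python
import numpy as np

def stratified_reconstruction_error(errors, fitzpatrick_types):
    """Mean reconstruction error per Fitzpatrick skin type (1..6).

    errors: per-image reconstruction losses from a generative model.
    fitzpatrick_types: integer skin-type label (1..6) for each image.
    Returns a dict mapping skin type -> mean error for that type.
    """
    errors = np.asarray(errors, dtype=float)
    fitzpatrick_types = np.asarray(fitzpatrick_types)
    return {
        int(t): float(errors[fitzpatrick_types == t].mean())
        for t in np.unique(fitzpatrick_types)
    }

# Toy data: errors grow with skin type, i.e. a model that reconstructs
# lighter tones better than darker ones (the pattern reported above).
rng = np.random.default_rng(0)
types = rng.integers(1, 7, size=600)
errors = 0.10 + 0.02 * (types - 1) + rng.normal(0, 0.005, size=600)
per_type = stratified_reconstruction_error(errors, types)
```

Comparing `per_type` values across types exposes a performance gap even when all types are present in the data, which is the sense in which the abstract distinguishes representation effects from residual bias.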
Related papers
- FairSkin: Fair Diffusion for Skin Disease Image Generation [54.29840149709033]
Diffusion Model (DM) has become a leading method in generating synthetic medical images, but it suffers from a critical twofold bias.
We propose FairSkin, a novel DM framework that mitigates these biases through a three-level resampling mechanism.
Our approach significantly improves the diversity and quality of generated images, contributing to more equitable skin disease detection in clinical settings.
arXiv Detail & Related papers (2024-10-29T21:37:03Z)
- Skin Cancer Machine Learning Model Tone Bias [1.1545092788508224]
Many open-source skin cancer image datasets are the result of clinical trials conducted in countries with lighter skin tones.
Due to this tone imbalance, machine learning models can perform well at detecting skin cancer for lighter skin tones.
Any tone bias in these models could introduce fairness concerns and reduce public trust in the artificial intelligence health field.
arXiv Detail & Related papers (2024-10-08T21:33:02Z)
- Evaluating Machine Learning-based Skin Cancer Diagnosis [0.0]
The research assesses two convolutional neural network architectures: a MobileNet-based model and a custom CNN model.
Both models are evaluated for their ability to classify skin lesions into seven categories and to distinguish between dangerous and benign lesions.
The study concludes that while the models show promise in explainability, further development is needed to ensure fairness across different skin tones.
arXiv Detail & Related papers (2024-09-04T02:44:48Z)
- DDI-CoCo: A Dataset For Understanding The Effect Of Color Contrast In Machine-Assisted Skin Disease Detection [51.92255321684027]
We study the interaction between skin tone and color difference effects and suggest that color difference can be an additional reason behind model performance bias between skin tones.
Our work provides a complementary angle to dermatology AI for improving skin disease detection.
arXiv Detail & Related papers (2024-01-24T07:45:24Z)
- Improving Fairness using Vision-Language Driven Image Augmentation [60.428157003498995]
Fairness is crucial when training a deep-learning discriminative model, especially in the facial domain.
Models tend to correlate specific characteristics (such as age and skin color) with unrelated attributes (downstream tasks).
This paper proposes a method to mitigate these correlations to improve fairness.
arXiv Detail & Related papers (2023-11-02T19:51:10Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Assessing the Generalizability of Deep Neural Networks-Based Models for Black Skin Lesions [5.799408310835583]
Melanoma is more common in black people, often affecting acral regions: palms, soles, and nails.
Deep neural networks have shown tremendous potential for improving clinical care and skin cancer diagnosis.
In this work, we evaluate supervised and self-supervised models in skin lesion images extracted from acral regions commonly observed in black individuals.
arXiv Detail & Related papers (2023-09-30T22:36:51Z)
- Analyzing Bias in Diffusion-based Face Generation Models [75.80072686374564]
Diffusion models are increasingly popular in synthetic data generation and image editing applications.
We investigate the presence of bias in diffusion-based face generation models with respect to attributes such as gender, race, and age.
We examine how dataset size affects the attribute composition and perceptual quality of both diffusion and Generative Adversarial Network (GAN) based face generation models.
arXiv Detail & Related papers (2023-05-10T18:22:31Z)
- FairDisCo: Fairer AI in Dermatology via Disentanglement Contrastive Learning [11.883809920936619]
We propose FairDisCo, a disentanglement deep learning framework with contrastive learning.
We compare FairDisCo to three fairness methods, namely, resampling, reweighting, and attribute-aware.
We adapt two fairness-based metrics, DPM and EOM, for our multiple-classes and multiple-sensitive-attributes task, highlighting the skin-type bias in skin lesion classification.
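DPM and EOM build on the standard fairness notions of demographic parity and equalized odds. As a minimal sketch of the demographic-parity idea for a binary classifier (assuming the standard definition; the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between groups.

    predictions: binary model outputs (0/1); groups: the sensitive
    attribute (e.g. skin type) per sample. A gap of 0 means the
    positive rate is identical across groups (demographic parity).
    """
    predictions = np.asarray(predictions)
    groups = np.asarray(groups)
    rates = [predictions[groups == g].mean() for g in np.unique(groups)]
    return float(max(rates) - min(rates))

# Toy example: group "a" is predicted positive 75% of the time,
# group "b" only 25% of the time, giving a parity gap of 0.5.
preds = np.array([1, 1, 1, 0, 1, 0, 0, 0])
grps = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
gap = demographic_parity_gap(preds, grps)  # 0.75 - 0.25 = 0.5
```

Equalized-odds-style metrics refine this by comparing error rates (true/false positive rates) rather than raw positive rates across the sensitive groups.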
arXiv Detail & Related papers (2022-08-22T01:54:23Z)
- EdgeMixup: Improving Fairness for Skin Disease Classification and Segmentation [9.750368551427494]
Skin lesions can be an early indicator of a wide range of infectious and other diseases.
The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients.
These models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones.
arXiv Detail & Related papers (2022-02-28T15:33:31Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content presented here (including all information) and is not responsible for any consequences of its use.