Colorimeter-Supervised Skin Tone Estimation from Dermatoscopic Images for Fairness Auditing
- URL: http://arxiv.org/abs/2602.10265v1
- Date: Tue, 10 Feb 2026 20:20:45 GMT
- Title: Colorimeter-Supervised Skin Tone Estimation from Dermatoscopic Images for Fairness Auditing
- Authors: Marin Benčević, Krešimir Romić, Ivana Hartmann Tolić, Irena Galić
- Abstract summary: We develop neural networks that predict Fitzpatrick skin type via ordinal regression and the Individual Typology Angle (ITA) via color regression. We release code and pretrained models as an open-source tool for rapid skin-tone annotation and bias auditing. This is, to our knowledge, the first dermatoscopic skin-tone estimation neural network validated against colorimeter measurements.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural-network-based diagnosis from dermatoscopic images is increasingly used for clinical decision support, yet studies report performance disparities across skin tones. Fairness auditing of these models is limited by the lack of reliable skin-tone annotations in public dermatoscopy datasets. We address this gap with neural networks that predict Fitzpatrick skin type via ordinal regression and the Individual Typology Angle (ITA) via color regression, using in-person Fitzpatrick labels and colorimeter measurements as targets. We further leverage extensive pretraining on synthetic and real dermatoscopic and clinical images. The Fitzpatrick model achieves agreement comparable to human crowdsourced annotations, and ITA predictions show high concordance with colorimeter-derived ITA, substantially outperforming pixel-averaging approaches. Applying these estimators to ISIC 2020 and MILK10k, we find that fewer than 1% of subjects belong to Fitzpatrick types V and VI. We release code and pretrained models as an open-source tool for rapid skin-tone annotation and bias auditing. This is, to our knowledge, the first dermatoscopic skin-tone estimation neural network validated against colorimeter measurements, and it supports growing evidence of clinically relevant performance gaps across skin-tone groups.
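For readers unfamiliar with ITA, the following is a minimal sketch of how the angle is conventionally computed from CIELAB values, ITA = arctan((L* − 50) / b*) in degrees, together with the commonly cited six-band grouping. It is not the authors' model or code: the function names `ita_degrees` and `ita_category` are hypothetical, the band thresholds are the ones usually quoted in the ITA literature rather than anything stated in this abstract, and `atan2` is used in place of a plain arctangent only so that b* = 0 does not raise an error.

```python
import math

# Commonly cited ITA bands (lower bound in degrees, label).
# Assumption: these thresholds come from the general ITA literature,
# not from the paper summarized on this page.
ITA_BANDS = [
    (55.0, "very light"),
    (41.0, "light"),
    (28.0, "intermediate"),
    (10.0, "tan"),
    (-30.0, "brown"),
]


def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    Equivalent to arctan((L* - 50) / b*) whenever b* > 0, which is
    the usual case for skin; atan2 avoids division by zero at b* = 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_category(ita: float) -> str:
    """Map an ITA value to one of six descriptive bands."""
    for lower_bound, label in ITA_BANDS:
        if ita > lower_bound:
            return label
    return "dark"


if __name__ == "__main__":
    # Hypothetical colorimeter reading, for illustration only.
    angle = ita_degrees(70.0, 20.0)
    print(round(angle, 2), ita_category(angle))
```

A pixel-averaging baseline of the kind the abstract compares against would feed this formula with L* and b* averaged over image pixels, whereas the paper's approach regresses toward colorimeter-derived values instead.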
Related papers
- Adapting Large Language Models to Mitigate Skin Tone Biases in Clinical Dermatology Tasks: A Mixed-Methods Study [2.3034630097498883]
We evaluated performance biases in SkinGPT-4 across skin tones on common skin diseases. We leveraged the SkinGPT-4 backbone to develop finetuned models for custom skin disease classification tasks.
arXiv Detail & Related papers (2025-09-28T09:40:40Z) - Skin Color Measurement from Dermatoscopic Images: An Evaluation on a Synthetic Dataset [0.0]
We assess four classes of image colorimetry approaches: segmentation-based, patch-based, color quantization, and neural networks. Our results show that segmentation-based and color quantization methods yield robust, lighting-invariant estimates. Neural network models, particularly when combined with heavy blurring to reduce overfitting, can provide light-invariant Fitzpatrick predictions.
arXiv Detail & Related papers (2025-04-06T13:57:34Z) - FairSkin: Fair Diffusion for Skin Disease Image Generation [54.29840149709033]
Diffusion models (DMs) have become a leading method for generating synthetic medical images, but they suffer from a critical twofold bias.
We propose FairSkin, a novel DM framework that mitigates these biases through a three-level resampling mechanism.
Our approach significantly improves the diversity and quality of generated images, contributing to more equitable skin disease detection in clinical settings.
arXiv Detail & Related papers (2024-10-29T21:37:03Z) - DDI-CoCo: A Dataset For Understanding The Effect Of Color Contrast In Machine-Assisted Skin Disease Detection [51.92255321684027]
We study the interaction between skin tone and color difference effects and suggest that color difference can be an additional reason behind model performance bias between skin tones.
Our work provides a complementary angle to dermatology AI for improving skin disease detection.
arXiv Detail & Related papers (2024-01-24T07:45:24Z) - Revisiting Skin Tone Fairness in Dermatological Lesion Classification [3.247628857305427]
We review and compare four ITA-based approaches of skin tone classification on the ISIC18 dataset.
Our analyses reveal high disagreement among previously published studies, demonstrating the risks of ITA-based skin tone estimation methods.
We investigate the causes of such large discrepancies among these approaches and find that the lack of diversity in the ISIC18 dataset limits its use as a testbed for fairness analysis.
arXiv Detail & Related papers (2023-08-18T15:59:55Z) - How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers? [49.35105290167996]
Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance.
This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.
arXiv Detail & Related papers (2023-08-17T20:40:30Z) - Automatic Facial Skin Feature Detection for Everyone [60.31670960526022]
We present an automatic facial skin feature detection method that works across a variety of skin tones and age groups for selfies in the wild.
Specifically, we annotate the locations of acne, pigmentation, and wrinkles in selfie images with different skin tones, severity levels, and lighting conditions.
arXiv Detail & Related papers (2022-03-30T04:52:54Z) - EdgeMixup: Improving Fairness for Skin Disease Classification and Segmentation [9.750368551427494]
Skin lesions can be an early indicator of a wide range of infectious and other diseases.
The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients.
These models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones.
arXiv Detail & Related papers (2022-02-28T15:33:31Z) - Malignancy Prediction and Lesion Identification from Clinical Dermatological Images [65.1629311281062]
We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images.
Our method first identifies all lesions present in the image regardless of sub-type or likelihood of malignancy, then estimates their likelihood of malignancy, and through aggregation also generates an image-level likelihood of malignancy.
arXiv Detail & Related papers (2021-04-02T20:52:05Z) - Alleviating the Incompatibility between Cross Entropy Loss and Episode Training for Few-shot Skin Disease Classification [76.89093364969253]
We propose to apply Few-Shot Learning to skin disease identification to address the extreme scarcity of training samples.
Based on a detailed analysis, we propose the Query-Relative (QR) loss, which proves superior to Cross Entropy (CE) under episode training.
We further strengthen the proposed QR loss with a novel adaptive hard margin strategy.
arXiv Detail & Related papers (2020-04-21T00:57:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.