Reliability and Validity of Image-Based and Self-Reported Skin Phenotype
Metrics
- URL: http://arxiv.org/abs/2106.11240v1
- Date: Fri, 18 Jun 2021 16:12:24 GMT
- Title: Reliability and Validity of Image-Based and Self-Reported Skin Phenotype
Metrics
- Authors: John J. Howard, Yevgeniy B. Sirotin, Jerry L. Tipton, and Arun R.
Vemury
- Abstract summary: We show that measures of skin-tone for biometric performance evaluations must come from objective, characterized, and controlled sources.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With increasing adoption of face recognition systems, it is important to
ensure adequate performance of these technologies across demographic groups.
Recently, phenotypes such as skin-tone have been proposed as superior
alternatives to traditional race categories when exploring performance
differentials. However, there is little consensus regarding how to
appropriately measure skin-tone in evaluations of biometric performance or in
AI more broadly. In this study, we explore the relationship between
face-area-lightness-measures (FALMs) estimated from images and ground-truth
skin readings collected using a device designed to measure human skin. FALMs
estimated from different images of the same individual varied significantly
relative to ground-truth FALM. This variation was only reduced by greater
control of acquisition (camera, background, and environment). Next, we compare
ground-truth FALM to Fitzpatrick Skin Types (FST) categories obtained using the
standard, in-person, medical survey and show FST is poorly predictive of
skin-tone. Finally, we show how noisy estimation of FALM leads to errors
selecting explanatory factors for demographic differentials. These results
demonstrate that measures of skin-tone for biometric performance evaluations
must come from objective, characterized, and controlled sources. Further,
despite this being a currently practiced approach, estimating FST categories
and FALMs from uncontrolled imagery does not provide an appropriate measure of
skin-tone.
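The abstract's point that noisy estimation of FALM leads to errors in selecting explanatory factors can be illustrated with a small simulation. The sketch below is not the authors' analysis: the data are synthetic, and the names `simulate`, `true_light`, `noisy_light`, and `score`, along with all distribution parameters, are assumptions chosen for illustration. It shows how adding acquisition noise to a lightness measure weakens its observed correlation with an outcome that truly depends on it.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=2000, noise_sd=10.0, seed=0):
    """Return (r_true, r_noisy) for synthetic data.

    All quantities are hypothetical and in arbitrary units:
    - true_light: stand-in for a controlled, device-based skin reading
    - score: stand-in for a biometric outcome that depends on true_light
    - noisy_light: stand-in for an estimate from uncontrolled imagery
    """
    rng = random.Random(seed)
    true_light = [rng.gauss(50.0, 10.0) for _ in range(n)]
    score = [0.5 * t + rng.gauss(0.0, 5.0) for t in true_light]
    noisy_light = [t + rng.gauss(0.0, noise_sd) for t in true_light]
    return pearson(true_light, score), pearson(noisy_light, score)


if __name__ == "__main__":
    r_true, r_noisy = simulate()
    print(f"correlation using ground-truth measure: {r_true:.2f}")
    print(f"correlation using noisy estimate:       {r_noisy:.2f}")
```

With these assumed parameters, the population correlation is about 0.71 for the ground-truth measure and about 0.50 for the noisy estimate, so a model-selection procedure comparing candidate factors could pass over the noisy measure even though the underlying quantity matters.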
Related papers
- Evaluating Machine Learning-based Skin Cancer Diagnosis [0.0]
The research assesses two convolutional neural network architectures: a MobileNet-based model and a custom CNN model.
Both models are evaluated for their ability to classify skin lesions into seven categories and to distinguish between dangerous and benign lesions.
The study concludes that while the models show promise in explainability, further development is needed to ensure fairness across different skin tones.
arXiv Detail & Related papers (2024-09-04T02:44:48Z)
- Fairness Under Cover: Evaluating the Impact of Occlusions on Demographic Bias in Facial Recognition [0.0]
We evaluate the effect of occlusions on the performance of face recognition models trained on the BUPT-Balanced and BUPT-GlobalFace datasets.
We propose a new metric, Face Occlusion Impact Ratio (FOIR), that quantifies the extent to which occlusions affect model performance across different demographic groups.
arXiv Detail & Related papers (2024-08-19T17:34:19Z)
- Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary
Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z)
- DDI-CoCo: A Dataset For Understanding The Effect Of Color Contrast In
Machine-Assisted Skin Disease Detection [51.92255321684027]
We study the interaction between skin tone and color difference effects and suggest that color difference can be an additional reason behind model performance bias between skin tones.
Our work provides a complementary angle to dermatology AI for improving skin disease detection.
arXiv Detail & Related papers (2024-01-24T07:45:24Z)
- CIRCLe: Color Invariant Representation Learning for Unbiased
Classification of Skin Lesions [16.65329510916639]
We propose CIRCLe, a skin color invariant deep representation learning method for improving fairness in skin lesion classification.
We demonstrate CIRCLe's superior performance over the state-of-the-art when evaluated on 16k+ images spanning 6 Fitzpatrick skin types and 114 diseases.
arXiv Detail & Related papers (2022-08-29T12:06:10Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of
Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Treatment Learning Causal Transformer for Noisy Image Classification [62.639851972495094]
In this work, we incorporate this binary information of "existence of noise" as treatment into image classification tasks to improve prediction accuracy.
Motivated by causal variational inference, we propose a transformer-based architecture that uses a latent generative model to estimate robust feature representations for noisy image classification.
We also create new noisy image datasets incorporating a wide range of noise factors for performance benchmarking.
arXiv Detail & Related papers (2022-03-29T13:07:53Z)
- EdgeMixup: Improving Fairness for Skin Disease Classification and
Segmentation [9.750368551427494]
Skin lesions can be an early indicator of a wide range of infectious and other diseases.
The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients.
These models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones.
arXiv Detail & Related papers (2022-02-28T15:33:31Z)
- Unsupervised Learning Facial Parameter Regressor for Action Unit
Intensity Estimation via Differentiable Renderer [51.926868759681014]
We present a framework to predict the facial parameters based on a bone-driven face model (BDFM) under different views.
The proposed framework consists of a feature extractor, a generator, and a facial parameter regressor.
arXiv Detail & Related papers (2020-08-20T09:49:13Z)
- Alleviating the Incompatibility between Cross Entropy Loss and Episode
Training for Few-shot Skin Disease Classification [76.89093364969253]
We propose to apply Few-Shot Learning to skin disease identification to address the extreme scarcity of training samples.
Based on a detailed analysis, we propose the Query-Relative (QR) loss, which proves superior to Cross Entropy (CE) under episode training.
We further strengthen the proposed QR loss with a novel adaptive hard margin strategy.
arXiv Detail & Related papers (2020-04-21T00:57:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.