Evaluating Machine Learning-based Skin Cancer Diagnosis
- URL: http://arxiv.org/abs/2409.03794v1
- Date: Wed, 4 Sep 2024 02:44:48 GMT
- Title: Evaluating Machine Learning-based Skin Cancer Diagnosis
- Authors: Tanish Jain
- Abstract summary: The research assesses two convolutional neural network architectures: a MobileNet-based model and a custom CNN model.
Both models are evaluated for their ability to classify skin lesions into seven categories and to distinguish between dangerous and benign lesions.
The study concludes that while the models show promise in explainability, further development is needed to ensure fairness across different skin tones.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study evaluates the reliability of two deep learning models for skin cancer detection, focusing on their explainability and fairness. Using the HAM10000 dataset of dermatoscopic images, the research assesses two convolutional neural network architectures: a MobileNet-based model and a custom CNN model. Both models are evaluated for their ability to classify skin lesions into seven categories and to distinguish between dangerous and benign lesions. Explainability is assessed using Saliency Maps and Integrated Gradients, with results interpreted by a dermatologist. The study finds that both models generally highlight relevant features for most lesion types, although they struggle with certain classes like seborrheic keratoses and vascular lesions. Fairness is evaluated using the Equalized Odds metric across sex and skin tone groups. While both models demonstrate fairness across sex groups, they show significant disparities in false positive and false negative rates between light and dark skin tones. A Calibrated Equalized Odds postprocessing strategy is applied to mitigate these disparities, resulting in improved fairness, particularly in reducing false negative rate differences. The study concludes that while the models show promise in explainability, further development is needed to ensure fairness across different skin tones. These findings underscore the importance of rigorous evaluation of AI models in medical applications, particularly in diverse population groups.
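The abstract's fairness evaluation compares false positive and false negative rates between groups, which is the quantity Equalized Odds is concerned with. The following is a minimal sketch of that comparison for a binary dangerous-vs-benign task; it is not the paper's code. The function names, the group labels "light" and "dark", and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: (fpr, fnr)} for binary labels (1 = dangerous, 0 = benign)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "fp" if p == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["tp"] + c["fn"]
        # A rate is undefined (NaN) when a group has no examples of that class.
        fpr = c["fp"] / negatives if negatives else float("nan")
        fnr = c["fn"] / positives if positives else float("nan")
        rates[g] = (fpr, fnr)
    return rates


def equalized_odds_gaps(rates, group_a, group_b):
    """Absolute FPR and FNR differences between two groups (0 means equal rates)."""
    fpr_a, fnr_a = rates[group_a]
    fpr_b, fnr_b = rates[group_b]
    return abs(fpr_a - fpr_b), abs(fnr_a - fnr_b)


# Toy data, invented for illustration only (not results from the study).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["light"] * 4 + ["dark"] * 4

rates = group_error_rates(y_true, y_pred, groups)
fpr_gap, fnr_gap = equalized_odds_gaps(rates, "light", "dark")
print(rates, fpr_gap, fnr_gap)
```

Under Equalized Odds, both gaps would ideally be close to zero. A postprocessing step such as the Calibrated Equalized Odds strategy named in the abstract adjusts predictions to shrink them, and is not implemented in this sketch.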
Related papers
- FairSkin: Fair Diffusion for Skin Disease Image Generation [54.29840149709033]
Diffusion models (DMs) have become a leading method for generating synthetic medical images, but they suffer from a critical twofold bias.
We propose FairSkin, a novel DM framework that mitigates these biases through a three-level resampling mechanism.
Our approach significantly improves the diversity and quality of generated images, contributing to more equitable skin disease detection in clinical settings.
arXiv Detail & Related papers (2024-10-29T21:37:03Z)
- DDI-CoCo: A Dataset For Understanding The Effect Of Color Contrast In Machine-Assisted Skin Disease Detection [51.92255321684027]
We study the interaction between skin tone and color difference effects and suggest that color difference can be an additional reason behind model performance bias between skin tones.
Our work provides a complementary angle to dermatology AI for improving skin disease detection.
arXiv Detail & Related papers (2024-01-24T07:45:24Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
- CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions [16.65329510916639]
We propose CIRCLe, a skin color invariant deep representation learning method for improving fairness in skin lesion classification.
We demonstrate CIRCLe's superior performance over the state-of-the-art when evaluated on 16k+ images spanning 6 Fitzpatrick skin types and 114 diseases.
arXiv Detail & Related papers (2022-08-29T12:06:10Z)
- FairDisCo: Fairer AI in Dermatology via Disentanglement Contrastive Learning [11.883809920936619]
We propose FairDisCo, a disentanglement deep learning framework with contrastive learning.
We compare FairDisCo to three fairness methods, namely, resampling, reweighting, and attribute-aware.
We adapt two fairness-based metrics, DPM and EOM, to our task with multiple classes and sensitive attributes, highlighting the skin-type bias in skin lesion classification.
arXiv Detail & Related papers (2022-08-22T01:54:23Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- EdgeMixup: Improving Fairness for Skin Disease Classification and Segmentation [9.750368551427494]
Skin lesions can be an early indicator of a wide range of infectious and other diseases.
The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients.
These models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones.
arXiv Detail & Related papers (2022-02-28T15:33:31Z)
- SuperCon: Supervised Contrastive Learning for Imbalanced Skin Lesion Classification [9.265557367859637]
SuperCon is a two-stage training strategy to overcome the class imbalance problem on skin lesion classification.
Our two-stage training strategy effectively addresses the class imbalance classification problem, and significantly improves existing works in terms of F1-score and AUC score.
arXiv Detail & Related papers (2022-02-11T15:19:36Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Analysis of skin lesion images with deep learning [0.0]
We evaluate the current state of the art in the classification of dermoscopic images.
Various deep neural network architectures pre-trained on the ImageNet data set are adapted to a combined training data set.
The performance and applicability of these models for the detection of eight classes of skin lesions are examined.
arXiv Detail & Related papers (2021-01-11T10:58:36Z)