Related papers: Robustness and sex differences in skin cancer detection: logistic regression vs CNNs

URL: http://arxiv.org/abs/2504.11415v1
Date: Tue, 15 Apr 2025 17:31:46 GMT
Title: Robustness and sex differences in skin cancer detection: logistic regression vs CNNs
Authors: Nikolette Pedersen, Regitze Sydendal, Andreas Wulff, Ralf Raumanns, Eike Petersen, Veronika Cheplygina
Abstract summary: This study is a replication of a study on Alzheimer's disease which studied robustness of logistic regression (LR) and convolutional neural networks (CNN) across patient sexes. We evaluate these models in alignment with [28]: across multiple training datasets with varied sex composition to determine their robustness. Our results show that both the LR and the CNN were robust to the sex distributions, but the results also revealed that the CNN had a significantly higher accuracy (ACC) and area under the receiver operating characteristics (AUROC) for male patients than for female patients.
Score: 1.758593528245578
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning has been reported to achieve high performance in the detection of skin cancer, yet many challenges regarding the reproducibility of results and biases remain. This study is a replication (different data, same analysis) of a study on Alzheimer's disease [28] which studied robustness of logistic regression (LR) and convolutional neural networks (CNN) across patient sexes. We explore sex bias in skin cancer detection, using the PAD-UFES-20 dataset with LR trained on handcrafted features reflecting dermatological guidelines (ABCDE and the 7-point checklist), and a pre-trained ResNet-50 model. We evaluate these models in alignment with [28]: across multiple training datasets with varied sex composition to determine their robustness. Our results show that both the LR and the CNN were robust to the sex distributions, but the results also revealed that the CNN had a significantly higher accuracy (ACC) and area under the receiver operating characteristics (AUROC) for male patients than for female patients. We hope these findings contribute to the growing field of investigating potential bias in popular medical machine learning methods. The data and relevant scripts to reproduce our results can be found in our GitHub.
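
Illustration: the evaluation idea in the abstract (training sets with varied sex composition, metrics reported per sex) can be sketched roughly as below. This is a minimal sketch and not the authors' code: the record layout, the "sex" key, the helper names `sample_by_sex_ratio` and `per_sex_accuracy`, the subset size, and the list of ratios are all assumptions made for illustration, and the toy records are synthetic rather than drawn from PAD-UFES-20.

```python
import random
from collections import defaultdict


def sample_by_sex_ratio(records, female_fraction, n, seed=0):
    """Draw n training records with the requested share of female patients.

    `records` is a list of dicts with a hypothetical "sex" key ("F" or "M").
    """
    rng = random.Random(seed)
    females = [r for r in records if r["sex"] == "F"]
    males = [r for r in records if r["sex"] == "M"]
    n_female = round(n * female_fraction)
    n_male = n - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough records for the requested composition")
    subset = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(subset)
    return subset


def per_sex_accuracy(y_true, y_pred, sex):
    """Return accuracy computed separately for each sex group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, s in zip(y_true, y_pred, sex):
        total[s] += 1
        correct[s] += int(t == p)
    return {s: correct[s] / total[s] for s in total}


# Toy data: 100 female and 100 male records (synthetic, for illustration only).
records = [{"id": i, "sex": "F" if i < 100 else "M"} for i in range(200)]

# Training sets ranging from all-male to all-female, each with 80 records.
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    train = sample_by_sex_ratio(records, fraction, n=80)
    n_f = sum(r["sex"] == "F" for r in train)
    print(f"female fraction {fraction:.2f}: {n_f} F, {len(train) - n_f} M")
```

In a fuller experiment, one model would be trained on each subset and `per_sex_accuracy` applied to its predictions on a fixed test set, so that male and female performance can be compared across training compositions.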

Related papers

Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z)
SpurBreast: A Curated Dataset for Investigating Spurious Correlations in Real-world Breast MRI Classification [0.4999814847776096]
We introduce SpurBreast, a curated breast MRI dataset that intentionally incorporates spurious correlations to evaluate their impact on model performance. We analyze over 100 features involving patient, device, and imaging protocol, and identify two dominant spurious signals: magnetic field strength and image orientation. Through controlled dataset splits, we demonstrate that DNNs can exploit these non-clinical signals, achieving high validation accuracy while failing to generalize to unbiased test data.
arXiv Detail & Related papers (2025-10-02T15:16:20Z)
Demographic Predictability in 3D CT Foundation Embeddings [0.0]
Self-supervised foundation models have been successfully extended to encode 3D computed tomography (CT) images. We evaluate whether these embeddings capture demographic information, such as age, sex, or race.
arXiv Detail & Related papers (2024-11-28T04:26:39Z)
Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas. This study aims to utilize a specially MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
A study on Deep Convolutional Neural Networks, Transfer Learning and Ensemble Model for Breast Cancer Detection [2.5748316361772963]
This study compares the performance of D-CNN, which includes the original CNN, transfer learning, and an ensemble model, in detecting breast cancer. The ensemble model provides the highest accuracy, 99.94%, for breast cancer detection and classification. The high accuracy in detecting and categorising breast cancer using CNN suggests that the CNN model is promising in breast cancer detection.
arXiv Detail & Related papers (2024-09-10T17:58:21Z)
Dataset Distribution Impacts Model Fairness: Single vs. Multi-Task Learning [2.9530211066840417]
We evaluate the performance of skin lesion classification using ResNet-based CNNs. We present a linear programming method for generating datasets with varying patient sex and class labels.
arXiv Detail & Related papers (2024-07-24T15:23:26Z)
Enhancing Skin Disease Classification Leveraging Transformer-based Deep Learning Architectures and Explainable AI [2.3149142745203326]
Skin diseases affect over a third of the global population, yet their impact is often underestimated. Deep learning techniques have shown much promise for various tasks, including dermatological disease identification. This study uses a skin disease dataset with 31 classes to compare all versions of Vision Transformers, Swin Transformers and DINOv2.
arXiv Detail & Related papers (2024-07-20T05:38:00Z)
Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
Cancer-Net PCa-Gen: Synthesis of Realistic Prostate Diffusion Weighted Imaging Data via Anatomic-Conditional Controlled Latent Diffusion [68.45407109385306]
In Canada, prostate cancer is the most common form of cancer in men and accounted for 20% of new cancer cases for this demographic in 2022. There has been significant interest in the development of deep neural networks for prostate cancer diagnosis, prognosis, and treatment planning using diffusion weighted imaging (DWI) data. In this study, we explore the efficacy of latent diffusion for generating realistic prostate DWI data through the introduction of an anatomic-conditional controlled latent diffusion strategy.
arXiv Detail & Related papers (2023-11-30T15:11:03Z)
Studying the Effects of Sex-related Differences on Brain Age Prediction using brain MR Imaging [0.3958317527488534]
We study biases related to sex when developing a machine learning model based on brain magnetic resonance images (MRI). We investigate the effects of sex by performing brain age prediction considering different experimental designs. We found disparities in the performance of brain age prediction models when trained on distinct sex subgroups and datasets.
arXiv Detail & Related papers (2023-10-17T20:55:53Z)
Feature robustness and sex differences in medical imaging: a case study in MRI-based Alzheimer's disease detection [1.7616042687330637]
We compare two classification schemes on the ADNI MRI dataset. We do not find a strong dependence of model performance for male and female test subjects on the sex composition of the training dataset.
arXiv Detail & Related papers (2022-04-04T17:37:54Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
Deep learning-based COVID-19 pneumonia classification using chest CT images: model generalizability [54.86482395312936]
Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries. We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
arXiv Detail & Related papers (2021-02-18T21:14:52Z)
CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic. The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands. We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.