A Global Atlas of Digital Dermatology to Map Innovation and Disparities
- URL: http://arxiv.org/abs/2601.00840v1
- Date: Sat, 27 Dec 2025 09:22:36 GMT
- Title: A Global Atlas of Digital Dermatology to Map Innovation and Disparities
- Authors: Fabian Gröger, Simone Lionetti, Philippe Gottfrois, Alvaro Gonzalez-Jimenez, Lea Habermacher, Labelling Consortium, Ludovic Amruthalingam, Matthew Groh, Marc Pouly, Alexander A. Navarini
- Abstract summary: We present SkinMap, a multi-modal framework for the first comprehensive audit of the field's entire data basis. We unify the publicly available dermatology datasets into a single, queryable semantic atlas comprising more than 1.1 million images of skin conditions.
- Score: 38.74632415760177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adoption of artificial intelligence in dermatology promises democratized access to healthcare, but model reliability depends on the quality and comprehensiveness of the data fueling these models. Despite rapid growth in publicly available dermatology images, the field lacks quantitative key performance indicators to measure whether new datasets expand clinical coverage or merely replicate what is already known. Here we present SkinMap, a multi-modal framework for the first comprehensive audit of the field's entire data basis. We unify the publicly available dermatology datasets into a single, queryable semantic atlas comprising more than 1.1 million images of skin conditions and quantify (i) informational novelty over time, (ii) dataset redundancy, and (iii) representation gaps across demographics and diagnoses. Despite exponential growth in dataset sizes, informational novelty across time has somewhat plateaued: Some clusters, such as common neoplasms on fair skin, are densely populated, while underrepresented skin types and many rare diseases remain unaddressed. We further identify structural gaps in coverage: Darker skin tones (Fitzpatrick V-VI) constitute only 5.8% of images and pediatric patients only 3.0%, while many rare diseases and phenotype combinations remain sparsely represented. SkinMap provides infrastructure to measure blind spots and steer strategic data acquisition toward undercovered regions of clinical space.
Related papers
- eSkinHealth: A Multimodal Dataset for Neglected Tropical Skin Diseases [29.76522627359553]
eSkinHealth is a novel dataset collected on-site in Cote d'Ivoire and Ghana. It contains 5,623 images from 1,639 cases and encompasses 47 skin diseases. eSkinHealth also includes semantic lesion masks, instance-specific visual captions, and clinical concepts.
arXiv Detail & Related papers (2025-08-26T02:24:49Z) - DermINO: Hybrid Pretraining for a Versatile Dermatology Foundation Model [92.66916452260553]
DermINO is a versatile foundation model for dermatology. It incorporates a novel hybrid pretraining framework that augments the self-supervised learning paradigm. It consistently outperforms state-of-the-art models across a wide range of tasks.
arXiv Detail & Related papers (2025-08-17T00:41:39Z) - LesionGen: A Concept-Guided Diffusion Model for Dermatology Image Synthesis [4.789822624169502]
We introduce LesionGen, a clinically informed T2I-DPM framework for dermatology image synthesis. LesionGen is trained on structured, concept-rich dermatological captions derived from expert annotations and pseudo-generated, concept-guided reports. Our results demonstrate that models trained solely on our synthetic dataset achieve classification accuracy comparable to those trained on real images.
arXiv Detail & Related papers (2025-07-30T18:07:34Z) - DermaCon-IN: A Multi-concept Annotated Dermatological Image Dataset of Indian Skin Disorders for Clinical AI Research [3.3114401663331137]
DermaCon-IN is a prospectively curated dataset of over 5,450 clinical images from approximately 3,000 patients in South India. Each image is annotated by board-certified dermatologists with over 240 distinct diagnoses, structured under a hierarchical, etiology-based taxonomy. The dataset captures a wide spectrum of dermatologic conditions and tonal variation commonly seen in Indian outpatient care.
arXiv Detail & Related papers (2025-06-06T13:59:08Z) - Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology [20.650401805716744]
We present Derm1M, the first large-scale vision-language dataset for dermatology, comprising 1,029,761 image-text pairs. To demonstrate Derm1M's potential in advancing both AI research and clinical application, we pretrained a series of CLIP-like models, collectively called DermLIP, on this dataset.
arXiv Detail & Related papers (2025-03-19T05:30:01Z) - A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis [48.84443450990355]
Although deep networks have achieved broad success in analyzing natural images, they often fail in unexpected situations when applied to medical scans.
We investigate this challenge and focus on model sensitivity to domain shifts, such as data sampled from different hospitals or data confounded by demographic variables such as sex and race, in the context of chest X-rays and skin lesion images.
Taking inspiration from medical training, we propose giving deep networks a prior grounded in explicit medical knowledge communicated in natural language.
arXiv Detail & Related papers (2024-05-23T17:55:02Z) - Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary
Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z) - RudolfV: A Foundation Model by Pathologists for Pathologists [13.17203220753175]
We present a novel approach to designing foundation models for computational pathology.
Our model "RudolfV" surpasses existing state-of-the-art foundation models across different benchmarks.
arXiv Detail & Related papers (2024-01-08T18:31:38Z) - Generative models improve fairness of medical classifiers under
distribution shifts [49.10233060774818]
We show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.
We demonstrate that these learned augmentations can surpass heuristic ones by making models more robust and statistically fair in- and out-of-distribution.
arXiv Detail & Related papers (2023-04-18T18:15:38Z) - Relational Subsets Knowledge Distillation for Long-tailed Retinal
Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.