A Novel Perspective for Multi-modal Multi-label Skin Lesion Classification
- URL: http://arxiv.org/abs/2409.12390v1
- Date: Thu, 19 Sep 2024 01:31:38 GMT
- Title: A Novel Perspective for Multi-modal Multi-label Skin Lesion Classification
- Authors: Yuan Zhang, Yutong Xie, Hu Wang, Jodie C Avery, M Louise Hull, Gustavo Carneiro
- Abstract summary: This paper introduces the Skin Lesion Classifier, utilizing a Multi-modal Multi-label TransFormer-based model (SkinM2Former).
SkinM2Former achieves a mean average accuracy of 77.27% and a mean diagnostic accuracy of 77.85% on the public Derm7pt dataset, outperforming state-of-the-art (SOTA) methods.
- Score: 20.05980794886644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The efficacy of deep learning-based Computer-Aided Diagnosis (CAD) methods for skin diseases relies on analyzing multiple data modalities (i.e., clinical+dermoscopic images, and patient metadata) and addressing the challenges of multi-label classification. Current approaches tend to rely on limited multi-modal techniques and treat the multi-label problem as a multiple multi-class problem, overlooking issues related to imbalanced learning and multi-label correlation. This paper introduces the innovative Skin Lesion Classifier, utilizing a Multi-modal Multi-label TransFormer-based model (SkinM2Former). For multi-modal analysis, we introduce the Tri-Modal Cross-attention Transformer (TMCT) that fuses the three image and metadata modalities at various feature levels of a transformer encoder. For multi-label classification, we introduce a multi-head attention (MHA) module to learn multi-label correlations, complemented by an optimisation that handles multi-label and imbalanced learning problems. SkinM2Former achieves a mean average accuracy of 77.27% and a mean diagnostic accuracy of 77.85% on the public Derm7pt dataset, outperforming state-of-the-art (SOTA) methods.
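The cross-attention fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' TMCT: it is plain single-head scaled dot-product cross-attention in pure Python, with no learned projections, and the names `cross_attention`, `image_tokens`, and `metadata_tokens` as well as the toy values are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    Each query token (from one modality) is replaced by a weighted
    average of the value tokens (from another modality). The weights
    come from the softmax of the scaled query-key dot products.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Hypothetical toy tokens: two image tokens attend over three metadata tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
metadata_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(image_tokens, metadata_tokens, metadata_tokens)
```

Each row of `fused` is a convex combination of the metadata tokens, so information from the second modality flows into the representation of the first. A real model would add learned query, key, and value projections and multiple heads, and would apply such fusion at several encoder levels, as the abstract describes.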
Related papers
- UNICORN: A Deep Learning Model for Integrating Multi-Stain Data in Histopathology [2.9389205138207277]
UNICORN is a multi-modal transformer capable of processing multi-stain histopathology for atherosclerosis severity class prediction.
The architecture comprises a two-stage, end-to-end trainable model with specialized modules utilizing transformer self-attention blocks.
UNICORN achieved a classification accuracy of 0.67, outperforming other state-of-the-art models.
arXiv Detail & Related papers (2024-09-26T12:13:52Z) - Automated Ensemble Multimodal Machine Learning for Healthcare [52.500923923797835]
We introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning.
AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies.
arXiv Detail & Related papers (2024-07-25T17:46:38Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Self-Supervised Multi-Modality Learning for Multi-Label Skin Lesion Classification [15.757141597485374]
We propose a self-supervised learning algorithm for multi-modality skin lesion classification.
Our algorithm enables the multi-modality learning by maximizing the similarities between paired dermoscopic and clinical images.
Our results show our algorithm achieved better performances than other state-of-the-art SSL counterparts.
arXiv Detail & Related papers (2023-10-28T04:16:08Z) - A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z) - Transformer-based interpretable multi-modal data fusion for skin lesion classification [0.40964539027092917]
In skin lesion classification in dermatology, deep learning systems are still in their infancy due to the limited transparency of their decision-making process.
Our method beats other state-of-the-art single- and multi-modal DL architectures in image-rich and patient-data-rich environments.
arXiv Detail & Related papers (2023-04-03T11:45:27Z) - Reliable Representations Learning for Incomplete Multi-View Partial Multi-Label Classification [78.15629210659516]
In this paper, we propose an incomplete multi-view partial multi-label classification network named RANK.
We break through the view-level weights inherent in existing methods and propose a quality-aware sub-network to dynamically assign quality scores to each view of each sample.
Our model is not only able to handle complete multi-view multi-label datasets, but also works on datasets with missing instances and labels.
arXiv Detail & Related papers (2023-03-30T03:09:25Z) - Medical Diagnosis with Large Scale Multimodal Transformers: Leveraging Diverse Data for More Accurate Diagnosis [0.15776842283814416]
We present a new technical approach of "learnable synergies".
Our approach is easily scalable and naturally adapts to multimodal data inputs from clinical routine.
It outperforms state-of-the-art models in clinically relevant diagnosis tasks.
arXiv Detail & Related papers (2022-12-18T20:43:37Z) - Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis [25.958167380664083]
We propose a novel tensor-based multi-modality feature selection and regression method for diagnosis and biomarker identification of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI).
We present the practical advantages of our method for the analysis of ADNI data using three imaging modalities.
arXiv Detail & Related papers (2022-09-23T02:17:27Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)