Skin Cancer Classification: Hybrid CNN-Transformer Models with KAN-Based Fusion
- URL: http://arxiv.org/abs/2508.12484v1
- Date: Sun, 17 Aug 2025 19:57:34 GMT
- Title: Skin Cancer Classification: Hybrid CNN-Transformer Models with KAN-Based Fusion
- Authors: Shubhi Agarwal, Amulya Kumar Mahto,
- Abstract summary: We explore Sequential and Parallel Hybrid CNN-Transformer models with Convolutional Kolmogorov-Arnold Network (CKAN)<n>Our approach integrates transfer learning and extensive data augmentation, where CNNs extract local spatial features, Transformers model global dependencies, and CKAN facilitates nonlinear feature fusion for improved representation learning.<n>Our proposed approach achieves competitive performance in skin cancer classification, demonstrating 92.81% accuracy and 92.47% F1-score on the HAM10000 dataset, 97.83% accuracy and 97.83% F1-score on the PAD-UFES dataset, and 91.17% accuracy with 91.79% F1- score on
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Skin cancer classification is a crucial task in medical image analysis, where precise differentiation between malignant and non-malignant lesions is essential for early diagnosis and treatment. In this study, we explore Sequential and Parallel Hybrid CNN-Transformer models with Convolutional Kolmogorov-Arnold Network (CKAN). Our approach integrates transfer learning and extensive data augmentation, where CNNs extract local spatial features, Transformers model global dependencies, and CKAN facilitates nonlinear feature fusion for improved representation learning. To assess generalization, we evaluate our models on multiple benchmark datasets (HAM10000,BCN20000 and PAD-UFES) under varying data distributions and class imbalances. Experimental results demonstrate that hybrid CNN-Transformer architectures effectively capture both spatial and contextual features, leading to improved classification performance. Additionally, the integration of CKAN enhances feature fusion through learnable activation functions, yielding more discriminative representations. Our proposed approach achieves competitive performance in skin cancer classification, demonstrating 92.81% accuracy and 92.47% F1-score on the HAM10000 dataset, 97.83% accuracy and 97.83% F1-score on the PAD-UFES dataset, and 91.17% accuracy with 91.79% F1- score on the BCN20000 dataset highlighting the effectiveness and generalizability of our model across diverse datasets. This study highlights the significance of feature representation and model design in advancing robust and accurate medical image classification.
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks.<n>We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models.<n>Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z) - Enhancing Generalization in Sickle Cell Disease Diagnosis through Ensemble Methods and Feature Importance Analysis [2.8601718604194706]
This work presents a novel approach for selecting the optimal ensemble-based classification method and features.<n>It provides diagnostic support for Sickle Cell Disease using peripheral blood smear images of red blood cells.
arXiv Detail & Related papers (2026-01-19T13:00:41Z) - Enhanced Leukemic Cell Classification Using Attention-Based CNN and Data Augmentation [0.5371337604556311]
Acute lymphoblastic leukemia (ALL) is the most common childhood cancer, requiring expert microscopic diagnosis.<n>The proposed system integrates an attention-based convolutional neural network combining EfficientNetV2-B3 with Squeeze-and-Excitation mechanisms for automated ALL cell classification.<n>Our approach employs comprehensive data augmentation, focal loss for class imbalance, and patient-wise data splitting to ensure robust and reproducible evaluation.
arXiv Detail & Related papers (2026-01-03T01:24:11Z) - CoTCoNet: An Optimized Coupled Transformer-Convolutional Network with an Adaptive Graph Reconstruction for Leukemia Detection [0.3573481101204926]
We propose an optimized Coupled Transformer Convolutional Network (CoTCoNet) framework for the classification of leukemia.
Our framework captures comprehensive global features and scalable spatial patterns, enabling the identification of complex and large-scale hematological features.
It achieves remarkable accuracy and F1-Score rates of 0.9894 and 0.9893, respectively.
arXiv Detail & Related papers (2024-10-11T13:31:28Z) - Comparative Analysis and Ensemble Enhancement of Leading CNN Architectures for Breast Cancer Classification [0.0]
This study introduces a novel and accurate approach to breast cancer classification using histopathology images.
It systematically compares leading Convolutional Neural Network (CNN) models across varying image datasets.
Our findings establish the settings required to achieve exceptional classification accuracy for standalone CNN models.
arXiv Detail & Related papers (2024-10-04T11:31:43Z) - Classification of Endoscopy and Video Capsule Images using CNN-Transformer Model [1.0994755279455526]
This study proposes a hybrid model that combines the advantages of Transformers and Convolutional Neural Networks (CNNs) to enhance classification performance.
For the GastroVision dataset, our proposed model demonstrates excellent performance with Precision, Recall, F1 score, Accuracy, and Matthews Correlation Coefficient (MCC) of 0.8320, 0.8386, 0.8324, 0.8386, and 0.8191, respectively.
arXiv Detail & Related papers (2024-08-20T11:05:32Z) - Predictive Analytics of Varieties of Potatoes [2.336821989135698]
We explore the application of machine learning algorithms specifically to enhance the selection process of Russet potato clones in breeding trials.
This study addresses the challenge of efficiently identifying high-yield, disease-resistant, and climate-resilient potato varieties.
arXiv Detail & Related papers (2024-04-04T00:49:05Z) - CIMIL-CRC: a clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H\&E stained images [42.771819949806655]
We introduce CIMIL-CRC', a framework that solves the MSI/MSS MIL problem by efficiently combining a pre-trained feature extraction model with principal component analysis (PCA) to aggregate information from all patches.
We assessed our CIMIL-CRC method using the average area under the curve (AUC) from a 5-fold cross-validation experimental setup for model development on the TCGA-CRC-DX cohort.
arXiv Detail & Related papers (2024-01-29T12:56:11Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Classification of lung cancer subtypes on CT images with synthetic
pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images.
We propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask
CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.