Related papers: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices

Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices

URL: http://arxiv.org/abs/2512.08751v1
Date: Tue, 09 Dec 2025 16:01:48 GMT
Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices
Authors: Kuniko Paxton, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos,
Abstract summary: This study proposes a skewness-guided pruning method that selectively prunes the Multi-Head Self-Attention and Multi-Layer Perceptron layers of a multimodal Swin Transformer.<n>Experiments on the compact Swin Transformer demonstrate approximately 36% model size reduction with no loss in accuracy.
Score: 0.5133143243716792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years, high-performance computer vision models have achieved remarkable success in medical imaging, with some skin lesion classification systems even surpassing dermatology specialists in diagnostic accuracy. However, such models are computationally intensive and large in size, making them unsuitable for deployment on edge devices. In addition, strict privacy constraints hinder centralized data management, motivating the adoption of Federated Learning (FL). To address these challenges, this study proposes a skewness-guided pruning method that selectively prunes the Multi-Head Self-Attention and Multi-Layer Perceptron layers of a multimodal Swin Transformer based on the statistical skewness of their output distributions. The proposed method was validated in a horizontal FL environment and shown to maintain performance while substantially reducing model complexity. Experiments on the compact Swin Transformer demonstrate approximately 36\% model size reduction with no loss in accuracy. These findings highlight the feasibility of achieving efficient model compression and privacy-preserving distributed learning for multimodal medical AI on edge devices.

Related papers

GRASP: Guided Residual Adapters with Sample-wise Partitioning [10.504309161945065]
We propose GRASP: Guided Residual Adapters with Sample-wise Partitioning.<n>On the long-tail MIMIC-CXR-LT dataset, GRASP yields superior FID and diversity metrics, especially for rare classes.
arXiv Detail & Related papers (2025-12-01T13:43:17Z)
MedLiteNet: Lightweight Hybrid Medical Image Segmentation Model [17.73370811236741]
We introduce the MedLiteNet, a lightweight CNN Transformer hybrid tailored for dermoscopic segmentation.<n>The encoder stacks depth-wise Mobile Inverted Bottleneck blocks to curb computation, inserts a bottleneck-level cross-scale token-mixing unit to exchange information between resolutions, and embeds a boundary-aware self-attention module to sharpen lesion contours.
arXiv Detail & Related papers (2025-09-03T05:59:13Z)
Flattening Singular Values of Factorized Convolution for Medical Images [2.41019965808244]
Convolutional neural networks (CNNs) have long been the paradigm of choice for robust medical image processing (MIP) Many methods employ factorized convolutional layers to alleviate the burden of limited computational resources. We propose a Singular value equalization generalizer-induced Factorized Convolution (SFConv) to improve the expressive power of factorized convolutions in MIP models.
arXiv Detail & Related papers (2024-03-01T15:30:50Z)
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles [4.249986624493547]
Once deployed, medical image analysis methods are often faced with unexpected image corruptions and noise perturbations.<n>LaDiNE is a novel ensemble learning method combining the robustness of Vision Transformers with diffusion-based generative models.<n>Experiments on tuberculosis chest X-rays and melanoma skin cancer datasets demonstrate that LaDiNE achieves superior performance compared to a wide range of state-of-the-art methods.
arXiv Detail & Related papers (2023-10-24T15:53:07Z)
Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving. Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees. In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z)
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network. We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module. Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance [114.1541203743303]
We propose PLATON, which captures the uncertainty of importance scores by upper confidence bound (UCB) of importance estimation. We conduct extensive experiments with several Transformer-based models on natural language understanding, question answering and image classification.
arXiv Detail & Related papers (2022-06-25T05:38:39Z)
A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling. We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk. We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Single Model Deep Learning on Imbalanced Small Datasets for Skin Lesion Classification [5.642359877598896]
This paper proposes a novel data augmentation strategy for single model classification of skin lesions based on a small and imbalanced dataset. Various DCNNs are trained on this dataset to show that the models with moderate complexity outperform the larger models. By combining Modified RandAugment and Multi-weighted Focal Loss in a single DCNN model, we have achieved the classification accuracy comparable to those of multiple ensembling models on the ISIC 2018 challenge test dataset.
arXiv Detail & Related papers (2021-02-02T03:48:55Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.