Quantization-Aware Neuromorphic Architecture for Efficient Skin Disease Classification on Resource-Constrained Devices
- URL: http://arxiv.org/abs/2507.15958v2
- Date: Thu, 23 Oct 2025 08:54:47 GMT
- Title: Quantization-Aware Neuromorphic Architecture for Efficient Skin Disease Classification on Resource-Constrained Devices
- Authors: Haitian Wang, Xinyu Wang, Yiren Wang, Zichen Geng, Xian Zhang, Yu Zhang, Bo Miao
- Abstract summary: We introduce QANA, a novel quantization-aware neuromorphic architecture for incremental skin lesion classification on resource-limited hardware. QANA integrates ghost modules, efficient channel attention, and squeeze-and-excitation blocks for robust feature representation. Its quantization-aware head and spike-compatible transformations enable seamless conversion to spiking neural networks (SNNs) and deployment on neuromorphic platforms.
- Score: 8.61918204555282
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and efficient skin lesion classification on edge devices is critical for accessible dermatological care but remains challenging due to computational, energy, and privacy constraints. We introduce QANA, a novel quantization-aware neuromorphic architecture for incremental skin lesion classification on resource-limited hardware. QANA integrates ghost modules, efficient channel attention, and squeeze-and-excitation blocks for robust feature representation with low-latency, energy-efficient inference. Its quantization-aware head and spike-compatible transformations enable seamless conversion to spiking neural networks (SNNs) and deployment on neuromorphic platforms. Evaluation on the large-scale HAM10000 benchmark and a real-world clinical dataset shows that QANA achieves 91.6% Top-1 accuracy and 82.4% macro F1 on HAM10000, and 90.8%/81.7% on the clinical dataset, significantly outperforming state-of-the-art CNN-to-SNN models under fair comparison. Deployed on BrainChip Akida hardware, QANA achieves 1.5 ms inference latency and 1.7 mJ energy per image, reducing inference latency and energy use by over 94.6%/98.6% compared to GPU-based CNNs, and surpasses state-of-the-art CNN-to-SNN conversion baselines. These results demonstrate the effectiveness of QANA for accurate, real-time, and privacy-sensitive medical analysis in edge environments.
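As an illustration of the channel-attention building blocks the abstract names, the following is a minimal numpy sketch of a squeeze-and-excitation gate. The shapes, random weights, and reduction ratio are illustrative assumptions, not QANA's actual configuration.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation gate: reweight each channel by a learned scalar.
    x: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r) are the two FC
    layers of the excitation path (weights here are random, for illustration)."""
    s = x.mean(axis=(1, 2))                 # squeeze: global average pool -> (C,)
    z = np.maximum(w1 @ s, 0.0)             # bottleneck FC + ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))     # FC + sigmoid gate in (0, 1)
    return x * g[:, None, None]             # rescale each channel

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))          # C=8 channels, reduction ratio r=4
w1 = 0.1 * rng.standard_normal((2, 8))
w2 = 0.1 * rng.standard_normal((8, 2))
y = squeeze_excite(x, w1, w2)
print(y.shape)   # (8, 4, 4): same shape, channels rescaled
```

The excitation path adds only a handful of fully connected parameters per block, which is why this style of attention suits resource-constrained deployments.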
Related papers
- Stochastic Spiking Neuron Based SNN Can be Inherently Bayesian [0.033985395340995594]
Uncertainty in biological neural systems appears to be beneficial rather than detrimental. In neuromorphic computing systems, device variability often limits performance, including accuracy and efficiency. We propose a spiking neural network framework that unifies the dynamic models of intrinsic device variability.
arXiv Detail & Related papers (2026-02-03T10:48:14Z) - Neuromorphic Eye Tracking for Low-Latency Pupil Detection [37.091037454305024]
Eye tracking for wearable systems demands low latency and milliwatt-level power. Neuromorphic sensors and spiking neural networks (SNNs) offer a promising alternative. This paper presents a neuromorphic version of top-performing event-based eye-tracking models.
arXiv Detail & Related papers (2025-12-10T11:30:21Z) - MD-SNN: Membrane Potential-aware Distillation on Quantized Spiking Neural Network [18.23285395499578]
Spiking Neural Networks (SNNs) offer a promising and energy-efficient alternative to conventional neural networks. SNNs face challenges regarding memory and computation due to complex spatio-temporal dynamics. We introduce Membrane Potential-aware Distillation on quantized Spiking Neural Network (MD-SNN).
arXiv Detail & Related papers (2025-12-04T04:27:19Z) - Benchmarking Quantum Convolutional Neural Networks for Signal Classification in Simulated Gamma-Ray Burst Detection [29.259008600842517]
This study evaluates the use of Quantum Convolutional Neural Networks (QCNNs) for identifying signals resembling Gamma-Ray Bursts (GRBs). We implement a hybrid quantum-classical machine learning technique using the Qiskit framework, with the QCNNs trained on a quantum simulator. QCNNs showed robust performance on time-series datasets, successfully detecting GRB signals with high precision.
arXiv Detail & Related papers (2025-01-28T16:07:12Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - SAN: Hypothesizing Long-Term Synaptic Development and Neural Engram Mechanism in Scalable Model's Parameter-Efficient Fine-Tuning [39.04674956382538]
We bridged the performance gap with Full Fine-Tuning (FFT) through sophisticated analysis of pre-trained parameter spaces. We propose a method, Synapse and Neuron (SAN), which decomposes scaling components from anterior to posterior weight matrices. Our approach is theoretically grounded in Long-Term Potentiation/Depression (LTP/LTD) phenomena, which govern synapse development through neurotransmitter release.
arXiv Detail & Related papers (2024-08-24T03:27:29Z) - Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z) - The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks [0.368986335765876]
Quantization and pruning of parameters can compress the model size, reduce memory footprint, and facilitate low-latency execution.
We study various combinations of pruning and quantization in isolation, cumulatively, and simultaneously to a state-of-the-art SNN targeting gesture recognition.
We show that this state-of-the-art model is amenable to aggressive parameter quantization, not suffering from any loss in accuracy down to ternary weights.
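The ternary-weight result above can be illustrated with a small sketch. The threshold and scale heuristic below follows the common Ternary Weight Networks recipe and is an assumption, not necessarily the quantizer used in the paper.

```python
import numpy as np

def ternarize(w, threshold_ratio=0.7):
    """Ternary quantization: map each weight to {-alpha, 0, +alpha}.
    Threshold and per-tensor scale follow the usual Ternary Weight
    Networks heuristic (an illustrative choice)."""
    delta = threshold_ratio * np.mean(np.abs(w))       # pruning threshold
    mask = np.abs(w) > delta                           # surviving weights
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02, -0.45])
q = ternarize(w)
print(np.unique(q))   # three levels: -alpha, 0, +alpha
```

Each weight then needs only two bits of storage plus one shared full-precision scale per tensor, which is where the memory savings come from.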
arXiv Detail & Related papers (2023-02-08T16:25:20Z) - Implementing a Hybrid Quantum-Classical Neural Network by Utilizing a Variational Quantum Circuit for Detection of Dementia [0.0]
Nearly 1 in 3 patients with Alzheimer's were misdiagnosed in 2019, an issue neural networks can rectify.
This study found that the proposed hybrid quantum-classical convolutional neural network (QCCNN) provided 97.5% and 95.1% testing and validation accuracies.
QCCNN detected normal and demented images correctly 95% and 98% of the time, compared to the CNN accuracies of 89% and 91%.
arXiv Detail & Related papers (2023-01-29T18:05:42Z) - Masked Spiking Transformer [6.862877794199617]
Spiking Neural Networks (SNNs) and Transformers have attracted significant attention due to their potential for high energy efficiency and high performance.
We propose to leverage the benefits of the ANN-to-SNN conversion method to combine SNNs and Transformers.
We introduce a novel Masked Spiking Transformer framework that incorporates a Random Spike Masking (RSM) method to prune redundant spikes and reduce energy consumption without sacrificing performance.
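The Random Spike Masking idea, dropping a random fraction of spikes to cut synaptic operations, can be sketched as follows. The keep probability, tensor shapes, and firing rate are illustrative assumptions, not values from the paper.

```python
import numpy as np

def random_spike_mask(spikes, keep_prob=0.75, rng=None):
    """Drop each spike independently with probability 1 - keep_prob.
    Fewer spikes mean fewer synaptic operations, hence less energy.
    spikes: binary array of shape (T, N): T timesteps, N neurons."""
    rng = rng or np.random.default_rng()
    return spikes * (rng.random(spikes.shape) < keep_prob)

rng = np.random.default_rng(42)
spikes = (rng.random((100, 64)) < 0.5).astype(float)   # ~50% firing rate
masked = random_spike_mask(spikes, keep_prob=0.75, rng=rng)
print(masked.sum() / spikes.sum())   # roughly 0.75 of spikes survive
```

The claim in the abstract is that, in an SNN, accuracy degrades gracefully under such masking while energy scales roughly with the number of surviving spikes.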
arXiv Detail & Related papers (2022-10-03T19:56:09Z) - Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks [25.160063477248904]
A convolutional neural network model is developed for detecting atrial fibrillation from electrocardiogram signals.
The model demonstrates high performance despite being trained on limited, variable-length input data.
The final model achieved a 91.1% model compression ratio while maintaining a high accuracy of 91.7%, with less than 1% accuracy loss.
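Log-quantisation, as used above, stores each weight as a sign plus a rounded power-of-two exponent, so a multiply becomes a bit shift in hardware. A minimal sketch (the bit width and rounding rule are assumptions, not the paper's exact scheme):

```python
import numpy as np

def log_quantize(w, bits=4):
    """Log-domain quantization: keep a sign and a rounded power-of-two
    exponent, so multiplying by w reduces to a shift in fixed-point hardware."""
    sign = np.sign(w)
    mag = np.abs(w)
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1    # exponent range
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 1.0))), lo, hi)
    return np.where(mag > 0, sign * 2.0 ** exp, 0.0)

w = np.array([0.3, -0.12, 0.05, 1.7])
q = log_quantize(w)
print(q)   # magnitudes snap to powers of two: [0.25, -0.125, 0.0625, 2.0]
```

Combined with weight pruning (zeroing small weights outright), this yields the kind of compression ratio reported above.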
arXiv Detail & Related papers (2022-06-14T11:47:04Z) - Multistage Pruning of CNN Based ECG Classifiers for Edge Devices [9.223908421919733]
Convolutional neural network (CNN) based deep learning has been used successfully to detect anomalous beats in ECG.
The computational complexity of existing CNN models prohibits them from being implemented in low-powered edge devices.
This paper presents a novel multistage pruning technique that reduces CNN model complexity with negligible loss in performance.
arXiv Detail & Related papers (2021-08-31T17:51:15Z) - A New Neuromorphic Computing Approach for Epileptic Seizure Prediction [4.798958633851825]
CNNs are computationally expensive and power-hungry.
Motivated by the energy-efficient spiking neural networks (SNNs), a neuromorphic computing approach for seizure prediction is proposed in this work.
arXiv Detail & Related papers (2021-02-25T10:39:18Z) - Block-term Tensor Neural Networks [29.442026567710435]
We show that block-term tensor layers (BT-layers) can be easily adapted to neural network models, such as CNNs and RNNs.
BT-layers in CNNs and RNNs can achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
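The parameter-compression effect of such tensorized layers can be illustrated with a simpler low-rank factorization, a stand-in for the block-term decomposition rather than the BT-layer itself; the dimensions and rank below are arbitrary.

```python
import numpy as np

# Replace a dense (m x n) weight matrix with two factors (m x r) and (r x n):
# same input/output shapes, far fewer parameters when r << min(m, n).
m, n, r = 256, 512, 8
rng = np.random.default_rng(0)
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

x = rng.standard_normal(n)
y = U @ (V @ x)                  # acts like a full (m x n) layer on x

full_params = m * n              # 131072
factored_params = m * r + r * n  # 6144
print(factored_params / full_params)  # compressed to ~4.7% of the parameters
```

Block-term layers generalize this by factorizing a reshaped weight tensor into a sum of low-rank blocks, trading a single rank parameter for a richer rank structure.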
arXiv Detail & Related papers (2020-10-10T09:58:43Z) - PENNI: Pruned Kernel Sharing for Efficient CNN Inference [41.050335599000036]
State-of-the-art (SOTA) CNNs achieve outstanding performance on various tasks.
Their high computation demand and massive number of parameters make it difficult to deploy these SOTA CNNs onto resource-constrained devices.
We propose PENNI, a CNN model compression framework that is able to achieve model compactness and hardware efficiency simultaneously.
arXiv Detail & Related papers (2020-05-14T16:57:41Z) - Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely low computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.