QuantKAN: A Unified Quantization Framework for Kolmogorov Arnold Networks
- URL: http://arxiv.org/abs/2511.18689v1
- Date: Mon, 24 Nov 2025 02:05:16 GMT
- Title: QuantKAN: A Unified Quantization Framework for Kolmogorov Arnold Networks
- Authors: Kazi Ahmed Asif Fuad, Lizhong Chen,
- Abstract summary: Kolmogorov Arnold Networks (KANs) replace linear transformations with spline-based function approximations distributed along network edges.<n>KANs offer strong expressivity and interpretability, but their heterogeneous spline and base branch parameters hinder efficient quantization.<n>We present QuantKAN, a unified framework for quantizing KANs across both quantization aware training (QAT) and post-training quantization regimes.
- Score: 6.860988566886594
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Kolmogorov Arnold Networks (KANs) represent a new class of neural architectures that replace conventional linear transformations and node-based nonlinearities with spline-based function approximations distributed along network edges. Although KANs offer strong expressivity and interpretability, their heterogeneous spline and base branch parameters hinder efficient quantization, which remains unexamined compared to CNNs and Transformers. In this paper, we present QuantKAN, a unified framework for quantizing KANs across both quantization aware training (QAT) and post-training quantization (PTQ) regimes. QuantKAN extends modern quantization algorithms, such as LSQ, LSQ+, PACT, DoReFa, QIL, GPTQ, BRECQ, AdaRound, AWQ, and HAWQ-V2, to spline based layers with branch-specific quantizers for base, spline, and activation components. Through extensive experiments on MNIST, CIFAR 10, and CIFAR 100 across multiple KAN variants (EfficientKAN, FastKAN, PyKAN, and KAGN), we establish the first systematic benchmarks for low-bit spline networks. Our results show that KANs, particularly deeper KAGN variants, are compatible with low-bit quantization but exhibit strong method architecture interactions: LSQ, LSQ+, and PACT preserve near full precision accuracy at 4 bit for shallow KAN MLP and ConvNet models, while DoReFa provides the most stable behavior for deeper KAGN under aggressive low-bit settings. For PTQ, GPTQ and Uniform consistently deliver the strongest overall performance across datasets, with BRECQ highly competitive on simpler regimes such as MNIST. Our proposed QuantKAN framework thus unifies spline learning and quantization, and provides practical tools and guidelines for efficiently deploying KANs in real-world, resource-constrained environments.
Related papers
- Cat: Post-Training Quantization Error Reduction via Cluster-based Affine Transformation [47.791962198275066]
Post-Training Quantization (PTQ) reduces the memory footprint and computational overhead of deep neural networks by converting full-precision (FP) values into quantized and compressed data types.<n>While PTQ is more cost-efficient than Quantization-Aware Training (QAT), it is highly susceptible to accuracy degradation under a low-bit quantization regime.<n>We propose Cluster-based Affine Transformation (CAT), an error-reduction framework that employs cluster-specific parameters to align LQ outputs with FP counterparts.
arXiv Detail & Related papers (2025-09-30T14:00:28Z) - VQC-MLPNet: An Unconventional Hybrid Quantum-Classical Architecture for Scalable and Robust Quantum Machine Learning [50.95799256262098]
Variational quantum circuits (VQCs) hold promise for quantum machine learning but face challenges in expressivity, trainability, and noise resilience.<n>We propose VQC-MLPNet, a hybrid architecture where a VQC generates the first-layer weights of a classical multilayer perceptron during training, while inference is performed entirely classically.
arXiv Detail & Related papers (2025-06-12T01:38:15Z) - Enhanced Variational Quantum Kolmogorov-Arnold Network [0.0]
Kolmogorov-Arnold Network (KAN) is a novel multi-layer network model recognized for its efficiency in neuromorphic computing.<n>KAN can be implemented on quantum computers using block encoding and Quantum Signal Processing, but these methods require fault-tolerant quantum devices.<n>We propose the Enhanced Variational Quantum Kolmogorov-Arnold Network (EVQKAN) to overcome this limitation.
arXiv Detail & Related papers (2025-03-28T16:47:42Z) - SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks [1.0923877073891446]
Spiking neural networks (SNNs) share the goal of enhancing efficiency, but adopt an 'event-driven' approach to reduce the power consumption of neural network inference.
This paper introduces two QAT schemes for stateful neurons: (i) a uniform quantization strategy, an established method for weight quantization, and (ii) threshold-centered quantization.
Our results show that increasing the density of quantization levels around the firing threshold improves accuracy across several benchmark datasets.
arXiv Detail & Related papers (2024-04-15T03:07:16Z) - Quantized Approximately Orthogonal Recurrent Neural Networks [6.524758376347808]
We explore the quantization of the weight matrices in ORNNs, leading to Quantized approximately Orthogonal RNNs (QORNNs)
We propose and investigate two strategies to learn QORNN by combining quantization-aware training (QAT) and computation projections.
The most efficient models achieve results similar to state-of-the-art full-precision ORNN, LSTM and FastRNN on a variety of standard benchmarks, even with 4-bits quantization.
arXiv Detail & Related papers (2024-02-05T09:59:57Z) - EQ-Net: Elastic Quantization Neural Networks [15.289359357583079]
Elastic Quantization Neural Networks (EQ-Net) aims to train a robust weight-sharing quantization supernet.
We propose an elastic quantization space (including elastic bit-width, granularity, and symmetry) to adapt to various mainstream quantitative forms.
We incorporate genetic algorithms and the proposed Conditional Quantization-Aware Conditional Accuracy Predictor (CQAP) as an estimator to quickly search mixed-precision quantized neural networks in supernet.
arXiv Detail & Related papers (2023-08-15T08:57:03Z) - Designing strong baselines for ternary neural network quantization
through support and mass equalization [7.971065005161565]
Deep neural networks (DNNs) offer the highest performance in a wide range of applications in computer vision.
This computational burden can be dramatically reduced by quantizing floating point values to ternary values.
We show experimentally that our approach allows to significantly improve the performance of ternary quantization through a variety of scenarios.
arXiv Detail & Related papers (2023-06-30T07:35:07Z) - Scaling Limits of Quantum Repeater Networks [62.75241407271626]
Quantum networks (QNs) are a promising platform for secure communications, enhanced sensing, and efficient distributed quantum computing.
Due to the fragile nature of quantum states, these networks face significant challenges in terms of scalability.
In this paper, the scaling limits of quantum repeater networks (QRNs) are analyzed.
arXiv Detail & Related papers (2023-05-15T14:57:01Z) - Distribution-Flexible Subset Quantization for Post-Quantizing
Super-Resolution Networks [68.83451203841624]
This paper introduces Distribution-Flexible Subset Quantization (DFSQ), a post-training quantization method for super-resolution networks.
DFSQ conducts channel-wise normalization of the activations and applies distribution-flexible subset quantization (SQ)
It achieves comparable performance to full-precision counterparts on 6- and 8-bit quantization, and incurs only a 0.1 dB PSNR drop on 4-bit quantization.
arXiv Detail & Related papers (2023-05-10T04:19:11Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network
Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - Compact representations of convolutional neural networks via weight
pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitive as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z) - BRECQ: Pushing the Limit of Post-Training Quantization by Block
Reconstruction [29.040991149922615]
We study the challenging task of neural network quantization without end-to-end retraining, called Post-training Quantization (PTQ)
We propose a novel PTQ framework, dubbed BRECQ, which pushes the limits of bitwidth in PTQ down to INT2 for the first time.
For the first time we prove that, without bells and whistles, PTQ can attain 4-bit ResNet and MobileNetV2 comparable with QAT and enjoy 240 times faster production of quantized models.
arXiv Detail & Related papers (2021-02-10T13:46:16Z) - Once Quantization-Aware Training: High Performance Extremely Low-bit
Architecture Search [112.05977301976613]
We propose to combine Network Architecture Search methods with quantization to enjoy the merits of the two sides.
We first propose the joint training of architecture and quantization with a shared step size to acquire a large number of quantized models.
Then a bit-inheritance scheme is introduced to transfer the quantized models to the lower bit, which further reduces the time cost and improves the quantization accuracy.
arXiv Detail & Related papers (2020-10-09T03:52:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.