Related papers: PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks

PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks

URL: http://arxiv.org/abs/2501.07032v4
Date: Tue, 11 Feb 2025 06:47:56 GMT
Title: PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks
Authors: Hoang-Thang Ta, Duy-Quy Thai, Anh Tran, Grigori Sidorov, Alexander Gelbukh,
Abstract summary: Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures.<n>KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, RecurrentReduced Networks (RNNs) and Transformers.<n>This paper introduces PRKANs, which employ several methods to reduce the parameter count in layers, making them comparable to Neural M layers.
Score: 47.947045173329315
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures, offering a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers. By advancing network design, KANs drive groundbreaking research and enable transformative applications across various scientific domains involving neural networks. However, existing KANs often require significantly more parameters in their network layers than MLPs. To address this limitation, this paper introduces PRKANs (Parameter-Reduced Kolmogorov-Arnold Networks), which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers. Experimental results on the MNIST and Fashion-MNIST datasets demonstrate that PRKANs outperform several existing KANs, and their variant with attention mechanisms rivals the performance of MLPs, albeit with slightly longer training times. Furthermore, the study highlights the advantages of Gaussian Radial Basis Functions (GRBFs) and layer normalization in KAN designs. The repository for this work is available at: https://github.com/hoangthangta/All-KAN.

Related papers

Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.<n>A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.<n>The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
Kolmogorov-Arnold Network Autoencoders [0.0]
Kolmogorov-Arnold Networks (KANs) are promising alternatives to Multi-Layer Perceptrons (MLPs) KANs align closely with the Kolmogorov-Arnold representation theorem, potentially enhancing both model accuracy and interpretability. Our results demonstrate that KAN-based autoencoders achieve competitive performance in terms of reconstruction accuracy.
arXiv Detail & Related papers (2024-10-02T22:56:00Z)
A preliminary study on continual learning in computer vision using Kolmogorov-Arnold Networks [43.70716358136333]
Kolmogorov- Networks (KAN) are based on a fundamentally different mathematical framework. KANs address several major issues insio, such as forgetting in continual learning scenarios. We extend the investigation by evaluating the performance of KANs in continual learning tasks within computer vision.
arXiv Detail & Related papers (2024-09-20T14:49:21Z)
Convolutional Kolmogorov-Arnold Networks [41.94295877935867]
We present Convolutional Kolmogorov-Arnold Networks (KANs) KANs replace traditional fixed-weight kernels with learnable non-linear functions. We empirically evaluate Convolutional KANs on the Fashion-MNIST dataset, demonstrating competitive accuracy with up to 50% fewer parameters compared to baseline CNNs.
arXiv Detail & Related papers (2024-06-19T02:09:44Z)
TKAN: Temporal Kolmogorov-Arnold Networks [0.0]
Long Short-Term Memory (LSTM) has demonstrated its ability to capture long-term dependencies in sequential data.<n>Inspired by the Kolmogorov-Arnold Networks (KANs) a promising alternatives to Multi-Layer Perceptrons (MLPs)<n>We propose a new neural networks architecture inspired by KAN and the LSTM, the Temporal Kolomogorov-Arnold Networks (TKANs)
arXiv Detail & Related papers (2024-05-12T17:40:48Z)
Topological Representations of Heterogeneous Learning Dynamics of Recurrent Spiking Neural Networks [16.60622265961373]
Spiking Neural Networks (SNNs) have become an essential paradigm in neuroscience and artificial intelligence. Recent advances in literature have studied the network representations of deep neural networks.
arXiv Detail & Related papers (2024-03-19T05:37:26Z)
Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction. Our approach is capable of encoding neural networks in a model zoo of mixed architecture. We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM) Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP [121.35904748477421]
Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision. Transformer and multi-layer perceptron (MLP)-based models, such as Vision Transformer and Vision-Mixer, started to lead new trends. In this paper, we conduct empirical studies on these DNN structures and try to understand their respective pros and cons.
arXiv Detail & Related papers (2021-08-30T06:09:02Z)
Exploiting Heterogeneity in Operational Neural Networks by Synaptic Plasticity [87.32169414230822]
Recently proposed network model, Operational Neural Networks (ONNs), can generalize the conventional Convolutional Neural Networks (CNNs) In this study the focus is drawn on searching the best-possible operator set(s) for the hidden neurons of the network based on the Synaptic Plasticity paradigm that poses the essential learning theory in biological neurons. Experimental results over highly challenging problems demonstrate that the elite ONNs even with few neurons and layers can achieve a superior learning performance than GIS-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs) In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.