LArctan-SKAN: Simple and Efficient Single-Parameterized Kolmogorov-Arnold Networks using Learnable Trigonometric Function
- URL: http://arxiv.org/abs/2410.19360v1
- Date: Fri, 25 Oct 2024 07:41:56 GMT
- Title: LArctan-SKAN: Simple and Efficient Single-Parameterized Kolmogorov-Arnold Networks using Learnable Trigonometric Function
- Authors: Zhijie Chen, Xinglin Zhang
- Abstract summary: Three new SKAN variants are developed: LSin-SKAN, LCos-SKAN, and LArctan-SKAN.
LArctan-SKAN excels in both accuracy and computational efficiency.
Results confirm the effectiveness and potential of SKANs constructed with trigonometric functions.
- Score: 4.198997497722401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel approach for designing Single-Parameterized Kolmogorov-Arnold Networks (SKAN) by utilizing a Single-Parameterized Function (SFunc) constructed from trigonometric functions. Three new SKAN variants are developed: LSin-SKAN, LCos-SKAN, and LArctan-SKAN. Experimental validation on the MNIST dataset demonstrates that LArctan-SKAN excels in both accuracy and computational efficiency. Specifically, LArctan-SKAN significantly improves test set accuracy over existing models, outperforming all pure KAN variants compared, including FourierKAN, LSS-SKAN, and Spl-KAN. It also surpasses mixed MLP-based models such as MLP+rKAN and MLP+fKAN in accuracy. Furthermore, LArctan-SKAN exhibits remarkable computational efficiency, with a training speed increase of 535.01% and 49.55% compared to MLP+rKAN and MLP+fKAN, respectively. These results confirm the effectiveness and potential of SKANs constructed with trigonometric functions. The experiment code is available at https://github.com/chikkkit/LArctan-SKAN .
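The abstract describes the SKAN construction only at a high level: every edge activation is a basis function with a single learnable parameter, and LArctan-SKAN instantiates that basis with a learnable arctan. The minimal PyTorch sketch below illustrates the idea; the specific form phi(x; theta) = theta * arctan(theta * x), the initialization scale, and the layer sizes are assumptions made for illustration, not the paper's exact definition (the linked repository contains the reference implementation).

```python
import torch
import torch.nn as nn


class LArctanSKANLayer(nn.Module):
    """Sketch of a single-parameterized KAN (SKAN) layer.

    Every edge (input i -> output j) carries exactly one learnable scalar
    theta[j, i]; the edge activation is assumed here to be
    phi(x; theta) = theta * arctan(theta * x), and each output unit sums its
    incoming edge activations.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # One learnable parameter per edge -- the defining property of SKAN.
        self.theta = nn.Parameter(0.1 * torch.randn(out_features, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features); theta: (out_features, in_features)
        z = self.theta.unsqueeze(0) * x.unsqueeze(1)   # (batch, out, in)
        phi = self.theta.unsqueeze(0) * torch.atan(z)  # edge activations
        return phi.sum(dim=-1)                         # (batch, out_features)


# Hypothetical usage on MNIST-sized inputs (shapes chosen for illustration).
model = nn.Sequential(nn.Flatten(), LArctanSKANLayer(784, 64), LArctanSKANLayer(64, 10))
out = model(torch.randn(2, 1, 28, 28))
print(out.shape)  # torch.Size([2, 10])
```

Swapping torch.atan for torch.sin or torch.cos in this sketch would give bases in the style of the LSin-SKAN and LCos-SKAN variants named in the abstract.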
Related papers
- Introducing the Short-Time Fourier Kolmogorov Arnold Network: A Dynamic Graph CNN Approach for Tree Species Classification in 3D Point Clouds [1.4843690728082002]
We introduce STFT-KAN, a novel KAN variant that integrates the Short-Time Fourier Transform (STFT).
We implement STFT-KAN within a lightweight version of DGCNN, called liteDGCNN, to classify tree species from 3D point cloud data.
Our experiments show that STFT-KAN outperforms existing KAN-based models while effectively balancing model complexity and performance.
arXiv Detail & Related papers (2025-03-31T01:25:03Z) - Low Tensor-Rank Adaptation of Kolmogorov--Arnold Networks [70.06682043272377]
Kolmogorov--Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains.
We develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs.
We explore the application of LoTRA for efficiently solving various partial differential equations (PDEs) by fine-tuning KANs.
arXiv Detail & Related papers (2025-02-10T04:57:07Z) - PowerMLP: An Efficient Version of KAN [10.411788782126091]
The Kolmogorov-Arnold Network (KAN) is a new network architecture known for its high accuracy in several tasks such as function fitting and PDE solving.
The superior computation capability of KAN arises from the Kolmogorov-Arnold representation and learnable spline functions.
PowerMLP achieves higher accuracy and a training speed about 40 times faster than KAN in various tasks.
arXiv Detail & Related papers (2024-12-18T07:42:34Z) - LSS-SKAN: Efficient Kolmogorov-Arnold Networks based on Single-Parameterized Function [4.198997497722401]
Kolmogorov-Arnold Networks (KAN) have attracted increasing attention due to their advantage of high visualizability.
We propose a superior KAN termed SKAN, where the basis function utilizes only a single learnable parameter.
LSS-SKAN exhibited superior performance on the MNIST dataset compared to all tested pure KAN variants.
arXiv Detail & Related papers (2024-10-19T02:44:35Z) - Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect task symmetries, which are crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z) - Kolmogorov-Arnold Transformer [72.88137795439407]
We introduce the Kolmogorov-Arnold Transformer (KAT), a novel architecture that replaces MLP layers with Kolmogorov-Arnold Network (KAN) layers.
We identify three key challenges: (C1) Base function, (C2) Inefficiency, and (C3) Weight initialization.
With these designs, KAT outperforms traditional MLP-based transformers.
arXiv Detail & Related papers (2024-09-16T17:54:51Z) - Kolmogorov-Arnold Networks in Low-Data Regimes: A Comparative Study with Multilayer Perceptrons [2.77390041716769]
Kolmogorov-Arnold Networks (KANs) use highly flexible learnable activation functions directly on network edges.
KANs significantly increase the number of learnable parameters, raising concerns about their effectiveness in data-scarce environments.
We show that individualized activation functions achieve significantly higher predictive accuracy with only a modest increase in parameters.
arXiv Detail & Related papers (2024-09-16T16:56:08Z) - Activation Space Selectable Kolmogorov-Arnold Networks [29.450377034478933]
Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLP-based methods.
Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks.
This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
arXiv Detail & Related papers (2024-08-15T11:34:05Z) - Kolmogorov-Arnold Networks (KAN) for Time Series Classification and Robust Analysis [2.978024452652925]
Kolmogorov-Arnold Networks (KAN) have attracted significant attention as a promising alternative to traditional Multi-Layer Perceptrons (MLP).
Despite their theoretical appeal, KANs require validation on large-scale benchmark datasets.
arXiv Detail & Related papers (2024-08-14T06:15:55Z) - Simple Full-Spectrum Correlated k-Distribution Model based on Multilayer Perceptron [6.354085763851961]
The simple FSCK (SFM) model is developed to balance accuracy, efficiency, and storage.
Several test cases have been carried out to compare the developed SFM model and other FSCK tools including look-up tables and traditional FSCK (TFM) model.
Results show that the SFM model can achieve excellent accuracy that is even better than look-up tables at a tiny computational cost.
arXiv Detail & Related papers (2024-03-05T08:04:01Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory [79.50644650795012]
We propose a deep learning approach to solve Kohn-Sham Density Functional Theory (KS-DFT).
We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity.
In addition, we show that our approach enables us to explore more complex neural-based wave functions.
arXiv Detail & Related papers (2023-03-01T10:38:10Z) - Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning [126.84770886628833]
Existing finetuning methods either tune all parameters of the pretrained model (full finetuning) or only tune the last linear layer (linear probing).
We propose a new parameter-efficient finetuning method termed SSF, meaning that researchers only need to Scale and Shift the deep Features extracted by a pre-trained model to catch up with the performance of full finetuning.
arXiv Detail & Related papers (2022-10-17T08:14:49Z) - Sinkhorn Natural Gradient for Generative Models [125.89871274202439]
We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically with the desired accuracy.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
arXiv Detail & Related papers (2020-11-09T02:51:17Z)