AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning
- URL: http://arxiv.org/abs/2503.06112v1
- Date: Sat, 08 Mar 2025 07:38:51 GMT
- Title: AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning
- Authors: Hoang-Thang Ta, Anh Tran
- Abstract summary: Kolmogorov-Arnold Networks (KANs) have inspired numerous works exploring their applications across a wide range of scientific problems. We introduce Activation Function-Based Kolmogorov-Arnold Networks (AF-KAN), expanding ReLU-KAN with various activations and their function combinations. This novel KAN also incorporates parameter reduction methods, primarily attention mechanisms and data normalization, to enhance performance on image classification datasets.
- Score: 4.843466576537832
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Kolmogorov-Arnold Networks (KANs) have inspired numerous works exploring their applications across a wide range of scientific problems, with the potential to replace Multilayer Perceptrons (MLPs). While many KANs are designed using basis and polynomial functions, such as B-splines, ReLU-KAN utilizes a combination of ReLU functions to mimic the structure of B-splines and takes advantage of ReLU's speed. However, ReLU-KAN is not built for multiple inputs, and its limitations stem from ReLU's handling of negative values, which can restrict feature extraction. To address these issues, we introduce Activation Function-Based Kolmogorov-Arnold Networks (AF-KAN), which expand ReLU-KAN with various activations and their function combinations. This novel KAN also incorporates parameter reduction methods, primarily attention mechanisms and data normalization, to enhance performance on image classification datasets. We explore different activation functions, function combinations, grid sizes, and spline orders to validate the effectiveness of AF-KAN and determine its optimal configuration. In our experiments, AF-KAN significantly outperforms MLPs, ReLU-KAN, and other KANs with the same parameter count. It also remains competitive even when using 6 to 10 times fewer parameters while maintaining the same network structure. However, AF-KAN requires longer training time and more FLOPs. The repository for this work is available at https://github.com/hoangthangta/All-KAN.
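To make the core idea concrete, below is a minimal, hedged sketch of an AF-KAN-style layer in PyTorch. The grid layout, initialization, and mixing scheme are illustrative assumptions rather than the authors' implementation (see the repository above for that); the sketch only shows how ReLU-KAN's squared-product basis, R_i(x) = [ReLU(e_i - x) * ReLU(x - s_i)]^2 * 16 / (e_i - s_i)^4, generalizes once ReLU is swapped for another activation such as SiLU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AFBasisLayer(nn.Module):
    """Illustrative AF-KAN-style layer: a sketch, not the official code."""

    def __init__(self, in_dim, out_dim, grid=5, act=F.silu):
        super().__init__()
        self.act = act                                 # swap in F.relu to recover ReLU-KAN
        starts = torch.linspace(-1.5, 0.5, grid)       # assumed window layout on [-1.5, 1.5]
        self.register_buffer("s", starts)
        self.register_buffer("e", starts + 1.0)        # overlapping windows of width 1
        # Learnable mixing of all basis responses into the output features.
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim * grid) * 0.1)

    def forward(self, x):                              # x: (batch, in_dim)
        x = x.unsqueeze(-1)                            # (batch, in_dim, 1)
        # ReLU-KAN's basis with ReLU replaced by a chosen activation (AF-KAN's idea).
        phi = (self.act(self.e - x) * self.act(x - self.s)) ** 2
        phi = phi * 16.0 / (self.e - self.s) ** 4      # ReLU-KAN's normalizing constant
        return phi.flatten(1) @ self.weight.T          # (batch, out_dim)
```

For instance, `AFBasisLayer(784, 64)(torch.randn(8, 784))` returns an (8, 64) tensor. AF-KAN's attention-based parameter reduction and data normalization are omitted here for brevity.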
Related papers
- LeanKAN: A Parameter-Lean Kolmogorov-Arnold Network Layer with Improved Memory Efficiency and Convergence Behavior [0.0]
The Kolmogorov-Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. Here, we find that MultKAN layers suffer from limited applicability in output layers. We propose LeanKANs, a direct and modular replacement for MultKAN and traditional AddKAN layers.
arXiv Detail & Related papers (2025-02-25T04:43:41Z)
- Low Tensor-Rank Adaptation of Kolmogorov--Arnold Networks [70.06682043272377]
Kolmogorov--Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains.
We develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs.
We explore the application of LoTRA for efficiently solving various partial differential equations (PDEs) by fine-tuning KANs.
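The abstract gives no implementation details, so the following is only a hedged sketch of what a low tensor-rank (Tucker-style) additive update to a frozen 3D tensor of KAN spline coefficients could look like; the names, ranks, and exact parameterization are assumptions, not LoTRA's published form.

```python
import torch
import torch.nn as nn


class TuckerStyleUpdate(nn.Module):
    """Sketch of a low tensor-rank update for a frozen KAN coefficient
    tensor W of shape (out_dim, in_dim, n_coef). Only the small core and
    factor matrices are trained, so fine-tuning touches few parameters."""

    def __init__(self, out_dim, in_dim, n_coef, ranks=(4, 4, 4)):
        super().__init__()
        r1, r2, r3 = ranks
        self.core = nn.Parameter(torch.zeros(r1, r2, r3))  # zero init: no change at step 0
        self.U1 = nn.Parameter(torch.randn(out_dim, r1) * 0.01)
        self.U2 = nn.Parameter(torch.randn(in_dim, r2) * 0.01)
        self.U3 = nn.Parameter(torch.randn(n_coef, r3) * 0.01)

    def forward(self, w_frozen):
        # delta[i,j,k] = sum_{a,b,c} core[a,b,c] * U1[i,a] * U2[j,b] * U3[k,c]
        delta = torch.einsum("abc,ia,jb,kc->ijk", self.core, self.U1, self.U2, self.U3)
        return w_frozen + delta
```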
arXiv Detail & Related papers (2025-02-10T04:57:07Z)
- Local Control Networks (LCNs): Optimizing Flexibility in Neural Network Data Pattern Capture [0.922664966526494]
We argue that employing the same activation function at every node is suboptimal and propose leveraging different activation functions at each node to increase flexibility and adaptability. To achieve this, we introduce Local Control Networks (LCNs), which leverage B-spline functions to enable distinct activation curves at each node. Our findings suggest that diverse activations at the node level can lead to improved performance and efficiency.
arXiv Detail & Related papers (2025-01-23T11:34:25Z)
- Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability [16.957071012748454]
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and a heavy trainable parameter count. In this work, we analyze the behavior of KANs through the lens of spline knots and derive lower and upper bounds for the number of knots in B-spline-based KANs.
arXiv Detail & Related papers (2025-01-16T04:12:05Z)
- PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks [47.947045173329315]
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures. KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, Recurrent Neural Networks (RNNs) and Transformers. This paper introduces PRKANs, which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers.
arXiv Detail & Related papers (2025-01-13T03:07:39Z)
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
We propose Equivariant Kolmogorov-Arnold Networks (EKAN), a method for incorporating arbitrary matrix group equivariance into KANs.
EKAN achieves higher accuracy with smaller datasets or fewer parameters on symmetry-related tasks, such as particle scattering and the three-body problem.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Kolmogorov-Arnold Transformer [72.88137795439407]
We introduce the Kolmogorov-Arnold Transformer (KAT), a novel architecture that replaces MLP layers with Kolmogorov-Arnold Network (KAN) layers.
We identify three key challenges: (C1) base function choice, (C2) parameter and computation inefficiency, and (C3) weight initialization.
With these designs, KAT outperforms traditional MLP-based transformers.
arXiv Detail & Related papers (2024-09-16T17:54:51Z)
- FC-KAN: Function Combinations in Kolmogorov-Arnold Networks [48.39771439237495]
We introduce FC-KAN, a Kolmogorov-Arnold Network (KAN) that leverages combinations of popular mathematical functions on low-dimensional data. We compare FC-KAN with a multi-layer perceptron network (MLP) and other existing KANs, such as BSRBF-KAN, EfficientKAN, FastKAN, and FasterKAN. Two variants of FC-KAN, which combine outputs from B-splines and Difference of Gaussians (DoG) and from B-splines and linear transformations in the form of a quadratic function, outperformed the other models on average.
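As a hedged illustration of the function-combination idea, the snippet below pairs a Difference of Gaussians with a second branch output in a quadratic form; the exact combination FC-KAN uses may differ, and the `dog` parameters here are arbitrary.

```python
import torch


def dog(x, mu=0.0, s1=0.5, s2=1.0):
    """Difference of Gaussians (DoG), one of the element-wise functions combined in FC-KAN."""
    gauss = lambda s: torch.exp(-((x - mu) ** 2) / (2.0 * s ** 2))
    return gauss(s1) - gauss(s2)


def quadratic_combination(b_out, d_out):
    """Assumed quadratic combination of two branch outputs B(x) and D(x):
    f(x) = B(x) + D(x) + B(x) * D(x), so the branches interact
    multiplicatively as well as additively."""
    return b_out + d_out + b_out * d_out
```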
arXiv Detail & Related papers (2024-09-03T10:16:43Z)
- Activation Space Selectable Kolmogorov-Arnold Networks [29.450377034478933]
Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLP-based methods.
Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks.
This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
arXiv Detail & Related papers (2024-08-15T11:34:05Z)
- Evolution of Activation Functions for Deep Learning-Based Image Classification [0.0]
Activation functions (AFs) play a pivotal role in the performance of neural networks.
We propose a novel, three-population, coevolutionary algorithm to evolve AFs.
Tested on four datasets -- MNIST, FashionMNIST, KMNIST, and USPS -- coevolution proves to be a performant algorithm for finding good AFs and AF architectures.
arXiv Detail & Related papers (2022-06-24T05:58:23Z)
- Comparisons among different stochastic selection of activation layers for convolutional neural networks for healthcare [77.99636165307996]
We classify biomedical images using ensembles of neural networks.
We select our activations among the following ones: ReLU, leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewise Linear Unit, S-Shaped ReLU, Swish, Mish, Mexican Linear Unit, Parametric Deformable Linear Unit, Soft Root Sign.
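A minimal sketch of the stochastic-selection idea follows; the paper builds CNN ensembles for biomedical images, so the plain MLP, the reduced activation pool, and the layer sizes below are simplifying assumptions.

```python
import random

import torch.nn as nn

# A subset of the candidate pool listed above; the paper's full pool also
# includes more exotic choices such as the Mexican Linear Unit.
ACTIVATIONS = {
    "relu": nn.ReLU,
    "leaky_relu": nn.LeakyReLU,
    "elu": nn.ELU,
    "swish": nn.SiLU,  # Swish
    "mish": nn.Mish,
}


def make_member(in_dim=784, hidden=128, n_classes=10, n_layers=3, seed=None):
    """One ensemble member: each hidden layer draws its activation at random,
    so members differ only in their activation layers."""
    rng = random.Random(seed)
    layers, dim = [], in_dim
    for _ in range(n_layers):
        layers += [nn.Linear(dim, hidden), ACTIVATIONS[rng.choice(list(ACTIVATIONS))]()]
        dim = hidden
    layers.append(nn.Linear(dim, n_classes))
    return nn.Sequential(*layers)
```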
arXiv Detail & Related papers (2020-11-24T01:53:39Z)