Related papers: Activation Space Selectable Kolmogorov-Arnold Networks

Activation Space Selectable Kolmogorov-Arnold Networks

URL: http://arxiv.org/abs/2408.08338v1
Date: Thu, 15 Aug 2024 11:34:05 GMT
Title: Activation Space Selectable Kolmogorov-Arnold Networks
Authors: Zhuoqin Yang, Jiansong Zhang, Xiaoling Luo, Zheng Lu, Linlin Shen,
Abstract summary: Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to Select-based methods. Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks. This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
Score: 29.450377034478933
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The multilayer perceptron (MLP), a fundamental paradigm in current artificial intelligence, is widely applied in fields such as computer vision and natural language processing. However, the recently proposed Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLPs with significantly fewer parameters. Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks. To address this issue, we propose an activation space Selectable KAN (S-KAN). S-KAN employs an adaptive strategy to choose the possible activation mode for data at each feedforward KAN node. Our approach outperforms baseline methods in seven representative function fitting tasks and significantly surpasses MLP methods with the same level of parameters. Furthermore, we extend the structure of S-KAN and propose an activation space selectable Convolutional KAN (S-ConvKAN), which achieves leading results on four general image classification datasets. Our method mitigates the performance variability of the original KAN across different tasks and demonstrates through extensive experiments that feedforward KANs with selectable activations can achieve or even exceed the performance of MLP-based methods. This work contributes to the understanding of the data-centric design of new AI paradigms and provides a foundational reference for innovations in KAN-based network architectures.

Related papers

Improving Memory Efficiency for Training KANs via Meta Learning [55.24089119864207]
We propose to generate weights for KANs via a smaller meta-learner, called MetaKANs.<n>By training KANs and MetaKANs in an end-to-end differentiable manner, MetaKANs achieve comparable or even superior performance.
arXiv Detail & Related papers (2025-06-09T08:38:26Z)
asKAN: Active Subspace embedded Kolmogorov-Arnold Network [2.408451825799214]
The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications. This study investigates this inflexibility through the lens of the Kolmogorov-Arnold theorem. We propose active subspace embedded KAN, a hierarchical framework that synergizes KAN's function representation with active subspace methodology.
arXiv Detail & Related papers (2025-04-07T01:43:13Z)
AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning [4.843466576537832]
Kolmogorov-Arnold Networks (KANs) have inspired numerous works exploring their applications across a wide range of scientific problems.<n>We introduce Activation Function-Based Kolmogorov-Arnold Networks (AF-KAN), expanding ReLU-KAN with various activations and their function combinations.<n>This novel KAN also incorporates parameter reduction methods, primarily attention mechanisms and data normalization, to enhance performance on image classification datasets.
arXiv Detail & Related papers (2025-03-08T07:38:51Z)
LeanKAN: A Parameter-Lean Kolmogorov-Arnold Network Layer with Improved Memory Efficiency and Convergence Behavior [0.0]
The Kolmogorov-Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. Here, we find that MultKAN layers suffer from limited applicability in output layers. We propose LeanKANs, a direct and modular replacement for MultKAN and traditional AddKAN layers.
arXiv Detail & Related papers (2025-02-25T04:43:41Z)
Low Tensor-Rank Adaptation of Kolmogorov--Arnold Networks [70.06682043272377]
Kolmogorov--Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptions (MLPs) in various domains. We develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs. We explore the application of LoTRA for efficiently solving various partial differential equations (PDEs) by fine-tuning KANs.
arXiv Detail & Related papers (2025-02-10T04:57:07Z)
Local Control Networks (LCNs): Optimizing Flexibility in Neural Network Data Pattern Capture [0.922664966526494]
We argue that employing the same activation function at every node is suboptimal and propose leveraging different activation functions at each node to increase flexibility and adaptability. To achieve this, we introduce Local Control Networks (LCNs), which leverage B-spline functions to enable distinct activation curves at each node. Our findings suggest that diverse activations at the node level can lead to improved performance and efficiency.
arXiv Detail & Related papers (2025-01-23T11:34:25Z)
A Survey on Kolmogorov-Arnold Network [0.0]
Review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KAN) KANs distinguish themselves from traditional neural networks by using learnable, spline- parameterized functions instead of fixed activation functions. This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
arXiv Detail & Related papers (2024-11-09T05:54:17Z)
MLP-KAN: Unifying Deep Representation and Function Learning [7.634331640151854]
We introduce a unified method designed to eliminate the need for manual model selection. By integrating Multi-Layer Perceptrons (MLPs) for representation learning and Kolmogorov-Arnold Networks (KANsogo) for function learning, we achieve remarkable results.
arXiv Detail & Related papers (2024-10-03T22:22:43Z)
Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains. However, spline functions may not respect symmetry in tasks, which is crucial prior knowledge in machine learning. We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
A preliminary study on continual learning in computer vision using Kolmogorov-Arnold Networks [43.70716358136333]
Kolmogorov- Networks (KAN) are based on a fundamentally different mathematical framework. KANs address several major issues insio, such as forgetting in continual learning scenarios. We extend the investigation by evaluating the performance of KANs in continual learning tasks within computer vision.
arXiv Detail & Related papers (2024-09-20T14:49:21Z)
KAN v.s. MLP for Offline Reinforcement Learning [4.3621896506713185]
Kolmogorov-Arnold Networks (KAN) is an emerging neural network architecture in machine learning. In this paper, we explore the incorporation of KAN into the actor and critic networks for offline reinforcement learning.
arXiv Detail & Related papers (2024-09-15T07:52:44Z)
Kolmogorov-Arnold Network for Online Reinforcement Learning [0.22615818641180724]
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks. KANs provide universal function approximation with fewer parameters and reduced memory usage.
arXiv Detail & Related papers (2024-08-09T03:32:37Z)
Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration. A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks. We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
Exploiting Temporal Structures of Cyclostationary Signals for Data-Driven Single-Channel Source Separation [98.95383921866096]
We study the problem of single-channel source separation (SCSS) We focus on cyclostationary signals, which are particularly suitable in a variety of application domains. We propose a deep learning approach using a U-Net architecture, which is competitive with the minimum MSE estimator.
arXiv Detail & Related papers (2022-08-22T14:04:56Z)
Revisiting GANs by Best-Response Constraint: Perspective, Methodology, and Application [49.66088514485446]
Best-Response Constraint (BRC) is a general learning framework to explicitly formulate the potential dependency of the generator on the discriminator. We show that even with different motivations and formulations, a variety of existing GANs ALL can be uniformly improved by our flexible BRC methodology.
arXiv Detail & Related papers (2022-05-20T12:42:41Z)
Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm. Our model factorizes the source and target data into distinct multi-layer feature spaces. A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.