On the expressiveness and spectral bias of KANs
- URL: http://arxiv.org/abs/2410.01803v1
- Date: Wed, 2 Oct 2024 17:57:38 GMT
- Title: On the expressiveness and spectral bias of KANs
- Authors: Yixuan Wang, Jonathan W. Siegel, Ziming Liu, Thomas Y. Hou
- Abstract summary: KANs were recently proposed as a potential alternative to the prevalent architectural backbone of many deep learning models, the multi-layer perceptron (MLP).
KANs have seen success in various tasks of AI for science, with their empirical efficiency and accuracy demonstrated in function regression, PDE solving, and many more scientific problems.
- Score: 17.42614039265962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kolmogorov-Arnold Networks (KAN) \cite{liu2024kan} were very recently proposed as a potential alternative to the prevalent architectural backbone of many deep learning models, the multi-layer perceptron (MLP). KANs have seen success in various tasks of AI for science, with their empirical efficiency and accuracy demonstrated in function regression, PDE solving, and many more scientific problems. In this article, we revisit the comparison of KANs and MLPs, with emphasis on a theoretical perspective. On the one hand, we compare the representation and approximation capabilities of KANs and MLPs. We establish that MLPs can be represented using KANs of a comparable size. This shows that the approximation and representation capabilities of KANs are at least as good as MLPs. Conversely, we show that KANs can be represented using MLPs, but that in this representation the number of parameters increases by a factor of the KAN grid size. This suggests that KANs with a large grid size may be more efficient than MLPs at approximating certain functions. On the other hand, from the perspective of learning and optimization, we study the spectral bias of KANs compared with MLPs. We demonstrate that KANs are less biased toward low frequencies than MLPs. We highlight that the multi-level learning feature specific to KANs, i.e. grid extension of splines, improves the learning process for high-frequency components. Detailed comparisons with different choices of depth, width, and grid sizes of KANs are made, shedding some light on how to choose the hyperparameters in practice.
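As a rough illustration of the two points above (a hypothetical sketch, not from the paper): the snippet below fits a single KAN-style edge function to a target with a high-frequency component. It uses piecewise-linear (degree-1) splines instead of the cubic B-splines used in practice, and a least-squares fit instead of gradient descent; the helper names `hat_basis` and `fit_edge` are made up for this example. Increasing the grid size G both raises the per-edge coefficient count (the factor that appears when a KAN is rewritten as an MLP) and lets the spline resolve the high-frequency component, mirroring the grid-extension effect discussed in the abstract.

```python
import numpy as np

def hat_basis(x, grid):
    """Piecewise-linear ("hat") basis functions on a 1-D knot grid.
    Returns an array of shape (len(x), len(grid))."""
    B = np.zeros((len(x), len(grid)))
    for j, g in enumerate(grid):
        # Rising part from the left neighbour (flat beyond the left boundary).
        up = (x - grid[j - 1]) / (g - grid[j - 1]) if j > 0 else np.ones_like(x)
        # Falling part toward the right neighbour (flat beyond the right boundary).
        down = (grid[j + 1] - x) / (grid[j + 1] - g) if j < len(grid) - 1 else np.ones_like(x)
        B[:, j] = np.clip(np.minimum(up, down), 0.0, 1.0)
    return B

def fit_edge(x, y, grid):
    """Least-squares fit of the spline coefficients of one KAN edge."""
    B = hat_basis(x, grid)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef

# Target with low- and high-frequency components, as in spectral-bias experiments.
x = np.linspace(0.0, 1.0, 512)
y = np.sin(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 25 * x)

for G in (5, 10, 50, 100):                     # grid extension: coarse -> fine
    grid = np.linspace(0.0, 1.0, G + 1)
    coef = fit_edge(x, y, grid)
    rmse = np.sqrt(np.mean((hat_basis(x, grid) @ coef - y) ** 2))
    # Each edge carries about G + 1 coefficients, so rewriting the layer as an
    # MLP multiplies the parameter count by roughly the grid size G.
    print(f"grid size {G:3d}: {G + 1:4d} params/edge, RMSE {rmse:.4f}")
```

The printed numbers are only meant to show the trend: the 25-cycle component is essentially invisible to a coarse grid and only becomes representable once the grid is extended.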
Related papers
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect symmetry in tasks, even though symmetry is crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z) - A preliminary study on continual learning in computer vision using Kolmogorov-Arnold Networks [43.70716358136333]
Kolmogorov-Arnold Networks (KAN) are based on a fundamentally different mathematical framework.
KANs address several major issues of MLPs, such as forgetting in continual learning scenarios.
We extend the investigation by evaluating the performance of KANs in continual learning tasks within computer vision.
arXiv Detail & Related papers (2024-09-20T14:49:21Z) - KAN v.s. MLP for Offline Reinforcement Learning [4.3621896506713185]
Kolmogorov-Arnold Networks (KAN) is an emerging neural network architecture in machine learning.
In this paper, we explore the incorporation of KAN into the actor and critic networks for offline reinforcement learning.
arXiv Detail & Related papers (2024-09-15T07:52:44Z) - Kolmogorov-Arnold Network for Online Reinforcement Learning [0.22615818641180724]
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks.
KANs provide universal function approximation with fewer parameters and reduced memory usage.
arXiv Detail & Related papers (2024-08-09T03:32:37Z) - KAN or MLP: A Fairer Comparison [63.794304207664176]
This paper offers a fairer and more comprehensive comparison of KAN and MLP models across various tasks.
We control the number of parameters and FLOPs to compare the performance of KAN and MLP.
We find that KAN's forgetting issue is more severe than that of MLP in a standard class-incremental continual learning setting.
arXiv Detail & Related papers (2024-07-23T17:43:35Z) - KAN: Kolmogorov-Arnold Networks [16.782018138008578]
We propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).
KANs have learnable activation functions on edges ("weights").
We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability.
arXiv Detail & Related papers (2024-04-30T17:58:29Z) - SCHEME: Scalable Channel Mixer for Vision Transformers [52.605868919281086]
Vision Transformers have achieved impressive performance in many vision tasks.
Much less research has been devoted to the channel mixer or feature mixing block (FFN or MLP).
We show that the dense connections can be replaced with a diagonal block structure that supports larger expansion ratios.
arXiv Detail & Related papers (2023-12-01T08:22:34Z) - ReLU Fields: The Little Non-linearity That Could [62.228229880658404]
We investigate the smallest change to grid-based representations that allows for retaining the high-fidelity results of MLPs.
We show that such an approach becomes competitive with the state-of-the-art.
arXiv Detail & Related papers (2022-05-22T13:42:31Z) - On Graph Neural Networks versus Graph-Augmented MLPs [51.23890789522705]
Graph-Augmented Multi-Layer Perceptrons (GA-MLPs) first augment node features with certain multi-hop operators on the graph.
We prove a separation in expressive power between GA-MLPs and GNNs that grows exponentially in depth.
arXiv Detail & Related papers (2020-10-28T17:59:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.