asKAN: Active Subspace embedded Kolmogorov-Arnold Network
- URL: http://arxiv.org/abs/2504.04669v2
- Date: Wed, 09 Apr 2025 08:34:59 GMT
- Title: asKAN: Active Subspace embedded Kolmogorov-Arnold Network
- Authors: Zhiteng Zhou, Zhaoyue Xu, Yi Liu, Shizhao Wang
- Abstract summary: The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications. This study investigates KAN's inflexibility in modeling ridge functions through the lens of the Kolmogorov-Arnold theorem. We propose active subspace embedded KAN (asKAN), a hierarchical framework that synergizes KAN's function representation with active subspace methodology.
- Score: 2.408451825799214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications. However, it suffers from inflexibility in modeling ridge functions, which are widely used to represent relationships in physical systems. This study investigates this inflexibility through the lens of the Kolmogorov-Arnold theorem, which builds the representation of a multivariate function from univariate components rather than from combinations of the independent variables. Our analysis reveals that incorporating linear combinations of independent variables can substantially simplify the network architecture needed to represent ridge functions. Inspired by this finding, we propose active subspace embedded KAN (asKAN), a hierarchical framework that synergizes KAN's function representation with active subspace methodology. The architecture strategically embeds active subspace detection between KANs: the active subspace method identifies the primary ridge directions, and the independent variables are adaptively projected onto these critical dimensions. asKAN is implemented iteratively without increasing the number of neurons in the original KAN. The proposed method is validated through function fitting, solving the Poisson equation, and reconstructing a sound field. Compared with KAN, asKAN significantly reduces the error using the same network architecture. The results suggest that asKAN enhances the capability of KAN in fitting and solving equations that take the form of ridge functions.
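The abstract describes the loop concretely enough to sketch. Below is a minimal, hypothetical illustration of one asKAN-style iteration: fit a surrogate (a stand-in for a KAN; `fit_surrogate` and all hyperparameters here are assumptions, not the authors' implementation), estimate the active subspace from a Monte Carlo average of gradient outer products, and project the inputs onto the dominant ridge directions before the next fit.

```python
# Hedged sketch of an active-subspace iteration, not the paper's code.
import numpy as np

def fit_surrogate(X, y, seed=0):
    """Stand-in for KAN training: least squares on random tanh features."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], 64))
    Phi = np.tanh(X @ W)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return lambda Z: np.tanh(Z @ W) @ coef

def gradients(model, X, eps=1e-4):
    """Central-difference gradients; a real KAN would expose autograd."""
    G = np.zeros_like(X)
    for j in range(X.shape[1]):
        d = np.zeros(X.shape[1]); d[j] = eps
        G[:, j] = (model(X + d) - model(X - d)) / (2 * eps)
    return G

def askan_step(X, y, k):
    """One iteration: fit, estimate ridge directions, project the inputs."""
    model = fit_surrogate(X, y)
    G = gradients(model, X)
    C = G.T @ G / len(X)                            # Monte Carlo E[grad grad^T]
    eigvals, eigvecs = np.linalg.eigh(C)
    U = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # top-k ridge directions
    return X @ U, U                                 # projected inputs for the next fit

# Toy ridge function f(x) = sin(w . x): a single dominant direction.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(500, 5))
w = np.array([3.0, 1.0, 0.0, 0.0, 0.0])
y = np.sin(X @ w)
Z, U = askan_step(X, y, k=1)
print("recovered ridge direction:", np.round(U[:, 0], 2))
```

On this toy problem the leading eigenvector of the gradient covariance aligns with w, so the projected input Z is one-dimensional and far easier to fit than the original five-dimensional input.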
Related papers
- LeanKAN: A Parameter-Lean Kolmogorov-Arnold Network Layer with Improved Memory Efficiency and Convergence Behavior [0.0]
The Kolmogorov-Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. Here, we find that MultKAN layers suffer from limited applicability in output layers. We propose LeanKAN, a direct and modular replacement for MultKAN and traditional AddKAN layers.
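For orientation, a hedged toy contrast between the two layer types named above, under the common convention that an AddKAN node sums its learnable univariate edge functions while a MultKAN node multiplies them; `edge` is a hypothetical stand-in for a learned spline, not the paper's layer.

```python
# Toy AddKAN vs. MultKAN node aggregation (illustrative assumption).
import numpy as np

def edge(x, a, b):
    """Toy univariate edge function; a real layer would use a learned spline."""
    return a * np.sin(x) + b * x

def addkan_node(x, params):
    """Additive node: sum of per-input edge functions."""
    return sum(edge(x[i], *p) for i, p in enumerate(params))

def multkan_node(x, params):
    """Multiplicative node: product of per-input edge functions."""
    out = 1.0
    for i, p in enumerate(params):
        out *= edge(x[i], *p)
    return out

x = np.array([0.5, -1.0])
params = [(1.0, 0.2), (0.7, -0.1)]
print(addkan_node(x, params), multkan_node(x, params))
```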
arXiv Detail & Related papers (2025-02-25T04:43:41Z)
- Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning [56.22841701016295]
Supervised Causal Learning (SCL) is an emerging paradigm in causal discovery.
Existing Deep Neural Network (DNN)-based methods commonly adopt the "Node-Edge approach".
arXiv Detail & Related papers (2025-02-15T19:10:35Z)
- A Survey on Kolmogorov-Arnold Network [0.0]
This review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KANs).
KANs distinguish themselves from traditional neural networks by using learnable, spline-parameterized functions instead of fixed activation functions.
This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
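A minimal sketch of the mechanism the survey highlights: each KAN edge carries a learnable univariate function, here parameterized by B-spline coefficients (in contrast to an MLP, where edges carry scalar weights and nodes carry fixed activations). The grid size and spline degree below are illustrative choices, not values from the survey.

```python
# One learnable KAN-style edge activation as a B-spline (illustrative).
import numpy as np
from scipy.interpolate import BSpline

degree, n_coef = 3, 8
inner = np.linspace(-1, 1, n_coef - degree + 1)
knots = np.concatenate([[inner[0]] * degree, inner, [inner[-1]] * degree])

def kan_edge(x, coef):
    """Edge activation: a B-spline whose coefficients are the trainable parameters."""
    return BSpline(knots, coef, degree, extrapolate=True)(x)

rng = np.random.default_rng(0)
coef = rng.normal(size=n_coef)        # these coefficients would be trained
x = np.linspace(-1, 1, 5)
print(kan_edge(x, coef))
```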
arXiv Detail & Related papers (2024-11-09T05:54:17Z)
- Neural Control Variates with Automatic Integration [49.91408797261987]
This paper proposes a novel approach to constructing learnable parametric control variate functions from arbitrary neural network architectures.
We use the network to approximate the anti-derivative of the integrand.
We apply our method to solve partial differential equations using the Walk-on-Spheres algorithm.
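The variance-reduction identity behind this construction can be shown in a toy form: if G approximates the anti-derivative of the integrand f, then g = G' serves as a control variate whose integral G(b) - G(a) is known exactly, so the Monte Carlo estimate becomes (b-a)E[f - g] + G(b) - G(a). The sketch below uses a fitted polynomial in place of the paper's network.

```python
# Control variate from an (approximate) anti-derivative; toy stand-in for a network.
import numpy as np

f = lambda x: np.exp(-x**2)              # integrand on [a, b]
a, b = 0.0, 1.0

# Toy "network": fit a polynomial to f, then integrate it analytically.
xs = np.linspace(a, b, 50)
coef = np.polyfit(xs, f(xs), deg=4)      # approximates g = G'
G = np.polyint(coef)                     # exact anti-derivative of the fit
g = lambda x: np.polyval(coef, x)        # control variate

rng = np.random.default_rng(0)
x = rng.uniform(a, b, size=10_000)
plain = (b - a) * np.mean(f(x))
cv = (b - a) * np.mean(f(x) - g(x)) + (np.polyval(G, b) - np.polyval(G, a))
print(f"plain MC {plain:.6f}  with control variate {cv:.6f}")
```

Because f - g is nearly zero everywhere, the residual Monte Carlo term has far lower variance than the plain estimate.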
arXiv Detail & Related papers (2024-09-23T06:04:28Z)
- Activation Space Selectable Kolmogorov-Arnold Networks [29.450377034478933]
The Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLP-based methods.
Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks.
This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
arXiv Detail & Related papers (2024-08-15T11:34:05Z)
- Rethinking the Function of Neurons in KANs [1.223779595809275]
The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem.
In this work, we investigate the potential for identifying an alternative multivariate function for KAN neurons that may offer increased practical utility.
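The question raised here admits a compact illustration: a KAN neuron aggregates its incoming edge outputs by plain summation, per the representation theorem, and alternative multivariate node functions are drop-in replacements one could evaluate. The alternatives below are illustrative choices, not the paper's proposals.

```python
# Standard KAN node aggregation (sum) vs. hypothetical alternatives.
import numpy as np

edge_outputs = np.array([0.8, -0.3, 1.1])   # phi_i(x_i) from incoming edges

aggregators = {
    "sum (standard KAN)": np.sum,
    "mean": np.mean,
    "product": np.prod,
    "logsumexp": lambda v: np.log(np.sum(np.exp(v))),
}
for name, agg in aggregators.items():
    print(f"{name:20s} -> {agg(edge_outputs):+.3f}")
```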
arXiv Detail & Related papers (2024-07-30T09:04:23Z)
- KAN-ODEs: Kolmogorov-Arnold Network Ordinary Differential Equations for Learning Dynamical Systems and Hidden Physics [0.0]
Kolmogorov-Arnold networks (KANs) are an alternative to multi-layer perceptrons (MLPs)
This work applies KANs as the backbone of a neural ordinary differential equation (ODE) framework.
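A hedged sketch of this setup: the right-hand side of an ODE is a learnable network (here a toy tanh model standing in for a KAN; `kan_rhs` and its weights are assumptions, not the paper's architecture), and trajectories come from a standard solver, so training fits the network to observed dynamics.

```python
# Neural-ODE skeleton with a stand-in for a KAN right-hand side.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 16))
W2 = rng.normal(size=(16, 2))

def kan_rhs(t, u):
    """Stand-in for a trained KAN mapping state u -> du/dt."""
    return np.tanh(u @ W1) @ W2 * 0.1

sol = solve_ivp(kan_rhs, (0.0, 5.0), y0=[1.0, 0.0])
print(sol.y[:, -1])   # state at t = 5; training would fit W1, W2 to data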
arXiv Detail & Related papers (2024-07-05T00:38:49Z)
- Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions [0.552480439325792]
The storage capacity of a binary classification model is the maximum number of random input-output pairs per parameter that the model can learn.
We analyze the structure of the solution space and the storage capacity of fully connected two-layer neural networks with general activation functions.
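The definition above can be written out; the notation below is assumed for illustration (P labeled pairs, N parameters, f_theta the network), not taken from the paper.

```latex
% Storage capacity as the largest learnable load (illustrative notation):
\alpha_c = \frac{P_{\max}}{N}, \qquad
P_{\max} = \max\Bigl\{\, P : \exists\, \theta \ \text{s.t.}\
\operatorname{sign} f_\theta(x^{\mu}) = y^{\mu}, \ \mu = 1,\dots,P \,\Bigr\}.
```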
arXiv Detail & Related papers (2024-04-20T15:12:47Z)
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to those of Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
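A hedged sketch of the shared template such a superstructure unrolls: a stage-based recurrence whose weights (A, b) are trainable; fixing them to the classic Butcher tableau recovers RK4. The template below is an illustration of the idea, not the paper's architecture.

```python
# Stage-based recurrent template; RK4 is one point in its weight space.
import numpy as np

def recurrent_step(f, x, h, A, b):
    """One unrolled pass: x_{n+1} = x_n + h * sum_i b_i * f(x_n + h * sum_j A_ij * k_j)."""
    stages = []
    for i in range(len(b)):
        xi = x + h * sum(A[i][j] * kj for j, kj in enumerate(stages))
        stages.append(f(xi))
    return x + h * sum(bi * ki for bi, ki in zip(b, stages))

# Classic RK4 tableau as fixed (untrained) weights.
A = [[], [0.5], [0.0, 0.5], [0.0, 0.0, 1.0]]
b = [1/6, 1/3, 1/3, 1/6]

f = lambda x: -x                       # dx/dt = -x, exact solution exp(-t)
x, h = 1.0, 0.1
for _ in range(10):
    x = recurrent_step(f, x, h, A, b)
print(x, np.exp(-1.0))                 # close agreement at t = 1
```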
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures that are invertible by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
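Invertibility by design can be illustrated with an affine coupling layer, a common INN building block (as in RealNVP-style flows): the forward map inverts in closed form and its log-det Jacobian is a cheap sum. The subnetworks `s_net` and `t_net` below are toy stand-ins.

```python
# Affine coupling layer: exact inverse and tractable log-det Jacobian.
import numpy as np

rng = np.random.default_rng(0)
Ws, Wt = rng.normal(size=(1, 1)), rng.normal(size=(1, 1))
s_net = lambda x: np.tanh(x @ Ws)        # scale subnetwork (toy)
t_net = lambda x: x @ Wt                 # shift subnetwork (toy)

def forward(x):
    x1, x2 = x[:, :1], x[:, 1:]
    s, t = s_net(x1), t_net(x1)
    y2 = x2 * np.exp(s) + t              # invertible given x1
    logdet = s.sum(axis=1)               # log-determinant of the Jacobian
    return np.hstack([x1, y2]), logdet

def inverse(y):
    y1, y2 = y[:, :1], y[:, 1:]
    s, t = s_net(y1), t_net(y1)
    return np.hstack([y1, (y2 - t) * np.exp(-s)])

x = rng.normal(size=(4, 2))
y, logdet = forward(x)
print(np.allclose(inverse(y), x))        # True: exact inversion by design
```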
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
- Deep Archimedean Copulas [98.96141706464425]
ACNet is a novel differentiable neural network architecture that enforces the structural properties of Archimedean copulas.
We show that ACNet is able to both approximate common Archimedean Copulas and generate new copulas which may provide better fits to data.
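The structure ACNet targets is compact: an Archimedean copula is built from a single generator psi via C(u, v) = psi(psi⁻¹(u) + psi⁻¹(v)); ACNet learns the generator with a constrained network, whereas the toy below uses the closed-form Clayton generator for illustration.

```python
# Archimedean copula from a generator; Clayton chosen as a closed-form example.
import numpy as np

theta = 2.0
psi = lambda t: (1.0 + t) ** (-1.0 / theta)   # Clayton generator
psi_inv = lambda u: u ** (-theta) - 1.0

def copula(u, v):
    """Joint CDF of uniform marginals under the Archimedean construction."""
    return psi(psi_inv(u) + psi_inv(v))

print(copula(0.5, 0.5))
print(copula(0.3, 0.9))
```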
arXiv Detail & Related papers (2020-12-05T22:58:37Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
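A hedged sketch of the min-max formulation: the structural function f and an adversarial test function g are both parameterized (below as toy linear models rather than the paper's neural networks) and trained by alternating gradient steps on a conditional-moment objective.

```python
# Toy adversarial (min-max) estimation of a structural coefficient.
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 1))                # instrument
X = Z + 0.3 * rng.normal(size=(1000, 1))      # endogenous regressor
Y = 2.0 * X + 0.3 * rng.normal(size=(1000, 1))

wf, wg = 0.0, 0.0     # parameters of f(x) = wf*x (learner), g(z) = wg*z (adversary)
lr = 0.05
for _ in range(500):
    resid = Y - wf * X                        # structural residual
    # Objective: E[g(Z) * resid] - 0.5 * E[g(Z)^2]; max over wg, min over wf.
    grad_g = np.mean(Z * resid) - wg * np.mean(Z**2)
    wg += lr * grad_g                         # ascent step for the adversary
    grad_f = -wg * np.mean(Z * X)
    wf -= lr * grad_f                         # descent step for the learner
print(f"estimated coefficient: {wf:.3f} (true 2.0)")
```

At the equilibrium of this game the moment condition E[Z(Y - wf*X)] = 0 holds, so the learner recovers the instrumental-variable solution.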
arXiv Detail & Related papers (2020-07-02T17:55:47Z)