Geometric Kolmogorov-Arnold Superposition Theorem
- URL: http://arxiv.org/abs/2502.16664v1
- Date: Sun, 23 Feb 2025 17:47:33 GMT
- Title: Geometric Kolmogorov-Arnold Superposition Theorem
- Authors: Francesco Alesiani, Takashi Maruyama, Henrik Christiansen, Viktor Zaverkin
- Abstract summary: The Kolmogorov-Arnold Network (KAN) was introduced as a trainable model to implement the Kolmogorov-Arnold Theorem (KAT). We propose a novel extension of KAT and KAN to incorporate equivariance and invariance over $O(n)$ group actions, enabling accurate and efficient modeling of physical systems.
- Score: 9.373581450684233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Kolmogorov-Arnold Theorem (KAT), or more generally, the Kolmogorov Superposition Theorem (KST), establishes that any non-linear multivariate function can be exactly represented as a finite superposition of non-linear univariate functions. Unlike the universal approximation theorem, which provides only an approximate representation without guaranteeing a fixed network size, KST offers a theoretically exact decomposition. The Kolmogorov-Arnold Network (KAN) was introduced as a trainable model to implement KAT, and recent advancements have adapted KAN using concepts from modern neural networks. However, KAN struggles to effectively model physical systems that require inherent equivariance or invariance to $E(3)$ transformations, a key property for many scientific and engineering applications. In this work, we propose a novel extension of KAT and KAN to incorporate equivariance and invariance over $O(n)$ group actions, enabling accurate and efficient modeling of these systems. Our approach provides a unified framework that bridges the gap between mathematical theory and practical architectures for physical systems, expanding the applicability of KAN to a broader class of problems.
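For reference, the exact decomposition asserted by KST can be written explicitly: for every continuous $f:[0,1]^n \to \mathbb{R}$ there exist continuous univariate functions $\Phi_q$ and $\phi_{q,p}$ such that

$$f(x_1,\dots,x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),$$

so an $n$-variate function is recovered from $(2n+1)(n+1)$ univariate ones; KANs make $\Phi_q$ and $\phi_{q,p}$ trainable (typically as splines).

To illustrate what invariance over $O(n)$ group actions means in this setting, the following sketch feeds a toy KAT-style readout only rotation-invariant scalars (pairwise inner products). This is one standard way to build $O(n)$-invariant models, not necessarily the construction proposed in the paper; all function and variable names are hypothetical.

```python
import numpy as np

def invariant_features(points):
    """Map an (m, n) array of m points in R^n to O(n)-invariant scalars:
    the Gram entries <x_i, x_j> are unchanged under points -> points @ R.T for R in O(n)."""
    gram = points @ points.T
    iu = np.triu_indices(len(points))          # keep each unordered pair once
    return gram[iu]

def toy_invariant_kan(points, inner, outer):
    """KAT-style readout on invariant scalars:
    y = sum_q Phi_q( sum_p phi_{q,p}(z_p) ), with z the invariant features."""
    z = invariant_features(points)
    return sum(Phi_q(sum(phi(z_p) for phi, z_p in zip(phi_q, z)))
               for phi_q, Phi_q in zip(inner, outer))

# Smoke test: the output does not change under a random orthogonal transformation.
rng = np.random.default_rng(0)
pts = rng.normal(size=(4, 3))                   # 4 points in R^3
d = 4 * 5 // 2                                  # number of invariant scalars
inner = [[np.tanh] * d for _ in range(3)]       # fixed toy univariate functions
outer = [np.sin] * 3
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random element of O(3)
print(np.isclose(toy_invariant_kan(pts, inner, outer),
                 toy_invariant_kan(pts @ Q.T, inner, outer)))
```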
Related papers
- asKAN: Active Subspace embedded Kolmogorov-Arnold Network [2.408451825799214]
The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications.
This study investigates the inflexibility of KAN through the lens of the Kolmogorov-Arnold theorem.
We propose active subspace embedded KAN, a hierarchical framework that synergizes KAN's function representation with active subspace methodology.
arXiv Detail & Related papers (2025-04-07T01:43:13Z)
- HKAN: Hierarchical Kolmogorov-Arnold Network without Backpropagation [1.3812010983144802]
The Hierarchical Kolmogorov-Arnold Network (HKAN) is a novel network architecture that offers a competitive alternative to the recently proposed Kolmogorov-Arnold Network (KAN).
HKAN adopts a randomized learning approach, where the parameters of its basis functions are fixed and the linear aggregations are optimized using least-squares regression.
Empirical results show that HKAN delivers comparable, if not superior, accuracy and stability relative to KAN across various regression tasks, while also providing insights into variable importance.
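The mechanism described above is essentially a random-feature fit: basis-function parameters are drawn once and frozen, and only the linear read-out is solved in closed form. A minimal sketch of that idea (a generic randomized least-squares regressor, not the authors' exact HKAN architecture, which stacks such stages hierarchically; all names are hypothetical):

```python
import numpy as np

def fit_random_basis(X, y, n_basis=64, seed=0):
    """Fixed random basis functions plus a least-squares read-out (no backpropagation).
    X: (N, d) inputs, y: (N,) targets."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_basis))    # basis parameters: random, then frozen
    b = rng.uniform(-np.pi, np.pi, size=n_basis)
    H = np.tanh(X @ W + b)                        # (N, n_basis) fixed nonlinear features
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)  # linear aggregation via least squares
    return W, b, coef

def predict(X, W, b, coef):
    return np.tanh(X @ W + b) @ coef

# Usage: fit a 1-D toy regression and report the training mean-squared error.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.normal(size=200)
W, b, coef = fit_random_basis(X, y)
print(np.mean((predict(X, W, b, coef) - y) ** 2))
```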
arXiv Detail & Related papers (2025-01-30T08:44:54Z)
- Nested Annealed Training Scheme for Generative Adversarial Networks [54.70743279423088]
This paper focuses on a rigorous mathematical framework: the composite-functional-gradient GAN (CFG).
We reveal the theoretical connection between the CFG model and score-based models.
We find that the training objective of the CFG discriminator is equivalent to finding an optimal $D(x)$.
arXiv Detail & Related papers (2025-01-20T07:44:09Z)
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
We propose Equivariant Kolmogorov-Arnold Networks (EKAN), a method for incorporating arbitrary matrix group equivariance into KANs.
EKAN achieves higher accuracy with smaller datasets or fewer parameters on symmetry-related tasks, such as particle scattering and the three-body problem.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Distribution learning via neural differential equations: a nonparametric statistical perspective [1.4436965372953483]
This work establishes the first general statistical convergence analysis for distribution learning via ODE models trained through likelihood transformations.
We show that the latter can be quantified via the $C^1$-metric entropy of the class $\mathcal{F}$.
We then apply this general framework to the setting of $C^k$-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes $\mathcal{F}$: $C^k$ functions and neural networks.
arXiv Detail & Related papers (2023-09-03T00:21:37Z)
- Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and Group Convolution [90.67482899242093]
A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input.
We provide an in-depth analysis of both methods and their equivalence and relate the two constructions to multiview convolutional networks.
We also derive new TFN non-linearities from our equivalence principle and test them on practical benchmark datasets.
arXiv Detail & Related papers (2022-11-29T03:42:11Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
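FA works by averaging a non-symmetric backbone over a small, input-dependent "frame" of group elements rather than over the whole group, which enforces the symmetry without collapsing expressive power. The following toy sketch applies the idea to $O(d)$ invariance for a point cloud using a PCA-based frame in the spirit of the paper's point-cloud treatment; translations, degenerate eigenvalues, and other details are ignored, and all names are hypothetical:

```python
import numpy as np
from itertools import product

def o_d_frame(points):
    """Input-dependent frame for O(d): principal axes of the centered point cloud,
    enumerated over all 2^d sign choices to cover the eigenvector sign ambiguity."""
    X = points - points.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)   # rows of Vt are principal axes
    return [np.diag(signs) @ Vt
            for signs in product([1.0, -1.0], repeat=Vt.shape[0])]

def frame_average(backbone, points):
    """O(d)-invariant prediction: average the backbone over poses canonicalized by the frame."""
    return np.mean([backbone(points @ R.T) for R in o_d_frame(points)], axis=0)

# Usage: an arbitrary (non-invariant) scalar backbone becomes invariant after frame averaging.
rng = np.random.default_rng(0)
pts = rng.normal(size=(5, 2))
backbone = lambda P: np.sum(P[:, 0] ** 3)              # deliberately not O(2)-invariant
Q, _ = np.linalg.qr(rng.normal(size=(2, 2)))           # random orthogonal transformation
print(frame_average(backbone, pts), frame_average(backbone, pts @ Q.T))
```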
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Universal Approximation Property of Neural Ordinary Differential Equations [19.861764482790544]
We show that NODEs can form an $L^p$-universal approximator for continuous maps under certain conditions.
We also show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms.
arXiv Detail & Related papers (2020-12-04T05:53:21Z)
- Deep Conditional Transformation Models [0.0]
Learning the cumulative distribution function (CDF) of an outcome variable conditional on a set of features remains challenging.
Conditional transformation models provide a semi-parametric approach that allows one to model a large class of conditional CDFs.
We propose a novel network architecture, provide details on different model definitions and derive suitable constraints.
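For context, the conditional transformation model referred to above represents the conditional CDF through a monotone transformation $h$ of the outcome combined with a fixed base CDF $F_Z$ (for example, the standard logistic or Gaussian CDF):

$$F(y \mid \mathbf{x}) = F_Z\bigl(h(y \mid \mathbf{x})\bigr), \qquad h(\cdot \mid \mathbf{x}) \text{ strictly increasing in } y,$$

so learning the conditional CDF reduces to learning the transformation $h$, which deep conditional transformation models parameterize with a network under suitable monotonicity constraints.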
arXiv Detail & Related papers (2020-10-15T16:25:45Z)
- Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers.
We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
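The efficiency claim above rests on a simple identity: running $m$ independent small linear members in parallel is exactly one block-diagonal linear layer (equivalently, a grouped convolution with $m$ groups). A minimal sketch of that equivalence with toy shapes; the names are hypothetical and this is not the paper's full construction:

```python
import numpy as np

def ensemble_forward(x_groups, weights):
    """Run m independent ensemble members: member i applies weights[i] to x_groups[i]."""
    return [W @ x for W, x in zip(weights, x_groups)]

def block_diagonal_forward(x_groups, weights):
    """Same computation as a single layer whose weight matrix is block diagonal."""
    rows = sum(W.shape[0] for W in weights)
    cols = sum(W.shape[1] for W in weights)
    big_W = np.zeros((rows, cols))
    r = c = 0
    for W in weights:
        big_W[r:r + W.shape[0], c:c + W.shape[1]] = W
        r, c = r + W.shape[0], c + W.shape[1]
    y = big_W @ np.concatenate(x_groups)
    return np.split(y, np.cumsum([W.shape[0] for W in weights])[:-1])

# Usage: both formulations agree, so the ensemble costs a single (grouped) layer.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)) for _ in range(2)]
x_groups = [rng.normal(size=4) for _ in range(2)]
print(all(np.allclose(a, b) for a, b in
          zip(ensemble_forward(x_groups, weights),
              block_diagonal_forward(x_groups, weights))))
```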
arXiv Detail & Related papers (2020-06-13T16:40:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.