Scalable and Interpretable Scientific Discovery via Sparse Variational Gaussian Process Kolmogorov-Arnold Networks (SVGP KAN)
- URL: http://arxiv.org/abs/2512.00260v1
- Date: Sat, 29 Nov 2025 00:48:55 GMT
- Title: Scalable and Interpretable Scientific Discovery via Sparse Variational Gaussian Process Kolmogorov-Arnold Networks (SVGP KAN)
- Authors: Y. Sungtaek Ju
- Abstract summary: Kolmogorov-Arnold Networks (KANs) offer a promising alternative to Multi-Layer Perceptrons (MLPs). Standard KANs lack probabilistic outputs, limiting their utility in applications requiring uncertainty quantification. We introduce the Sparse Variational GP-KAN, an architecture that integrates sparse variational inference with the KAN topology.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kolmogorov-Arnold Networks (KANs) offer a promising alternative to Multi-Layer Perceptrons (MLPs) by placing learnable univariate functions on network edges, enhancing interpretability. However, standard KANs lack probabilistic outputs, limiting their utility in applications requiring uncertainty quantification. While recent Gaussian Process (GP) extensions to KANs address this, they rely on exact inference methods that scale cubically with the data size $N$, restricting their application to smaller datasets. We introduce the Sparse Variational GP-KAN (SVGP-KAN), an architecture that integrates sparse variational inference with the KAN topology. By employing $M$ inducing points and analytic moment matching, our method reduces computational complexity from $O(N^3)$ to $O(NM^2)$, i.e., linear in the sample size, enabling the application of probabilistic KANs to larger scientific datasets. Furthermore, we demonstrate that integrating a permutation-based importance analysis enables the network to serve as a framework for structural identification, identifying relevant inputs and classifying functional relationships.
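The abstract's complexity claim is easy to see in code. Below is a minimal, illustrative NumPy sketch (not the authors' implementation) of a single SVGP edge function and a permutation-importance score; the function names, kernel choice, and hyperparameters are assumptions made for this example.

```python
# Illustrative sketch, not the paper's code: one sparse variational GP "edge"
# of a KAN, showing why M inducing points yield O(N M^2) cost per edge, plus
# a simple permutation-based importance score as described in the abstract.
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    """Squared-exponential kernel matrix between 1-D input vectors a and b."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def svgp_edge_moments(x, z, m, S, jitter=1e-6):
    """Predictive mean/variance of an SVGP edge (standard SVGP equations).

    x: (N,) edge inputs; z: (M,) inducing inputs;
    m: (M,) variational mean; S: (M, M) variational covariance.
    The largest objects are (N, M), so the cost is O(N M^2), not O(N^3).
    """
    Kzz = rbf(z, z) + jitter * np.eye(len(z))
    A = np.linalg.solve(Kzz, rbf(x, z).T).T        # K_xz K_zz^{-1}, shape (N, M)
    mean = A @ m
    kxx = np.full(len(x), 1.0)                     # k(x, x) for the unit-variance RBF
    var = kxx - np.einsum("nm,nm->n", A @ (Kzz - S), A)
    return mean, np.maximum(var, 0.0)

def permutation_importance(predict, X, y, j, n_repeats=10, seed=0):
    """Mean increase in MSE when column j of X is shuffled (illustrative)."""
    rng = np.random.default_rng(seed)
    base = np.mean((predict(X) - y) ** 2)
    drops = []
    for _ in range(n_repeats):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])                      # shuffles the view in place
        drops.append(np.mean((predict(Xp) - y) ** 2) - base)
    return float(np.mean(drops))
```

In a full SVGP-KAN, each node would sum the moments of its incoming edges and propagate mean and variance analytically to the next layer, which is presumably the analytic moment matching the abstract refers to.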
Related papers
- Optimal Abstractions for Verifying Properties of Kolmogorov-Arnold Networks (KANs) [8.114307305249929]
We present a novel approach for verifying properties of Kolmogorov-Arnold Networks (KANs). Our key contribution is a systematic framework that exploits KAN structure to find optimal abstractions. This approach determines the optimal approximation strategy for each unit while maintaining overall accuracy requirements.
arXiv Detail & Related papers (2026-02-06T14:33:41Z)
- Uncertainty Quantification for Scientific Machine Learning using Sparse Variational Gaussian Process Kolmogorov-Arnold Networks (SVGP KAN) [0.0]
Kolmogorov-Arnold Networks have emerged as interpretable alternatives to traditional multi-layer perceptrons. We present a framework integrating sparse variational Gaussian process inference with the Kolmogorov-Arnold topology.
arXiv Detail & Related papers (2025-12-04T22:58:32Z)
- (Sometimes) Less is More: Mitigating the Complexity of Rule-based Representation for Interpretable Classification [1.6504157612470989]
Deep neural networks are widely used in practical applications of AI, but their inner structure and complexity make them generally hard to interpret. In this work, a differentiable approximation of $L_0$ regularization is adapted into a logic-based neural network, the Multi-layer Logical Perceptron (MLLP), to study its efficacy in reducing the complexity of its discrete interpretable version, the Concept Rule Set (CRS). The results are compared to alternatives like Random Binarization of the network weights, to determine whether better results can be achieved with a less-noisy technique that sparsifies the network.
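For background, the differentiable $L_0$ surrogate mentioned here is commonly realized with hard-concrete gates (Louizos et al., 2018). The sketch below is a generic PyTorch illustration of such a gate, not the MLLP code; the class name and hyperparameters are assumptions.

```python
# Generic hard-concrete L0 gate (in the style of Louizos et al., 2018); an
# illustration of a differentiable L0 surrogate, not the MLLP implementation.
import math
import torch

class L0Gate(torch.nn.Module):
    def __init__(self, shape, beta=2/3, gamma=-0.1, zeta=1.1):
        super().__init__()
        self.log_alpha = torch.nn.Parameter(torch.zeros(shape))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # Reparameterized concrete sample; stretched and clamped ("hard").
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
        else:
            s = torch.sigmoid(self.log_alpha)
        return (s * (self.zeta - self.gamma) + self.gamma).clamp(0.0, 1.0)

    def expected_l0(self):
        # Probability that each gate is nonzero; the sum is the L0 penalty.
        p_open = torch.sigmoid(self.log_alpha - self.beta * math.log(-self.gamma / self.zeta))
        return p_open.sum()
```

Multiplying weights elementwise by the gate and adding a scaled expected_l0() term to the loss sparsifies the network differentiably, which is roughly the kind of mechanism being compared against Random Binarization here.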
arXiv Detail & Related papers (2025-09-26T14:13:08Z)
- The Power of Random Features and the Limits of Distribution-Free Gradient Descent [14.742677437485273]
We study the relationship between gradient-based optimization of parametric models (e.g., neural networks) and optimization of linear combinations of random features. Our main result shows that if a parametric model can be learned using mini-batch gradient descent (bSGD) without making assumptions about the data distribution, then with high probability the target function can also be approximated by a linear combination of random features.
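To make the object of this result concrete, here is a generic random-Fourier-features regression sketch (Rahimi-Recht style): a frozen random feature map followed by a learned linear combination. It is background illustration, not the paper's construction; all sizes and names are arbitrary.

```python
# Generic random Fourier features fit by ridge regression: the "linear
# combination of random features" object this result refers to.
import numpy as np

rng = np.random.default_rng(0)
N, d, D = 500, 3, 200                           # samples, input dim, num features

X = rng.normal(size=(N, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=N)  # toy target

W = rng.normal(size=(d, D))                     # frozen random projection
b = rng.uniform(0, 2 * np.pi, size=D)
Phi = np.sqrt(2.0 / D) * np.cos(X @ W + b)      # random features (approx. RBF kernel)

lam = 1e-3                                      # ridge strength
theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(D), Phi.T @ y)
pred = Phi @ theta                              # learned linear combination
print("train MSE:", np.mean((pred - y) ** 2))
```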
arXiv Detail & Related papers (2025-05-15T15:39:28Z)
- Geometric Neural Process Fields [58.77241763774756]
Geometric Neural Process Fields (G-NPF) is a probabilistic framework for neural radiance fields that explicitly captures uncertainty. Building on these bases, we design a hierarchical latent variable model, allowing G-NPF to integrate structural information across multiple spatial levels. Experiments on novel-view synthesis for 3D scenes, as well as 2D image and 1D signal regression, demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2025-02-04T14:17:18Z)
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains. We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Positional Encoder Graph Quantile Neural Networks for Geographic Data [4.277516034244117]
We propose a novel framework that combines PE-GNNs with Quantile Neural Networks, partially monotonic neural blocks, and post-hoc recalibration techniques. The PE-GQNN enables flexible and robust conditional density estimation with minimal assumptions about the target distribution, and it extends naturally to tasks beyond spatial data.
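For readers unfamiliar with Quantile Neural Networks, the standard training objective is the pinball loss; a minimal PyTorch sketch follows. It omits the paper's positional encodings, monotonic blocks, and recalibration, and all names and sizes are illustrative.

```python
# Generic pinball (quantile) loss, the standard objective behind quantile
# neural networks; a background sketch, not the PE-GQNN architecture.
import torch

def pinball_loss(pred, target, tau):
    """Pinball loss for quantile level tau in (0, 1)."""
    err = target - pred
    return torch.mean(torch.maximum(tau * err, (tau - 1) * err))

# A network emitting several quantiles at once, e.g. tau = 0.1, 0.5, 0.9.
taus = torch.tensor([0.1, 0.5, 0.9])
net = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.ReLU(), torch.nn.Linear(32, len(taus)))

x = torch.randn(64, 4)
y = torch.randn(64, 1)
q = net(x)                                   # (64, 3) predicted quantiles
loss = sum(pinball_loss(q[:, i:i + 1], y, taus[i]) for i in range(len(taus)))
loss.backward()
```

The paper's partially monotonic blocks address quantile crossing (e.g., the 0.1 quantile exceeding the 0.9 quantile), which this bare sketch does not prevent.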
arXiv Detail & Related papers (2024-09-27T16:02:12Z)
- Scalable Neural Network Kernels [22.299704296356836]
We introduce scalable neural network kernels (SNNKs), capable of approximating regular feedforward layers (FFLs).
We also introduce the neural network bundling process that applies SNNKs to compactify deep neural network architectures.
Our mechanism provides up to 5x reduction in the number of trainable parameters, while maintaining competitive accuracy.
arXiv Detail & Related papers (2023-10-20T02:12:56Z)
- Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
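As background on the $k$-homological features mentioned here, the sketch below builds boundary matrices for a tiny simplicial complex and computes the Hodge 1-Laplacian, whose kernel dimension counts independent 1-dimensional holes. The example complex is an assumption chosen for illustration, not from the paper.

```python
# Background sketch: Hodge 1-Laplacian L1 = B1^T B1 + B2 B2^T for a tiny
# simplicial complex (a filled triangle plus one extra edge). Zero
# eigenvalues of L1 count independent 1-D holes; example is illustrative.
import numpy as np

# Vertices 0..3; oriented edges (0,1), (1,2), (0,2), (2,3); triangle (0,1,2).
# B1: vertex-edge boundary matrix (rows: vertices, cols: edges).
B1 = np.array([
    [-1,  0, -1,  0],
    [ 1, -1,  0,  0],
    [ 0,  1,  1, -1],
    [ 0,  0,  0,  1],
], dtype=float)
# B2: edge-triangle boundary matrix; boundary of (0,1,2) = e01 + e12 - e02.
B2 = np.array([[1], [1], [-1], [0]], dtype=float)

L1 = B1.T @ B1 + B2 @ B2.T
holes = int(np.sum(np.abs(np.linalg.eigvalsh(L1)) < 1e-9))
print("dim H1 (number of 1-D holes):", holes)  # 0: the triangle is filled
```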
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
- Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
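For reference, the bound in question is the standard variational identity $\log p(x) \ge \mathbb{E}_{q(z)}[\log p(x, z) - \log q(z)] = \mathrm{ELBO}(q)$, whose gap is $\mathrm{KL}(q(z) \,\|\, p(z \mid x))$. The claim above is that for selective-SPN variational distributions with polynomial target log-densities, this expectation has a closed form rather than requiring samples.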
arXiv Detail & Related papers (2020-10-22T05:04:38Z)