FC-KAN: Function Combinations in Kolmogorov-Arnold Networks
- URL: http://arxiv.org/abs/2409.01763v2
- Date: Mon, 14 Oct 2024 06:07:40 GMT
- Title: FC-KAN: Function Combinations in Kolmogorov-Arnold Networks
- Authors: Hoang-Thang Ta, Duy-Quy Thai, Abu Bakar Siddiqur Rahman, Grigori Sidorov, Alexander Gelbukh
- Abstract summary: We introduce FC-KAN, a Kolmogorov-Arnold Network (KAN) that leverages combinations of popular mathematical functions on low-dimensional data.
We compare FC-KAN with a multi-layer perceptron (MLP) and other existing KANs, such as BSRBF-KAN, EfficientKAN, FastKAN, and FasterKAN.
A variant of FC-KAN, which combines the outputs of B-splines and Difference of Gaussians (DoG) in the form of a quadratic function, outperformed all other models on average over 5 independent training runs.
- Score: 48.39771439237495
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we introduce FC-KAN, a Kolmogorov-Arnold Network (KAN) that leverages combinations of popular mathematical functions such as B-splines, wavelets, and radial basis functions on low-dimensional data through element-wise operations. We explore several methods for combining the outputs of these functions, including sum, element-wise product, the addition of sum and element-wise product, quadratic function representation, and concatenation. In our experiments, we compare FC-KAN with a multi-layer perceptron (MLP) and other existing KANs, such as BSRBF-KAN, EfficientKAN, FastKAN, and FasterKAN, on the MNIST and Fashion-MNIST datasets. A variant of FC-KAN, which combines the outputs of B-splines and Difference of Gaussians (DoG) in the form of a quadratic function, outperformed all other models on average over 5 independent training runs. We expect that function combinations like those in FC-KAN can inform the design of future KANs. Our repository is publicly available at: https://github.com/hoangthangta/FC_KAN.
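The combination schemes above are plain element-wise operations and are easy to sketch. Below is a minimal illustration, not the authors' implementation (see the linked repository for that); the tensor names and the exact quadratic form are assumptions made for this sketch.

```python
import torch

def combine(spline_out: torch.Tensor, dog_out: torch.Tensor, mode: str) -> torch.Tensor:
    """Element-wise combinations of two basis outputs, as named in the abstract.

    spline_out, dog_out: same-shape outputs of the B-spline and DoG bases
    of one layer, evaluated on the same input (illustrative names).
    """
    if mode == "sum":
        return spline_out + dog_out
    if mode == "product":
        return spline_out * dog_out
    if mode == "sum_product":  # addition of sum and element-wise product
        return spline_out + dog_out + spline_out * dog_out
    if mode == "quadratic":  # assumed full quadratic form in the two outputs
        return spline_out + dog_out + spline_out**2 + dog_out**2 + spline_out * dog_out
    if mode == "concat":  # doubles the feature dimension
        return torch.cat([spline_out, dog_out], dim=-1)
    raise ValueError(f"unknown mode: {mode}")
```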
Related papers
- asKAN: Active Subspace embedded Kolmogorov-Arnold Network [2.408451825799214]
The Kolmogorov-Arnold Network (KAN) has emerged as a promising neural network architecture for small-scale AI+Science applications.
This study investigates a structural inflexibility of KANs through the lens of the Kolmogorov-Arnold theorem.
We propose active subspace embedded KAN, a hierarchical framework that synergizes KAN's function representation with active subspace methodology.
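The active subspace step it builds on is standard: estimate the expected gradient outer product and keep its leading eigenvectors. A minimal NumPy sketch of that step, independent of the paper's specific pipeline (grad_f is a stand-in for the gradient of the target function):

```python
import numpy as np

def active_subspace(grad_f, X: np.ndarray, k: int) -> np.ndarray:
    """Top-k active directions from gradient samples.

    grad_f: callable mapping inputs of shape (n, d) to gradients of shape (n, d).
    """
    G = grad_f(X)                     # gradient samples, shape (n, d)
    C = G.T @ G / len(X)              # Monte Carlo estimate of E[grad grad^T]
    _, eigvecs = np.linalg.eigh(C)    # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]    # eigenvectors of the k largest eigenvalues
```

A KAN can then be fit on the projected coordinates X @ W instead of the raw inputs.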
arXiv Detail & Related papers (2025-04-07T01:43:13Z)
- AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning [4.843466576537832]
Kolmogorov-Arnold Networks (KANs) have inspired numerous works exploring their applications across a wide range of scientific problems.
We introduce Activation Function-Based Kolmogorov-Arnold Networks (AF-KAN), expanding ReLU-KAN with various activations and their function combinations.
This novel KAN also incorporates parameter reduction methods, primarily attention mechanisms and data normalization, to enhance performance on image classification datasets.
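For context, the ReLU-KAN basis that AF-KAN generalizes replaces splines with squared products of shifted ReLUs; a sketch following the published ReLU-KAN formulation (the grid endpoints s, e are assumptions of this sketch):

```python
import torch

def relu_kan_basis(x: torch.Tensor, s: torch.Tensor, e: torch.Tensor) -> torch.Tensor:
    """ReLU-KAN-style basis: one smooth bump per grid interval [s_i, e_i].

    x: inputs of shape (n, 1); s, e: interval endpoints of shape (g,).
    Returns (n, g); AF-KAN swaps the ReLUs for other activation functions.
    """
    r = torch.relu(e - x) * torch.relu(x - s)   # nonzero only inside [s_i, e_i]
    return r**2 * (16.0 / (e - s) ** 4)         # normalized so each bump peaks at 1
```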
arXiv Detail & Related papers (2025-03-08T07:38:51Z)
- LSS-SKAN: Efficient Kolmogorov-Arnold Networks based on Single-Parameterized Function [4.198997497722401]
Kolmogorov-Arnold Networks (KANs) have attracted increasing attention due to their high visualizability.
We propose SKAN, an efficient KAN whose basis function uses only a single learnable parameter.
LSS-SKAN exhibited superior performance on the MNIST dataset compared to all tested pure KAN variants.
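As a purely hypothetical illustration of a single-parameterized edge function (the paper's actual LSS function is defined there), each edge can carry exactly one learnable scale:

```python
import torch
import torch.nn as nn

class SingleParamEdge(nn.Module):
    """Hypothetical SKAN-style layer: one learnable scale k per edge,
    phi_k(x) = k * softplus(x). Illustrative only, not the paper's LSS function."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.k = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Evaluate phi on every (input, output) edge, then sum over inputs.
        phi = self.k * nn.functional.softplus(x).unsqueeze(1)  # (n, out, in)
        return phi.sum(dim=-1)                                 # (n, out)
```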
arXiv Detail & Related papers (2024-10-19T02:44:35Z)
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect symmetry in tasks, which is crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Kolmogorov-Arnold Transformer [72.88137795439407]
We introduce the Kolmogorov-Arnold Transformer (KAT), a novel architecture that replaces MLP layers with Kolmogorov-Arnold Network (KAN) layers.
We identify three key challenges: (C1) base function choice, (C2) parameter and computation inefficiency, and (C3) weight initialization.
With these designs, KAT outperforms traditional MLP-based transformers.
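Architecturally the change is local: the transformer block keeps its attention sublayer and swaps the MLP sublayer for a KAN. A schematic sketch (kan_layer is a placeholder module, not KAT's actual rational-function KAN implementation):

```python
import torch.nn as nn

class KATBlock(nn.Module):
    """Transformer block with the MLP sublayer replaced by a KAN sublayer (schematic)."""

    def __init__(self, dim: int, heads: int, kan_layer: nn.Module):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.kan = kan_layer  # stands in for KAT's KAN-based channel mixer

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # self-attention sublayer
        x = x + self.kan(self.norm2(x))                    # KAN replaces the usual MLP
        return x
```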
arXiv Detail & Related papers (2024-09-16T17:54:51Z)
- Rethinking the Function of Neurons in KANs [1.223779595809275]
The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem.
In this work, we investigate the potential for identifying an alternative multivariate function for KAN neurons that may offer increased practical utility.
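Concretely, the question is what should replace the sum over incoming edge functions inside each neuron. A minimal sketch of candidate aggregations (the alternatives are illustrative, not the paper's final choice):

```python
import torch

def aggregate(edge_outputs: torch.Tensor, how: str = "sum") -> torch.Tensor:
    """Aggregate per-edge function outputs (n, out, in) into neuron outputs (n, out).

    'sum' is the classical KAN neuron; the others are illustrative alternatives.
    """
    if how == "sum":
        return edge_outputs.sum(dim=-1)
    if how == "mean":
        return edge_outputs.mean(dim=-1)
    if how == "softmax":  # smooth approximation of the maximum
        return torch.logsumexp(edge_outputs, dim=-1)
    raise ValueError(f"unknown aggregation: {how}")
```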
arXiv Detail & Related papers (2024-07-30T09:04:23Z)
- BSRBF-KAN: A combination of B-splines and Radial Basis Functions in Kolmogorov-Arnold Networks [3.844398528249339]
We introduce BSRBF-KAN, a Kolmogorov-Arnold Network (KAN) that combines B-splines and radial basis functions (RBFs) to fit input vectors during training.
BSRBF-KAN shows stability in 5 training runs with a competitive average accuracy of 97.55% on MNIST and 89.33% on Fashion-MNIST.
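The Gaussian RBF half of that combination is straightforward to write down; a minimal sketch (the width and the additive combination with a base term are assumptions based on the abstract):

```python
import torch

def gaussian_rbf_basis(x: torch.Tensor, centers: torch.Tensor, width: float = 1.0) -> torch.Tensor:
    """Gaussian RBF features: one bump per grid center.

    x: (n, d) inputs; centers: (g,) grid points; returns (n, d, g).
    """
    return torch.exp(-((x.unsqueeze(-1) - centers) / width) ** 2)

# Schematically, a BSRBF-style layer then sums three element-wise terms:
#   out = base_linear(x) + bspline_term(x) + rbf_term(x)
```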
arXiv Detail & Related papers (2024-06-17T03:26:02Z)
- NPEFF: Non-Negative Per-Example Fisher Factorization [52.44573961263344]
We introduce a novel interpretability method called NPEFF that is readily applicable to any end-to-end differentiable model.
We demonstrate that NPEFF has interpretable tunings through experiments on language and vision models.
arXiv Detail & Related papers (2023-10-07T02:02:45Z)
- Graph-Regularized Manifold-Aware Conditional Wasserstein GAN for Brain Functional Connectivity Generation [13.009230460620369]
We propose a graph-regularized conditional Wasserstein GAN (GR-SPD-GAN) for functional connectivity (FC) data generation on the manifold of symmetric positive definite (SPD) matrices.
The GR-SPD-GAN clearly outperforms several state-of-the-art GANs in generating more realistic fMRI-based FC samples.
arXiv Detail & Related papers (2022-12-10T14:51:44Z)
- Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and Fashion-MNIST datasets.
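The mixture bound has a compact form. For S mixture components q_s, the MISELBO objective reads (the standard ELBO is recovered at S = 1):

```latex
\mathcal{L}^{S}(x)
  = \frac{1}{S} \sum_{s=1}^{S}
    \mathbb{E}_{z \sim q_{s}(z \mid x)}
    \left[ \log \frac{p(x, z)}{\frac{1}{S} \sum_{j=1}^{S} q_{j}(z \mid x)} \right]
```

The averaged densities in the denominator are what reward diversity among the components, which is where the connection to adaptive importance sampling enters.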
arXiv Detail & Related papers (2022-09-30T15:01:35Z)
- Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose the Graph-adaptive Rectified Linear Unit (GReLU), a new parametric activation function that incorporates neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
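As a purely hypothetical sketch of the idea (not GReLU's actual parameterization), a node-adaptive ReLU can derive its slopes from aggregated neighbor features:

```python
import torch

def graph_adaptive_relu(h: torch.Tensor, adj: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Hypothetical graph-adaptive activation; see the paper for GReLU itself.

    h: (n, d) node features; adj: (n, n) normalized adjacency; w: (d,) weights.
    """
    neighbor_mean = adj @ h                    # aggregate neighbor features
    slopes = torch.sigmoid(neighbor_mean * w)  # per-node, per-channel negative slopes
    return torch.where(h > 0, h, slopes * h)   # leaky ReLU with adaptive slopes
```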
arXiv Detail & Related papers (2022-02-13T10:54:59Z)
- Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
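A simple member of that family (a hedged illustration, not LAF's full parameterization) is a learnable power mean, which interpolates between the mean and the maximum as the exponent grows:

```python
import torch

def generalized_mean(x: torch.Tensor, r: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Learnable power-mean aggregator: ((1/n) * sum x_i^r)^(1/r) over the set axis.

    x: (batch, set_size) non-negative set elements; r: learnable exponent (r >= 1).
    r = 1 gives the mean; large r approaches the maximum.
    """
    return (x.clamp_min(eps) ** r).mean(dim=-1) ** (1.0 / r)
```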
arXiv Detail & Related papers (2020-12-15T18:28:53Z)
- Reciprocal Adversarial Learning via Characteristic Functions [12.961770002117142]
Generative adversarial nets (GANs) have become a preferred tool for tasks involving complicated distributions.
We show how to use the characteristic function (CF) to compare the distributions rather than their moments.
We then prove an equivalence between the embedded and data domains when a reciprocal exists, which leads naturally to a GAN built in an auto-encoder structure.
This efficient structure uses only two modules and a simple training strategy to generate clear images bi-directionally.
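The comparison relies on empirical characteristic functions phi_X(t) = E[exp(i<t, X>)]. A minimal sketch of a sampled CF discrepancy between two batches (the random frequency-sampling scheme here is an assumption, not the paper's exact weighting):

```python
import torch

def cf_distance(x: torch.Tensor, y: torch.Tensor, num_freqs: int = 64) -> torch.Tensor:
    """Mean squared difference of empirical characteristic functions at random frequencies.

    x, y: (n, d) samples from the two distributions being compared.
    """
    t = torch.randn(num_freqs, x.shape[1])  # random frequency vectors

    def ecf(z: torch.Tensor) -> torch.Tensor:  # empirical CF: E[exp(i * <t, z>)]
        proj = z @ t.T                         # (n, num_freqs)
        return torch.complex(proj.cos().mean(0), proj.sin().mean(0))

    return (ecf(x) - ecf(y)).abs().pow(2).mean()
```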
arXiv Detail & Related papers (2020-06-15T14:04:55Z)