Convolutional Kolmogorov-Arnold Networks
- URL: http://arxiv.org/abs/2406.13155v3
- Date: Mon, 31 Mar 2025 12:55:11 GMT
- Title: Convolutional Kolmogorov-Arnold Networks
- Authors: Alexander Dylan Bodner, Antonio Santiago Tepsich, Jack Natan Spolski, Santiago Pourteau
- Abstract summary: We present Convolutional Kolmogorov-Arnold Networks (KANs). KANs replace traditional fixed-weight kernels with learnable non-linear functions. We empirically evaluate Convolutional KANs on the Fashion-MNIST dataset, demonstrating competitive accuracy with up to 50% fewer parameters compared to baseline CNNs.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present Convolutional Kolmogorov-Arnold Networks, a novel architecture that integrates the learnable spline-based activation functions of Kolmogorov-Arnold Networks (KANs) into convolutional layers. By replacing traditional fixed-weight kernels with learnable non-linear functions, Convolutional KANs offer a significant improvement in parameter efficiency and expressive power over standard Convolutional Neural Networks (CNNs). We empirically evaluate Convolutional KANs on the Fashion-MNIST dataset, demonstrating competitive accuracy with up to 50% fewer parameters compared to baseline classic convolutions. This suggests that the KAN Convolution can effectively capture complex spatial relationships with fewer resources, offering a promising alternative for parameter-efficient deep learning models.
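To make the core idea concrete, below is a minimal sketch of a KAN-style convolution in PyTorch. This is our illustration under stated assumptions, not the authors' reference implementation: each kernel element applies a learnable univariate function phi(x) to its input value instead of multiplying by a fixed scalar weight, with a Gaussian RBF basis standing in for the B-splines used in the paper.
```python
# A minimal sketch of a KAN-style convolution (assumption: RBF basis in
# place of B-splines; not the authors' reference implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class KANConv2d(nn.Module):
    """Convolution where each kernel element applies a learnable univariate
    function phi(x) to its input value, and the results are summed over the
    receptive field (stride 1, no padding, for simplicity)."""

    def __init__(self, in_ch, out_ch, kernel_size, num_basis=8, x_range=(-2.0, 2.0)):
        super().__init__()
        self.kernel_size = kernel_size
        num_el = in_ch * kernel_size * kernel_size  # kernel elements per output channel
        # Fixed RBF centers on a 1-D grid (a stand-in for spline knots).
        self.register_buffer("grid", torch.linspace(x_range[0], x_range[1], num_basis))
        self.h = (x_range[1] - x_range[0]) / (num_basis - 1)  # RBF width
        # Learnable basis coefficients and a SiLU base weight per kernel element.
        self.coef = nn.Parameter(0.1 * torch.randn(out_ch, num_el, num_basis))
        self.base = nn.Parameter(torch.ones(out_ch, num_el))

    def forward(self, x):
        n, _, hgt, wid = x.shape
        k = self.kernel_size
        patches = F.unfold(x, k)  # (N, num_el, L) with L spatial positions
        rbf = torch.exp(-((patches.unsqueeze(-1) - self.grid) / self.h) ** 2)
        # phi(x) = base * SiLU(x) + sum_b coef_b * RBF_b(x), summed over elements.
        spline = torch.einsum("nelb,oeb->nol", rbf, self.coef)
        silu = torch.einsum("nel,oe->nol", F.silu(patches), self.base)
        return (spline + silu).view(n, -1, hgt - k + 1, wid - k + 1)

# Usage on Fashion-MNIST-sized input:
layer = KANConv2d(1, 8, kernel_size=3)
print(layer(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 8, 26, 26])
```
Note the parameter trade-off this makes explicit: every kernel element carries num_basis + 1 coefficients instead of one weight, so the claimed savings come from needing fewer channels and layers overall, not from cheaper individual kernels.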
Related papers
- PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks [47.947045173329315]
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures.
KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, Recurrent Neural Networks (RNNs) and Transformers.
This paper introduces PRKANs, which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers.
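As a rough worked comparison (our back-of-the-envelope arithmetic, not figures from the paper): a dense MLP layer mapping $n$ inputs to $m$ outputs has $nm$ weights, while a plain KAN layer places a learnable univariate function with about $B$ basis coefficients on every edge,

$$ \#\text{params}_{\mathrm{MLP}} = nm, \qquad \#\text{params}_{\mathrm{KAN}} \approx nm \cdot B, $$

so with a typical basis size of $B \approx 8$ to $16$, a naive KAN layer is roughly an order of magnitude heavier than the MLP layer it replaces; this multiplicative factor is the overhead that parameter-reduction methods like PRKAN target.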
arXiv Detail & Related papers (2025-01-13T03:07:39Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - KANICE: Kolmogorov-Arnold Networks with Interactive Convolutional Elements [1.663204995903499]
We introduce KANICE, a novel neural architecture that combines Convolutional Neural Networks (CNNs) with Kolmogorov-Arnold Network (KAN) principles.
KANICE integrates Interactive Convolutional Blocks (ICBs) and KAN linear layers into a CNN framework.
We evaluated KANICE on four datasets: MNIST, Fashion-MNIST, EMNIST, and SVHN.
arXiv Detail & Related papers (2024-10-22T16:50:34Z) - Residual Kolmogorov-Arnold Network for Enhanced Deep Learning [0.5852077003870417]
We introduce the Residual Kolmogorov-Arnold Network (RKAN), which incorporates the KAN framework as a residual component.
Our results demonstrate the potential of RKAN to enhance the capabilities of deep CNNs on visual data.
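As a hedged illustration of what "KAN as a residual component" could look like, the sketch below reuses the KANConv2d class from the abstract's example; the wiring is our guess at the idea, and the actual RKAN architecture may differ.
```python
# Hypothetical sketch only: one plausible reading of "KAN as a residual
# component", reusing the KANConv2d sketch shown after the abstract above.
import torch.nn as nn
import torch.nn.functional as F

class ResidualKANBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # A 1x1 KAN convolution serves as the learnable non-linear branch.
        self.kan_branch = KANConv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # The KAN branch is added alongside the block's standard conv path.
        return F.relu(self.conv(x) + self.kan_branch(x))
```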
arXiv Detail & Related papers (2024-10-07T21:12:32Z) - Kolmogorov-Arnold Network Autoencoders [0.0]
Kolmogorov-Arnold Networks (KANs) are promising alternatives to Multi-Layer Perceptrons (MLPs).
KANs align closely with the Kolmogorov-Arnold representation theorem, potentially enhancing both model accuracy and interpretability.
Our results demonstrate that KAN-based autoencoders achieve competitive performance in terms of reconstruction accuracy.
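For reference, the representation theorem invoked here states that every continuous $f : [0,1]^n \to \mathbb{R}$ decomposes into sums and compositions of continuous univariate functions:

$$ f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right). $$

KAN layers relax this exact two-level form into trainable stacks of such univariate functions.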
arXiv Detail & Related papers (2024-10-02T22:56:00Z) - Reimagining Linear Probing: Kolmogorov-Arnold Networks in Transfer Learning [18.69601183838834]
Kolmogorov-Arnold Networks (KANs) are applied as an enhancement to the traditional linear probing method in transfer learning.
KAN consistently outperforms traditional linear probing, achieving significant improvements in accuracy and generalization.
arXiv Detail & Related papers (2024-09-12T05:36:40Z) - Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies [0.0]
This paper explores the application of Kolmogorov-Arnold Networks (KANs) in the domain of computer vision (CV).
We propose a parameter-efficient design for Kolmogorov-Arnold convolutional layers and a parameter-efficient finetuning algorithm for pre-trained KAN models.
We provide empirical evaluations conducted on MNIST, CIFAR10, CIFAR100, Tiny ImageNet, ImageNet1k, and HAM10000 datasets for image classification tasks.
arXiv Detail & Related papers (2024-07-01T08:49:33Z) - TKAN: Temporal Kolmogorov-Arnold Networks [0.0]
Long Short-Term Memory (LSTM) has demonstrated its ability to capture long-term dependencies in sequential data.
Inspired by Kolmogorov-Arnold Networks (KANs), a promising alternative to Multi-Layer Perceptrons (MLPs), we propose a new neural network architecture that combines KANs with the LSTM: the Temporal Kolmogorov-Arnold Networks (TKANs).
arXiv Detail & Related papers (2024-05-12T17:40:48Z) - Mechanism of feature learning in convolutional neural networks [14.612673151889615]
We identify the mechanism by which convolutional neural networks learn from image data.
We present empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based Average Gradient Outer Products (AGOPs).
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
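For context, the AGOP of a predictor $f$ on samples $x_1, \dots, x_n$ is commonly defined as

$$ \mathrm{AGOP}(f) = \frac{1}{n} \sum_{i=1}^{n} \nabla_x f(x_i)\, \nabla_x f(x_i)^{\top}; $$

the patch-based variant referenced above applies this construction per image patch (our paraphrase of the idea).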
arXiv Detail & Related papers (2023-09-01T16:30:02Z) - From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks [68.8204255655161]
We propose a new type of continuous-time control system, called AutoencODE, based on a controlled vector field that drives the dynamics.
We show that many architectures can be recovered in regions where the loss function is locally convex.
arXiv Detail & Related papers (2023-07-05T13:26:17Z) - Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks [78.11734286268455]
We study the performance of ConvResNeXts trained with weight decay, from the perspective of nonparametric classification.
Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks.
arXiv Detail & Related papers (2023-07-04T11:08:03Z) - Combining Neuro-Evolution of Augmenting Topologies with Convolutional Neural Networks [0.0]
We combine NeuroEvolution of Augmenting Topologies (NEAT) with Convolutional Neural Networks (CNNs) and propose such a system using blocks of Residual Networks (ResNets).
We explain how our suggested system can only be built once additional optimizations have been made, as genetic algorithms are far more computationally demanding than training via backpropagation.
arXiv Detail & Related papers (2022-10-20T18:41:57Z) - Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).
This nested system of two flows provides stability and effectiveness of training, and provably solves the gradient vanishing/explosion problem.
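A one-line worked step on why such a flow stays on $O(d)$ (a standard fact we add for clarity; the paper's exact parameterization may differ): if $W(0)$ is orthogonal and $\dot{W}(t) = W(t)A(t)$ with $A(t)$ skew-symmetric, then

$$ \frac{d}{dt}\big(W^{\top} W\big) = A^{\top} W^{\top} W + W^{\top} W A = A^{\top} + A = 0 \quad \text{at } W^{\top} W = I, $$

so $W^{\top}W \equiv I$ is preserved and the evolving parameters remain orthogonal throughout training.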
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)