HKAN: Hierarchical Kolmogorov-Arnold Network without Backpropagation
- URL: http://arxiv.org/abs/2501.18199v1
- Date: Thu, 30 Jan 2025 08:44:54 GMT
- Title: HKAN: Hierarchical Kolmogorov-Arnold Network without Backpropagation
- Authors: Grzegorz Dudek, Tomasz Rodak
- Abstract summary: The Hierarchical Kolmogorov-Arnold Network (HKAN) is a novel network architecture that offers a competitive alternative to the recently proposed Kolmogorov-Arnold Network (KAN).
HKAN adopts a randomized learning approach, where the parameters of its basis functions are fixed, and linear aggregations are optimized using least-squares regression.
Empirical results show that HKAN delivers comparable, if not superior, accuracy and stability relative to KAN across various regression tasks, while also providing insights into variable importance.
- Score: 1.3812010983144802
- License:
- Abstract: This paper introduces the Hierarchical Kolmogorov-Arnold Network (HKAN), a novel network architecture that offers a competitive alternative to the recently proposed Kolmogorov-Arnold Network (KAN). Unlike KAN, which relies on backpropagation, HKAN adopts a randomized learning approach, where the parameters of its basis functions are fixed, and linear aggregations are optimized using least-squares regression. HKAN utilizes a hierarchical multi-stacking framework, with each layer refining the predictions from the previous one by solving a series of linear regression problems. This non-iterative training method simplifies computation and eliminates sensitivity to local minima in the loss function. Empirical results show that HKAN delivers comparable, if not superior, accuracy and stability relative to KAN across various regression tasks, while also providing insights into variable importance. The proposed approach seamlessly integrates theoretical insights with practical applications, presenting a robust and efficient alternative for neural network modeling.
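The training recipe described in the abstract (basis-function parameters drawn at random and kept fixed, linear aggregation weights fitted by least squares, and layers stacked so that each stage refines the previous stage's prediction) can be sketched in a few lines of NumPy. The following is a minimal illustrative sketch, not the authors' implementation: the Gaussian basis functions, the residual-style refinement rule, and the way the running prediction is appended to the inputs are all assumptions made for illustration.

```python
import numpy as np

def make_basis(d, n_basis, rng, lo, hi):
    # Draw fixed random centers and widths for per-feature Gaussian basis
    # functions; these parameters are never updated after initialization.
    centers = rng.uniform(lo, hi, size=(n_basis, d))
    widths = rng.uniform(0.2, 1.0, size=(n_basis, d))
    return centers, widths

def apply_basis(X, centers, widths):
    # Evaluate the univariate Gaussian bumps for each input feature and
    # stack them into one design matrix of shape (n_samples, d * n_basis).
    cols = [np.exp(-((X[:, j:j + 1] - centers[:, j]) / widths[:, j]) ** 2)
            for j in range(X.shape[1])]
    return np.hstack(cols)

def fit_hkan(X, y, n_layers=3, n_basis=10, seed=0):
    # Each layer is trained with a single closed-form least-squares solve
    # on the current residual; there is no backpropagation and no iteration.
    rng = np.random.default_rng(seed)
    pred = np.zeros(len(y))
    layers = []
    for _ in range(n_layers):
        Xa = np.column_stack([X, pred])          # inputs + running prediction (assumption)
        c, w = make_basis(Xa.shape[1], n_basis, rng, Xa.min(0), Xa.max(0))
        Z = apply_basis(Xa, c, w)
        beta, *_ = np.linalg.lstsq(Z, y - pred, rcond=None)
        layers.append((c, w, beta))
        pred = pred + Z @ beta                   # refine the previous prediction
    return layers

def predict_hkan(X, layers):
    pred = np.zeros(len(X))
    for c, w, beta in layers:
        Z = apply_basis(np.column_stack([X, pred]), c, w)
        pred = pred + Z @ beta
    return pred

if __name__ == "__main__":
    # Tiny synthetic regression check of the sketch.
    rng = np.random.default_rng(1)
    X = rng.random((200, 3))
    y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2]
    layers = fit_hkan(X, y)
    print("train MSE:", np.mean((predict_hkan(X, layers) - y) ** 2))
```

Because every stage reduces to one linear least-squares problem, training is non-iterative and insensitive to local minima, which is the property the abstract emphasizes relative to backpropagation-trained KANs.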
Related papers
- Finite Element Neural Network Interpolation. Part I: Interpretable and Adaptive Discretization for Solving PDEs [44.99833362998488]
We present a sparse neural network architecture extending previous work on Embedded Finite Element Neural Networks (EFENN).
Due to their mesh-based structure, EFENNs require significantly fewer trainable parameters than fully connected neural networks.
Our FENNI framework, built within EFENN, brings improvements to the HiDeNN approach.
arXiv Detail & Related papers (2024-12-07T18:31:17Z)
- Reimagining Linear Probing: Kolmogorov-Arnold Networks in Transfer Learning [18.69601183838834]
Kolmogorov-Arnold Networks (KAN) are applied as an enhancement to the traditional linear probing method in transfer learning.
KAN consistently outperforms traditional linear probing, achieving significant improvements in accuracy and generalization.
arXiv Detail & Related papers (2024-09-12T05:36:40Z)
- Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-29T04:15:58Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Orthogonal Stochastic Configuration Networks with Adaptive Construction Parameter for Data Analytics [6.940097162264939]
Randomness makes SCNs more likely to generate approximately linearly correlated hidden nodes that are redundant and of low quality.
In light of a fundamental principle of machine learning, namely that a model with fewer parameters tends to generalize better,
this paper proposes an orthogonal SCN, termed OSCN, which filters out low-quality hidden nodes to reduce the network structure.
arXiv Detail & Related papers (2022-05-26T07:07:26Z)
- Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities [3.093890460224435]
Ordinal regression methods for deep neural networks address ordinal response variables.
The CORAL method achieves rank consistency among its output layer tasks by imposing a weight-sharing constraint.
We propose a new method for rank-consistent ordinal regression without this limitation.
arXiv Detail & Related papers (2021-11-17T01:10:23Z)
- Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for sparse recovery [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications.
We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem.
The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
arXiv Detail & Related papers (2021-10-20T06:15:45Z)
- Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves near-optimal resilience gains on multiple graphs while balancing utility, outperforming existing approaches by a large margin.
arXiv Detail & Related papers (2021-10-18T06:14:28Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to neural network regularization based on the local Rademacher complexity.
A new regularization function for both fully connected networks (FCNs) and convolutional neural networks (CNNs) is developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.