KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example
- URL: http://arxiv.org/abs/2408.02743v1
- Date: Mon, 5 Aug 2024 18:01:07 GMT
- Title: KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example
- Authors: Johannes Erdmann, Florian Mausolf, Jan Lukas Späh
- Abstract summary: Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons.
We study a typical binary event classification task in high-energy physics.
We find that the learned activation functions of a one-layer KAN resemble the log-likelihood ratio of the input features.
- Score: 0.08192907805418582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons, suggesting advantages in performance and interpretability. We study a typical binary event classification task in high-energy physics including high-level features and comment on the performance and interpretability of KANs in this context. We find that the learned activation functions of a one-layer KAN resemble the log-likelihood ratio of the input features. In deeper KANs, the activations in the first KAN layer differ from those in the one-layer KAN, which indicates that the deeper KANs learn more complex representations of the data. We study KANs with different depths and widths and we compare them to multilayer perceptrons in terms of performance and number of trainable parameters. For the chosen classification task, we do not find that KANs are more parameter efficient. However, small KANs may offer advantages in terms of interpretability that come at the cost of only a moderate loss in performance.
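One way to read the log-likelihood-ratio observation: a one-layer KAN with a single output computes a sum of learned univariate functions of the inputs, and for (approximately) independent features the Neyman-Pearson optimal discriminant takes exactly that form, a sum of per-feature log-likelihood ratios. A minimal sketch of this correspondence (the independence assumption is ours, not stated in the abstract):

```latex
% One-layer KAN output: a sum of learned univariate activations
f(\mathbf{x}) \;=\; \sum_{i=1}^{n} \phi_i(x_i)
% Optimal discriminant for (approximately) independent features:
% a sum of per-feature log-likelihood ratios, the same functional form
\log\frac{p(\mathbf{x}\mid S)}{p(\mathbf{x}\mid B)}
  \;=\; \sum_{i=1}^{n} \log\frac{p_i(x_i\mid S)}{p_i(x_i\mid B)}
```

The abstract does not specify the event sample, the feature set, or the KAN implementation used, so the following is only an illustrative sketch: a one-layer KAN-style binary classifier in PyTorch, with each learnable activation parameterized by Gaussian radial basis functions on a fixed grid (a simple stand-in for the B-spline grids of the original KAN), trained on toy Gaussian "signal" and "background" samples.

```python
import torch
import torch.nn as nn


class KANEdge(nn.Module):
    """One learnable univariate activation phi_i(x), parameterized as a
    linear combination of fixed Gaussian radial basis functions on a grid
    (a stand-in for the B-spline grids of the original KAN)."""

    def __init__(self, n_basis: int = 16, x_min: float = -3.0, x_max: float = 3.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(x_min, x_max, n_basis))
        self.width = (x_max - x_min) / n_basis
        self.coeffs = nn.Parameter(torch.zeros(n_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch,) -> phi(x): (batch,)
        basis = torch.exp(-((x[:, None] - self.centers) / self.width) ** 2)
        return basis @ self.coeffs


class OneLayerKANClassifier(nn.Module):
    """Binary classifier whose logit is f(x) = sum_i phi_i(x_i),
    one learned univariate function per input feature."""

    def __init__(self, n_features: int, n_basis: int = 16):
        super().__init__()
        self.edges = nn.ModuleList(KANEdge(n_basis) for _ in range(n_features))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) -> logits: (batch,)
        return sum(edge(x[:, i]) for i, edge in enumerate(self.edges)) + self.bias


if __name__ == "__main__":
    # Toy example: two Gaussian classes standing in for signal and background.
    torch.manual_seed(0)
    n = 4096
    signal = torch.randn(n, 5) + 0.5
    background = torch.randn(n, 5) - 0.5
    x = torch.cat([signal, background])
    y = torch.cat([torch.ones(n), torch.zeros(n)])

    model = OneLayerKANClassifier(n_features=5)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(200):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

After training, each `model.edges[i]` can be evaluated on a one-dimensional grid and compared against the per-feature log-likelihood ratio, which is the kind of interpretability check the paper describes for the one-layer KAN.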
Related papers
- Low Tensor-Rank Adaptation of Kolmogorov--Arnold Networks [70.06682043272377]
Kolmogorov--Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains.
We develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs.
We explore the application of LoTRA for efficiently solving various partial differential equations (PDEs) by fine-tuning KANs.
arXiv Detail & Related papers (2025-02-10T04:57:07Z) - PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks [47.947045173329315]
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures.
KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, Recurrent Neural Networks (RNNs) and Transformers.
This paper introduces PRKANs, which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers.
arXiv Detail & Related papers (2025-01-13T03:07:39Z) - Exploring Kolmogorov-Arnold Networks for Interpretable Time Series Classification [0.17999333451993949]
Kolmogorov-Arnold Networks (KANs) have been proposed as a more interpretable alternative to state-of-the-art models.
In this paper, we aim to conduct a comprehensive and robust exploration of the KAN architecture for time series classification.
Our results show that (1) Efficient KAN performs well in both accuracy and computational efficiency, showcasing its suitability for time series classification tasks.
arXiv Detail & Related papers (2024-11-22T13:01:36Z) - On the Convergence of (Stochastic) Gradient Descent for Kolmogorov--Arnold Networks [56.78271181959529]
Kolmogorov--Arnold Networks (KANs) have gained significant attention in the deep learning community.
Empirical investigations demonstrate that KANs optimized via stochastic gradient descent (SGD) are capable of achieving near-zero training loss.
arXiv Detail & Related papers (2024-10-10T15:34:10Z) - Activation Space Selectable Kolmogorov-Arnold Networks [29.450377034478933]
Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLP-based methods.
Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks.
This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
arXiv Detail & Related papers (2024-08-15T11:34:05Z) - Rethinking the Function of Neurons in KANs [1.223779595809275]
The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem.
In this work, we investigate the potential for identifying an alternative multivariate function for KAN neurons that may offer increased practical utility.
arXiv Detail & Related papers (2024-07-30T09:04:23Z) - SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions [0.0]
We present a model in which learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions (SineKAN).
We show that our model can perform better than or comparable to B-Spline KAN models and an alternative KAN implementation based on periodic cosine and sine functions.
arXiv Detail & Related papers (2024-07-04T20:53:19Z) - U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation [48.40120035775506]
Kolmogorov-Arnold Networks (KANs) reshape neural network learning via stacks of non-linear learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating the dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further delve into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures.
arXiv Detail & Related papers (2024-06-05T04:13:03Z) - WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and trained jointly with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
arXiv Detail & Related papers (2023-01-03T20:57:22Z) - Learning distinct features helps, provably [98.78384185493624]
We study the diversity of the features learned by a two-layer neural network trained with the least squares loss.
We measure the diversity by the average $L_2$-distance between the hidden-layer features.
arXiv Detail & Related papers (2021-06-10T19:14:45Z)