Related papers: KANs for Computer Vision: An Experimental Study

URL: http://arxiv.org/abs/2411.18224v2
Date: Thu, 28 Nov 2024 10:00:33 GMT
Title: KANs for Computer Vision: An Experimental Study
Authors: Karthik Mohan, Hanxiao Wang, Xiatian Zhu
Abstract summary: This paper presents an experimental study of Kolmogorov-Arnold Networks (KANs) applied to computer vision tasks. KANs introduce learnable activation functions on edges, offering flexible non-linear transformations. We reveal that although KANs can perform well in specific vision tasks, they face significant challenges.
Score: 41.93938569894321
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents an experimental study of Kolmogorov-Arnold Networks (KANs) applied to computer vision tasks, particularly image classification. KANs introduce learnable activation functions on edges, offering more flexible non-linear transformations than the fixed activation functions of traditional neural networks such as Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). While KANs have shown promise mostly on simplified or small-scale datasets, their effectiveness for more complex real-world tasks such as computer vision remains less explored. To fill this gap, this experimental study aims to provide extended observations and insights into the strengths and limitations of KANs. We reveal that although KANs can perform well in specific vision tasks, they face significant challenges, including increased hyperparameter sensitivity and higher computational costs. These limitations suggest that KANs require architectural adaptations, such as integration with other architectures, to be practical for large-scale vision problems. This study focuses on empirical findings rather than proposing new methods, aiming to inform future research on optimizing KANs, particularly for computer vision and similar applications.
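As a rough illustration of what "learnable activation functions on edges" means, here is a minimal pure-Python sketch of a single KAN-style layer. It is not the paper's implementation: the class name `KANLayer`, the helper `rbf_basis`, the grid on [-1, 1], and the use of Gaussian bumps in place of the B-spline basis usually used in KANs are all assumptions made for brevity. Each edge from input i to output j carries its own function, a weighted sum of basis functions, and each output sums its incoming edge functions.

```python
import math
import random

def rbf_basis(x, centers, width):
    """Evaluate Gaussian bumps at x (an assumed stand-in for a B-spline basis)."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]

class KANLayer:
    """Hypothetical KAN-style layer: one learnable 1-D function per edge."""

    def __init__(self, n_in, n_out, n_basis=5, seed=0):
        rng = random.Random(seed)
        # Basis centers spread evenly over the assumed input range [-1, 1].
        self.centers = [-1.0 + 2.0 * k / (n_basis - 1) for k in range(n_basis)]
        self.width = 2.0 / (n_basis - 1)
        # coef[j][i][k]: learnable weight of basis k on the edge i -> j.
        self.coef = [
            [[rng.gauss(0.0, 0.1) for _ in range(n_basis)] for _ in range(n_in)]
            for _ in range(n_out)
        ]

    def edge(self, j, i, x):
        """Edge function phi_{j,i}(x) = sum_k coef[j][i][k] * basis_k(x)."""
        basis = rbf_basis(x, self.centers, self.width)
        return sum(c * b for c, b in zip(self.coef[j][i], basis))

    def forward(self, xs):
        """Output j is the sum of its incoming edge functions; no fixed nonlinearity."""
        return [
            sum(self.edge(j, i, x) for i, x in enumerate(xs))
            for j in range(len(self.coef))
        ]

    def num_params(self):
        """Number of learnable coefficients: n_out * n_in * n_basis."""
        return sum(len(e) for row in self.coef for e in row)

layer = KANLayer(n_in=3, n_out=2)
out = layer.forward([0.1, -0.4, 0.7])
print(out, layer.num_params())
```

In this sketch a 3-to-2 layer holds 2 * 3 * 5 = 30 coefficients, whereas a plain linear layer of the same shape holds 6 weights plus 2 biases. That per-edge multiplier is one plausible source of the higher computational cost the abstract mentions, though the paper itself should be consulted for its actual measurements.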

Related papers

Exploring Superposition and Interference in State-of-the-Art Low-Parameter Vision Models [0.0]
We address interference in feature maps, a phenomenon associated with superposition, where neurons simultaneously encode multiple characteristics. Our research suggests that limiting interference can enhance scaling and accuracy in very low-scaled networks (under 1.5M parameters). We propose a proof-of-concept architecture named NoDepth Bottleneck built on mechanistic insights from our experiments, demonstrating robust scaling accuracy on the ImageNet dataset.
arXiv Detail & Related papers (2025-07-21T16:57:25Z)
LSNet: See Large, Focus Small [67.05569159984691]
We introduce LS (Large-Small) convolution, which combines large-kernel perception and small-kernel aggregation. LSNet achieves superior performance and efficiency over existing lightweight networks in various vision tasks.
arXiv Detail & Related papers (2025-03-29T16:00:54Z)
Can KAN Work? Exploring the Potential of Kolmogorov-Arnold Networks in Computer Vision [6.554163686640315]
This study first analyzes the potential of KAN in computer vision tasks, evaluating the performance of KAN and its convolutional variants in image classification and semantic segmentation. Results indicate that while KAN exhibits stronger fitting capabilities, it is highly sensitive to noise, limiting its robustness. To address this challenge, we propose a regularization method and introduce a Segment Deactivation technique.
arXiv Detail & Related papers (2024-11-11T05:44:48Z)
A Survey on Kolmogorov-Arnold Network [0.0]
This review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KANs). KANs distinguish themselves from traditional neural networks by using learnable, spline-parameterized functions instead of fixed activation functions. This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
arXiv Detail & Related papers (2024-11-09T05:54:17Z)
Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network. Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning. Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
Suitability of KANs for Computer Vision: A preliminary investigation [28.030708956348864]
Kolmogorov-Arnold Networks (KANs) introduce a paradigm of neural modeling that implements learnable functions on the edges of the networks. This work assesses the applicability and efficacy of KANs in visual modeling, focusing on fundamental recognition and segmentation tasks.
arXiv Detail & Related papers (2024-06-13T13:13:17Z)
Efficient Multi-domain Text Recognition Deep Neural Network Parameterization with Residual Adapters [4.454976752204893]
This study presents a novel neural network model adept at optical character recognition (OCR) across diverse domains. The model is designed to achieve rapid adaptation to new domains, maintain a compact size conducive to reduced computational resource demand, ensure high accuracy, retain knowledge from previous learning experiences, and allow for domain-specific performance improvements without the need to retrain entirely.
arXiv Detail & Related papers (2024-01-01T23:01:40Z)
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity. Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
ASU-CNN: An Efficient Deep Architecture for Image Classification and Feature Visualizations [0.0]
Activation functions play a decisive role in determining the capacity of Deep Neural Networks. In this paper, a Convolutional Neural Network model named ASU-CNN is proposed. The network achieved promising results on both training and testing data for the classification of CIFAR-10.
arXiv Detail & Related papers (2023-05-28T16:52:25Z)
Video Coding for Machine: Compact Visual Representation Compression for Intelligent Collaborative Analytics [101.35754364753409]
Video Coding for Machines (VCM) is committed to bridging, to an extent, the separate research tracks of video/image compression and feature compression. This paper summarizes VCM methodology and philosophy based on existing academic and industrial efforts.
arXiv Detail & Related papers (2021-10-18T12:42:13Z)
