"KAN you hear me?" Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding
- URL: http://arxiv.org/abs/2505.20176v1
- Date: Mon, 26 May 2025 16:16:44 GMT
- Title: "KAN you hear me?" Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding
- Authors: Alkis Koudounas, Moreno La Quatra, Eliana Pastor, Sabato Marco Siniscalchi, Elena Baralis
- Abstract summary: Kolmogorov-Arnold Networks (KANs) have emerged as a promising alternative to traditional neural architectures. This work presents the first investigation of KANs for Spoken Language Understanding (SLU) tasks.
- Score: 22.115368954133864
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Kolmogorov-Arnold Networks (KANs) have recently emerged as a promising alternative to traditional neural architectures, yet their application to speech processing remains underexplored. This work presents the first investigation of KANs for Spoken Language Understanding (SLU) tasks. We experiment with 2D-CNN models on two datasets, integrating KAN layers in five different configurations within the dense block. The best-performing setup, which places a KAN layer between two linear layers, is directly applied to transformer-based models and evaluated on five SLU datasets with increasing complexity. Our results show that KAN layers can effectively replace the linear layers, achieving comparable or superior performance in most cases. Finally, we provide insights into how KAN and linear layers on top of transformers differently attend to input regions of the raw waveforms.
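The abstract describes the best-performing configuration (a KAN layer placed between two linear layers in the dense/classification block) but the listing carries no code. The sketch below is a minimal, hedged PyTorch illustration of such a head, not the authors' implementation: `SimpleKANLayer` uses a Gaussian radial-basis expansion as a simplified stand-in for the spline-parameterised KAN layers described in the KAN literature, and the dimensions (`hidden_size=768`, `bottleneck=256`, 10 classes) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleKANLayer(nn.Module):
    """Simplified KAN-style layer: each input feature is expanded onto a small
    Gaussian radial-basis grid, and per-edge activations are learned as linear
    weights over those basis responses (a stand-in for the B-spline
    parameterisation used in the original KAN formulation)."""

    def __init__(self, in_features: int, out_features: int, num_basis: int = 8):
        super().__init__()
        # Fixed basis centres on [-1, 1]; the width scales with grid density.
        self.register_buffer("centers", torch.linspace(-1.0, 1.0, num_basis))
        self.inv_width = num_basis / 2.0
        # Learnable coefficients: one per (output, input, basis) edge.
        self.coeffs = nn.Parameter(
            torch.randn(out_features, in_features, num_basis) * 0.1
        )
        # Residual linear path, as in common KAN implementations.
        self.base = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> basis responses: (batch, in_features, num_basis)
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) * self.inv_width) ** 2)
        # Combine learned per-edge activations over inputs and basis functions.
        spline_out = torch.einsum("bik,oik->bo", phi, self.coeffs)
        return self.base(F.silu(x)) + spline_out


class KANClassificationHead(nn.Module):
    """Linear -> KAN -> Linear head, mirroring the best-performing
    configuration reported in the abstract, applied to pooled encoder
    features (dimensions here are illustrative assumptions)."""

    def __init__(self, hidden_size: int, num_classes: int, bottleneck: int = 256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(hidden_size, bottleneck),
            SimpleKANLayer(bottleneck, bottleneck),
            nn.Linear(bottleneck, num_classes),
        )

    def forward(self, pooled_features: torch.Tensor) -> torch.Tensor:
        return self.head(pooled_features)


if __name__ == "__main__":
    # Example: pooled utterance embeddings from a speech transformer encoder
    # (e.g. 768-dim), classified into 10 hypothetical intent labels.
    head = KANClassificationHead(hidden_size=768, num_classes=10)
    logits = head(torch.randn(4, 768))
    print(logits.shape)  # torch.Size([4, 10])
```

In this arrangement the KAN layer sits between the two linear projections of the head, the placement the abstract reports as comparable or superior to an all-linear dense block.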
Related papers
- Input Conditioned Layer Dropping in Speech Foundation Models [11.05223262950967]
layer dropping ($\mathcal{LD}$) skips a fraction of the layers of a backbone network during inference to reduce the computational load. We propose input-driven $\mathcal{LD}$ that employs the network's input features and a lightweight layer-selecting network to determine the optimal combination of processing layers.
arXiv Detail & Related papers (2025-07-10T17:39:03Z) - Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation [8.891804836416275]
We propose a novel semantic segmentation network, namely DeepKANSeg. First, we introduce a KAN-based deep feature refinement module, namely DeepKAN. Second, we replace the traditional multi-layer perceptron (MLP) layers in the global-local combined decoder with KAN-based linear layers, namely GLKAN.
arXiv Detail & Related papers (2025-01-13T15:06:51Z) - PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks [47.947045173329315]
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures. KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, Recurrent Neural Networks (RNNs) and Transformers. This paper introduces PRKANs, which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers.
arXiv Detail & Related papers (2025-01-13T03:07:39Z) - KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example [0.08192907805418582]
Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons. We study a typical binary event classification task in high-energy physics and comment on the performance and interpretability of KANs.
arXiv Detail & Related papers (2024-08-05T18:01:07Z) - WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and jointly trained with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
arXiv Detail & Related papers (2023-01-03T20:57:22Z) - Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z) - WAS-VTON: Warping Architecture Search for Virtual Try-on Network [57.52118202523266]
We introduce a NAS-Warping Module and elaborately design a bilevel hierarchical search space.
We learn a combination of repeatable warping cells and convolution operations specifically for the clothing-person alignment.
A NAS-Fusion Module is proposed to synthesize more natural final try-on results.
arXiv Detail & Related papers (2021-08-01T07:52:56Z) - Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z) - Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched-prior-based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)