Related papers: Interpretability of the Intent Detection Problem: A New Approach

URL: http://arxiv.org/abs/2601.17156v1
Date: Fri, 23 Jan 2026 20:27:47 GMT
Title: Interpretability of the Intent Detection Problem: A New Approach
Authors: Eduardo Sanchez-Karhunen, Jose F. Quesada-Moreno, Miguel A. Gutiérrez-Naranjo
Abstract summary: Internal mechanisms enabling Recurrent Neural Networks to solve intent detection tasks are poorly understood. We apply dynamical systems theory to analyze how RNN architectures address this problem. Our framework decouples geometric separation from readout alignment, providing a novel, mechanistic explanation for real-world performance disparities.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Intent detection, a fundamental text classification task, aims to identify and label the semantics of user queries, playing a vital role in numerous business applications. Despite the dominance of deep learning techniques in this field, the internal mechanisms enabling Recurrent Neural Networks (RNNs) to solve intent detection tasks are poorly understood. In this work, we apply dynamical systems theory to analyze how RNN architectures address this problem, using both the balanced SNIPS and the imbalanced ATIS datasets. By interpreting sentences as trajectories in the hidden state space, we first show that on the balanced SNIPS dataset, the network learns an ideal solution: the state space, constrained to a low-dimensional manifold, is partitioned into distinct clusters corresponding to each intent. The application of this framework to the imbalanced ATIS dataset then reveals how this ideal geometric solution is distorted by class imbalance, causing the clusters for low-frequency intents to degrade. Our framework decouples geometric separation from readout alignment, providing a novel, mechanistic explanation for real-world performance disparities. These findings provide new insights into RNN dynamics, offering a geometric interpretation of how dataset properties directly shape a network's computational solution.
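
Illustrative sketch (not from the paper): the abstract distinguishes geometric separation of intent clusters from their alignment with the readout layer. One way to picture the two quantities is to score them independently on final hidden states. Everything below is an assumption made for illustration: the function names, the centroid-distance ratio used as a separation score, and the cosine similarity used as an alignment score are hypothetical choices, and the data is synthetic rather than SNIPS or ATIS.

```python
import numpy as np


def class_centroids(hidden, labels, n_classes):
    """Mean final hidden state per intent class, shape (n_classes, dim)."""
    return np.stack([hidden[labels == c].mean(axis=0) for c in range(n_classes)])


def separation_score(hidden, labels, n_classes):
    """Hypothetical separation score: mean distance between class centroids
    divided by the mean distance of points to their own centroid."""
    cents = class_centroids(hidden, labels, n_classes)
    within = np.mean([
        np.linalg.norm(hidden[labels == c] - cents[c], axis=1).mean()
        for c in range(n_classes)
    ])
    pairwise = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=-1)
    between = pairwise[np.triu_indices(n_classes, k=1)].mean()
    return between / within


def readout_alignment(hidden, labels, readout, n_classes):
    """Hypothetical alignment score: cosine similarity between each centered
    class centroid and the readout-matrix row for the same class."""
    cents = class_centroids(hidden, labels, n_classes) - hidden.mean(axis=0)
    num = np.sum(cents * readout, axis=1)
    den = np.linalg.norm(cents, axis=1) * np.linalg.norm(readout, axis=1)
    return num / den


# Synthetic stand-in for final hidden states: 3 tight clusters in 8 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
centers = 5.0 * np.eye(3, 8)
hidden = centers[labels] + 0.1 * rng.standard_normal((150, 8))

# A readout whose rows point at the matching clusters, and one whose rows
# are shifted so that each row points at a different cluster.
aligned_readout = centers - centers.mean(axis=0)
shifted_readout = np.roll(aligned_readout, 1, axis=0)

print("separation:", separation_score(hidden, labels, 3))
print("alignment (aligned readout):", readout_alignment(hidden, labels, aligned_readout, 3))
print("alignment (shifted readout):", readout_alignment(hidden, labels, shifted_readout, 3))
```

In this toy setup the separation score is the same for both readouts, because it depends only on the hidden states, while the alignment score changes with the readout. That is the sense in which the two properties can be examined separately; the paper's actual measurements may differ.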

Related papers

Debugging and Runtime Analysis of Neural Networks with VLMs (A Case Study) [20.420310876464924]
We show the utility of semantic heatmaps for fault localization in vision models. We propose a lightweight runtime analysis to detect and filter out defects at runtime.
arXiv Detail & Related papers (2025-03-21T01:12:57Z)
Set-Valued Sensitivity Analysis of Deep Neural Networks [7.249038561506896]
This paper proposes a sensitivity analysis framework based on set-valued mapping for deep neural networks (DNNs). By developing set-level metrics such as distance between sets, convergence of sets, derivatives of set-valued mapping, and stability across the solution set, we prove that the solution set of the Fully Connected Neural Network holds Lipschitz-like properties.
arXiv Detail & Related papers (2024-12-15T05:22:38Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Steinmetz Neural Networks for Complex-Valued Data [23.80312814400945]
We introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. Our proposed class of architectures, referred to as Steinmetz Neural Networks, incorporates multi-view learning to construct more interpretable representations in the latent space. Our numerical experiments depict the improved performance and robustness to additive noise afforded by our proposed networks on benchmark datasets and synthetic examples.
arXiv Detail & Related papers (2024-09-16T08:26:06Z)
Interpretation of the Intent Detection Problem as Dynamics in a Low-dimensional Space [0.0]
In this work, we investigate how different RNN architectures solve the SNIPS intent detection problem. To generate predictions, the RNN steers the trajectories towards concrete regions that are spatially aligned with the row directions of the output layer matrix. Our results provide new insights into the inner workings of networks that solve the intent detection task.
arXiv Detail & Related papers (2024-08-05T21:22:36Z)
Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications. In this paper we propose an uncertainty quantification approach by modelling the distribution of features. We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem. We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
Optimal Transport Based Refinement of Physics-Informed Neural Networks [0.0]
We propose a refinement strategy to the well-known Physics-Informed Neural Networks (PINNs) for solving partial differential equations (PDEs) based on the concept of Optimal Transport (OT). PINNs solvers have been found to suffer from a host of issues: spectral bias in fully-connected pathologies, unstable gradient, and difficulties with convergence and accuracy. We present a novel training strategy for solving the Fokker-Planck-Kolmogorov Equation (FPKE) using OT-based sampling to supplement the existing PINNs framework.
arXiv Detail & Related papers (2021-05-26T02:51:20Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty [91.0564497403256]
We present a novel framework that involves probabilistic fusion between the two families of predictions during network training. Our network features a self-attention graph neural network, which drives the learning by enforcing strong interactions between different correspondences. We propose motion parameterizations suitable for learning and show that our method achieves state-of-the-art performance on the challenging DeMoN and ScanNet datasets.
arXiv Detail & Related papers (2021-04-16T17:59:06Z)
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair. A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs). In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.