Self-Attention Networks for Intent Detection
- URL: http://arxiv.org/abs/2006.15585v1
- Date: Sun, 28 Jun 2020 12:19:15 GMT
- Title: Self-Attention Networks for Intent Detection
- Authors: Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth
- Abstract summary: We present a novel intent detection system based on a self-attention network and a Bi-LSTM.
Our approach shows improvement by using a transformer model and deep averaging network-based universal sentence encoder.
We evaluate the system on the Snips, Smart Speaker, Smart Lights, and ATIS datasets using several evaluation metrics.
- Score: 0.9023847175654603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-attention networks (SAN) have shown promising performance in various
Natural Language Processing (NLP) scenarios, especially in machine translation.
A key strength of SANs is their ability to capture long-range and
multi-scale dependencies in the data. In this paper, we present a novel
intent detection system which is based on a self-attention network and a
Bi-LSTM. Our approach shows improvement by using a transformer model and deep
averaging network-based universal sentence encoder compared to previous
solutions. We evaluate the system on Snips, Smart Speaker, Smart Lights, and
ATIS datasets using several evaluation metrics. The performance of the proposed
model is compared with that of an LSTM on the same datasets.
Related papers
- SER Evals: In-domain and Out-of-domain Benchmarking for Speech Emotion Recognition [3.4355593397388597]
Speech emotion recognition (SER) has made significant strides with the advent of powerful self-supervised learning (SSL) models.
We propose a large-scale benchmark to evaluate the robustness and adaptability of state-of-the-art SER models.
We find that the Whisper model, primarily designed for automatic speech recognition, outperforms dedicated SSL models in cross-lingual SER.
arXiv Detail & Related papers (2024-08-14T23:33:10Z)
- Feature Aggregation in Joint Sound Classification and Localization Neural Networks [0.0]
Current state-of-the-art sound source localization deep learning networks lack feature aggregation within their architecture.
We adapt feature aggregation techniques from computer vision neural networks to signal detection neural networks.
arXiv Detail & Related papers (2023-10-29T16:37:14Z)
- Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide the feature group aware of local semantics of target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices.
We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions.
Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z)
- Streaming Multi-Talker ASR with Token-Level Serialized Output Training [53.11450530896623]
t-SOT is a novel framework for streaming multi-talker automatic speech recognition.
The t-SOT model has the advantages of less inference cost and a simpler model architecture.
For non-overlapping speech, the t-SOT model is on par with a single-talker ASR model in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2022-02-02T01:27:21Z)
- An Explainable Machine Learning-based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks [0.0]
Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the security posture of an organisation.
Many systems have been designed and developed in the research community, often achieving a perfect detection rate when evaluated using certain datasets.
This paper tightens the gap by evaluating the generalisability of a common feature set to different network environments and attack types.
arXiv Detail & Related papers (2021-04-15T00:44:45Z)
- AMVNet: Assertion-based Multi-View Fusion Network for LiDAR Semantic Segmentation [8.883837682023493]
We present an Assertion-based Multi-View Fusion network (AMVNet) for LiDAR semantic segmentation.
We perform assertion-guided point sampling on score disagreements and pass a set of point-level features for each sampled point to a simple point head which refines the predictions.
Our approach outperforms the baseline method of combining the class scores of the projection-based networks.
arXiv Detail & Related papers (2020-12-09T09:34:25Z)