A Scope Sensitive and Result Attentive Model for Multi-Intent Spoken
Language Understanding
- URL: http://arxiv.org/abs/2211.12220v1
- Date: Tue, 22 Nov 2022 12:24:22 GMT
- Title: A Scope Sensitive and Result Attentive Model for Multi-Intent Spoken
Language Understanding
- Authors: Lizhi Cheng, Wenmian Yang, Weijia Jia
- Abstract summary: Multi-Intent Spoken Language Understanding (SLU) is attracting increasing attention.
Unlike traditional SLU, each intent in this scenario has its specific scope. Semantic information outside the scope even hinders the prediction.
We propose a novel Scope-Sensitive Result Attention Network (SSRAN) based on Transformer, which contains a Scope Recognizer (SR) and a Result Attention Network (RAN).
- Score: 18.988599232838766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Intent Spoken Language Understanding (SLU), a novel and more complex
scenario of SLU, is attracting increasing attention. Unlike traditional SLU,
each intent in this scenario has its specific scope. Semantic information
outside the scope even hinders the prediction, which tremendously increases the
difficulty of intent detection. More seriously, guiding slot filling with these
inaccurate intent labels suffers from error propagation, resulting in
unsatisfactory overall performance. To solve these challenges, in this paper, we
propose a novel Scope-Sensitive Result Attention Network (SSRAN) based on
Transformer, which contains a Scope Recognizer (SR) and a Result Attention
Network (RAN). Scope Recognizer assigns scope information to each token,
reducing the distraction of out-of-scope tokens. Result Attention Network
effectively utilizes the bidirectional interaction between results of slot
filling and intent detection, mitigating the error propagation problem.
Experiments on two public datasets indicate that our model significantly
improves SLU performance (5.4% and 2.1% on Overall accuracy) over the
state-of-the-art baseline.
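As a rough illustration of the scope idea described in the abstract (not the authors' SSRAN implementation), the sketch below assumes each token already carries a scope id and mean-pools token features per scope, so that out-of-scope tokens do not contribute to an intent's representation. The helper name `pool_by_scope`, the toy feature vectors, and the scope assignment are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def pool_by_scope(
    token_vectors: Sequence[Sequence[float]],
    scope_ids: Sequence[int],
) -> Dict[int, List[float]]:
    """Mean-pool token vectors separately for each scope id.

    Hypothetical helper: tokens outside a scope are excluded from that
    scope's average, so they cannot influence its pooled features.
    """
    if len(token_vectors) != len(scope_ids):
        raise ValueError("one scope id is required per token")

    # Group token vectors by the scope they were assigned to.
    grouped: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for vec, scope in zip(token_vectors, scope_ids):
        grouped[scope].append(vec)

    # Average each group dimension by dimension.
    pooled: Dict[int, List[float]] = {}
    for scope, vecs in grouped.items():
        dim = len(vecs[0])
        pooled[scope] = [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
    return pooled


# Toy utterance with two intents: "play jazz" and "set alarm".
tokens = ["play", "jazz", "and", "set", "alarm"]
vectors = [[1.0, 0.0], [3.0, 0.0], [0.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
scopes = [0, 0, 1, 1, 1]  # assumed scope assignment, one id per token

scope_features = pool_by_scope(vectors, scopes)
```

In a trained model the scope ids would be predicted rather than given, and the pooled features would feed an intent classifier; both steps are omitted here.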
Related papers
- Intent Detection in the Age of LLMs [3.755082744150185]
Intent detection is a critical component of task-oriented dialogue systems (TODS).
Traditional approaches relied on computationally efficient supervised sentence transformer encoder models.
The emergence of generative large language models (LLMs) with intrinsic world knowledge presents new opportunities to address these challenges.
arXiv Detail & Related papers (2024-10-02T15:01:55Z)
- SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images [0.0]
This paper introduces a scale-robust complementary learning network (SCLNet) to address the scale challenges.
One implementation is based on our proposed scale-complementary decoder and scale-complementary loss function.
Another implementation is based on our proposed contrastive complement network and contrastive complement loss function.
arXiv Detail & Related papers (2024-09-11T05:39:25Z)
- CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding [40.75828713474074]
We present Cross-task Interactive Prompting (CroPrompt) for spoken language understanding (SLU).
CroPrompt enables the model to interactively leverage the information exchange across the correlated tasks in SLU.
We also introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection.
arXiv Detail & Related papers (2024-06-15T04:54:56Z)
- SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image [21.596874679058327]
SwiMDiff is a novel self-supervised pre-training framework for remote sensing images.
It recalibrates labels to recognize data from the same scene as false negatives.
It seamlessly integrates contrastive learning (CL) with a diffusion model.
arXiv Detail & Related papers (2024-01-10T11:55:58Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets (Adult, and Communities and Crime), and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, F-measure maximization results in improved micro-F1 scores, with absolute improvements of up to 8%, as compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling [69.59096090788125]
In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling.
We introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents.
Such an interaction layer is applied to each token adaptively, which has the advantage of automatically extracting the relevant intent information.
arXiv Detail & Related papers (2020-04-21T15:07:34Z)
- Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.