A Generative Model for Joint Multiple Intent Detection and Slot Filling
- URL: http://arxiv.org/abs/2602.08322v2
- Date: Thu, 12 Feb 2026 05:47:30 GMT
- Title: A Generative Model for Joint Multiple Intent Detection and Slot Filling
- Authors: Liz Li, Wei Zhu
- Abstract summary: In task-oriented dialogue systems, spoken language understanding (SLU) is a critical component, which consists of two sub-tasks, intent detection and slot filling. Most existing methods focus on the single-intent SLU, where each utterance only has one intent. In this paper, we propose a generative framework to simultaneously address multiple intent detection and slot filling.
- Score: 3.060720241524644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In task-oriented dialogue systems, spoken language understanding (SLU) is a critical component, which consists of two sub-tasks: intent detection and slot filling. Most existing methods focus on single-intent SLU, where each utterance has only one intent. However, in real-world scenarios, users often express multiple intents in a single utterance, which poses a challenge for existing dialogue systems and datasets. In this paper, we propose a generative framework to simultaneously address multiple intent detection and slot filling. In particular, an attention-over-attention decoder is proposed to handle the variable number of intents and the interference between the two sub-tasks by incorporating an inductive bias into the multi-task learning process. In addition, we construct two new multi-intent SLU datasets from single-intent utterances by taking advantage of the next sentence prediction (NSP) head of the BERT model. Experimental results demonstrate that our proposed attention-over-attention generative model achieves state-of-the-art performance on two public datasets, MixATIS and MixSNIPS, as well as on our constructed datasets.
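The NSP-based dataset construction described in the abstract can be sketched roughly as follows. The pairing heuristic and the `nsp_score` placeholder below are illustrative assumptions; in practice the coherence score would come from BERT's next-sentence-prediction head (e.g. `BertForNextSentencePrediction` in the `transformers` library), not from word overlap.

```python
from itertools import permutations

def nsp_score(first: str, second: str) -> float:
    """Placeholder coherence score in [0, 1]. A real implementation would
    return BERT's NSP probability that `second` follows `first`."""
    # Toy heuristic (assumption): pairs sharing a content word look more coherent.
    shared = set(first.lower().split()) & set(second.lower().split())
    return min(1.0, 0.3 + 0.2 * len(shared))

def build_multi_intent_dataset(singles, threshold=0.5):
    """Combine single-intent (utterance, intent) examples into multi-intent ones,
    keeping only concatenations the NSP-style scorer judges coherent."""
    multi = []
    for (u1, i1), (u2, i2) in permutations(singles, 2):
        if i1 != i2 and nsp_score(u1, u2) >= threshold:
            multi.append((f"{u1} and {u2}", sorted({i1, i2})))
    return multi

singles = [
    ("book a flight to boston", "BookFlight"),
    ("what is the weather in boston", "GetWeather"),
    ("play some jazz music", "PlayMusic"),
]
dataset = build_multi_intent_dataset(singles)
```

With the toy scorer, only the two utterances sharing a token ("boston") survive the threshold, yielding two multi-intent examples with the intent set {BookFlight, GetWeather}.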
Related papers
- Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline [56.790045049514326]
Two major forms of deception dominate: human-crafted misinformation and AI-generated content. We propose Unified Multimodal Fake Content Detection (UMFDet), a framework designed to handle both forms of deception. UMFDet achieves robust and consistent performance across both misinformation types, outperforming specialized baselines.
arXiv Detail & Related papers (2025-09-30T09:26:32Z)
- A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents [12.62162175115002]
This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual intent dataset.
We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets.
We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets.
arXiv Detail & Related papers (2024-10-29T19:10:12Z)
- Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level MMCL framework to apply contrastive learning at three levels, including utterance level, slot level, and word level.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z)
- Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
The multimodal entity linking (MEL) task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z)
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z)
- Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue [8.07809100513473]
This work investigates unsupervised approaches to overcome challenges in designing task-oriented dialog schema.
We postulate there are two salient factors for automatic induction of intents: (1) clustering algorithm for intent labeling and (2) user utterance embedding space.
Pretrained MiniLM with Agglomerative clustering shows significant improvement in NMI, ARI, F1, accuracy and example coverage in intent induction tasks.
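The recipe above (pretrained MiniLM embeddings plus agglomerative clustering) can be sketched as below. The 2-D stand-in vectors are an assumption made so the example runs without downloading a model; a real pipeline would obtain `embeddings` from MiniLM (e.g. via `sentence-transformers`).

```python
from sklearn.cluster import AgglomerativeClustering

utterances = [
    "book a flight",
    "reserve a plane ticket",
    "play some jazz",
    "put on music",
]
# Stand-in embeddings (assumption): two "booking" points near the origin,
# two "music" points near (5, 5), so the clustering step is self-contained.
embeddings = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]

# Group utterances into induced intents via agglomerative clustering.
clusterer = AgglomerativeClustering(n_clusters=2)
labels = clusterer.fit_predict(embeddings)

induced = {}
for utt, label in zip(utterances, labels):
    induced.setdefault(int(label), []).append(utt)
```

Each induced cluster then stands in for one latent intent; the cited work evaluates such clusterings with NMI, ARI, F1, accuracy, and example coverage against gold intent labels.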
arXiv Detail & Related papers (2022-12-05T04:37:22Z)
- DialogUSR: Complex Dialogue Utterance Splitting and Reformulation for Multiple Intent Detection [27.787807111516706]
Instead of training a dedicated multi-intent detection model, we propose DialogUSR.
DialogUSR splits a multi-intent user query into several single-intent sub-queries.
It then recovers all the coreferred and omitted information in the sub-queries.
arXiv Detail & Related papers (2022-10-20T13:56:35Z)
- A Unified Framework for Multi-intent Spoken Language Understanding with prompting [14.17726194025463]
We describe a Prompt-based Spoken Language Understanding (PromptSLU) framework, to intuitively unify two sub-tasks into the same form.
In detail, intent detection (ID) and slot filling (SF) are completed by filling the utterance into task-specific prompt templates as input, with both sub-tasks sharing an output format of key-value pair sequences.
Experiment results show that our framework outperforms several state-of-the-art baselines on two public datasets.
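As a rough illustration of the prompt-based unification described above: the exact PromptSLU templates are not reproduced here, so the prompt wording and the `;`/`=` separators below are assumptions, but they show how ID and SF can share one key-value output format.

```python
# Illustrative prompt templates (assumptions, not the paper's exact wording).
ID_TEMPLATE = "what are the intents of: {utterance}"
SF_TEMPLATE = "what are the slots of: {utterance}"

def make_input(task: str, utterance: str) -> str:
    """Fill the utterance into a task-specific prompt template."""
    template = ID_TEMPLATE if task == "ID" else SF_TEMPLATE
    return template.format(utterance=utterance)

def format_output(pairs) -> str:
    """Both sub-tasks share one output format: a sequence of key-value pairs."""
    return " ; ".join(f"{key} = {value}" for key, value in pairs)

utt = "book a flight to boston and play jazz"
id_in = make_input("ID", utt)
id_out = format_output([("intent", "BookFlight"), ("intent", "PlayMusic")])
sf_out = format_output([("toloc.city_name", "boston"), ("music_genre", "jazz")])
```

Because both sub-tasks emit the same key-value sequence format, a single sequence-to-sequence model can serve ID and SF with no task-specific output head.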
arXiv Detail & Related papers (2022-10-07T05:58:05Z)
- SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling [26.037061005620263]
Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems.
We propose a multi-intent NLU framework, called SLIM, to jointly learn multi-intent detection and slot filling based on BERT.
arXiv Detail & Related papers (2021-08-26T11:33:39Z)
- A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [61.109486326954205]
Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system.
Previous studies either model the two tasks separately or only consider the single information flow from intent to slot.
We propose a Co-Interactive Transformer to consider the cross-impact between the two tasks simultaneously.
arXiv Detail & Related papers (2020-10-08T10:16:52Z)
- AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling [69.59096090788125]
In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling.
We introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents.
Such an interaction layer is applied to each token adaptively, which makes it possible to automatically extract the relevant intent information.
arXiv Detail & Related papers (2020-04-21T15:07:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.