Multi-Intent Spoken Language Understanding: Methods, Trends, and Challenges
- URL: http://arxiv.org/abs/2512.11258v1
- Date: Fri, 12 Dec 2025 03:46:39 GMT
- Title: Multi-Intent Spoken Language Understanding: Methods, Trends, and Challenges
- Authors: Di Wu, Ruiyu Fang, Liting Jiang, Shuangyong Song, Xiaomeng Huang, Shiquan Wang, Zhongqiu Li, Lingling Shi, Mengjiao Bao, Yongxiang Li, Hao Huang
- Abstract summary: Multi-intent spoken language understanding involves two tasks: multiple intent detection and slot filling. There remains a lack of a comprehensive and systematic review of existing studies on multi-intent SLU.
- Score: 21.520532115690504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-intent spoken language understanding (SLU) involves two tasks, multiple intent detection and slot filling, which jointly handle utterances containing more than one intent. Because this setting closely reflects real-world applications, the task has attracted increasing research attention and substantial progress has been achieved. However, a comprehensive and systematic review of existing studies on multi-intent SLU is still lacking. To this end, this paper presents a survey of recent advances in multi-intent SLU. We provide an in-depth overview of previous research from two perspectives: decoding paradigms and modeling approaches. On this basis, we further compare the performance of representative models and analyze their strengths and limitations. Finally, we discuss current challenges and outline promising directions for future research. We hope this survey offers valuable insights and serves as a useful reference for advancing research in multi-intent SLU.
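To make the task definition concrete, here is a minimal sketch of the input/output structure that multiple intent detection and slot filling jointly produce. The utterance, intent labels, and slot names are hypothetical illustrations, not taken from the paper or any specific dataset.

```python
# Toy illustration of a multi-intent SLU example (labels are hypothetical).
from dataclasses import dataclass

@dataclass
class MultiIntentExample:
    tokens: list[str]     # tokenized utterance
    intents: set[str]     # multi-label: one utterance, several intents
    slot_tags: list[str]  # one BIO tag per token (slot filling)

example = MultiIntentExample(
    tokens=["play", "some", "jazz", "and", "set", "an", "alarm", "for", "7", "am"],
    # Multiple intent detection: a set of labels rather than a single class.
    intents={"PlayMusic", "SetAlarm"},
    # Slot filling: per-token BIO tags aligned with the utterance.
    slot_tags=["O", "O", "B-genre", "O", "O", "O", "O", "O", "B-time", "I-time"],
)

assert len(example.tokens) == len(example.slot_tags)
```

The difference from single-intent SLU is that intent detection becomes multi-label classification, which is what makes joint decoding with token-level slot filling non-trivial.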
Related papers
- Deep Research: A Systematic Survey [118.82795024422722]
Deep Research (DR) aims to combine the reasoning capabilities of large language models with external tools, such as search engines. This survey presents a comprehensive and systematic overview of deep research systems.
arXiv Detail & Related papers (2025-11-24T15:28:28Z)
- From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models [66.36007274540113]
Multimodal Large Language Models (MLLMs) strive to achieve a profound, human-like understanding of and interaction with the physical world. However, they often exhibit a shallow and incoherent integration when acquiring information (Perception) and conducting reasoning (Cognition). This survey introduces a novel and unified analytical framework: "From Perception to Cognition".
arXiv Detail & Related papers (2025-09-29T18:25:40Z)
- What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration [59.855712519568904]
We investigate the three core steps of MM-ICL: demonstration retrieval, demonstration ordering, and prompt construction.
Our findings highlight the necessity of a multi-modal retriever for demonstration retrieval and the importance of intra-demonstration ordering over inter-demonstration ordering (a toy pipeline sketch follows this entry).
arXiv Detail & Related papers (2024-10-27T15:37:51Z)
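The three steps listed above read naturally as a small pipeline. The following is a hypothetical sketch of that pipeline; the `Demo` structure, the token-overlap stand-in for a multi-modal retriever, and the prompt template are illustrative assumptions, not the paper's method.

```python
# Hypothetical MM-ICL pipeline: retrieval -> ordering -> prompt construction.
from dataclasses import dataclass

@dataclass
class Demo:
    image_id: str
    question: str
    answer: str
    score: float = 0.0  # similarity to the query, filled in by the retriever

def retrieve(query: str, pool: list[Demo], k: int) -> list[Demo]:
    # Stand-in for a multi-modal retriever (here: naive token overlap);
    # the study argues a true multi-modal retriever is necessary.
    q_tokens = set(query.lower().split())
    for d in pool:
        d.score = len(q_tokens & set(d.question.lower().split()))
    return sorted(pool, key=lambda d: d.score, reverse=True)[:k]

def order_shots(demos: list[Demo]) -> list[Demo]:
    # Inter-demonstration ordering: most similar shot placed nearest the query.
    return sorted(demos, key=lambda d: d.score)

def build_prompt(demos: list[Demo], query_image: str, query: str) -> str:
    # Intra-demonstration ordering (the factor the study found more important):
    # a fixed part order inside each shot -- image, then question, then answer.
    shots = "\n".join(
        f"<image:{d.image_id}> Q: {d.question}\nA: {d.answer}" for d in demos
    )
    return f"{shots}\n<image:{query_image}> Q: {query}\nA:"
```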
- Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level multi-grained contrastive learning (MMCL) framework that applies contrastive learning at three levels: utterance, slot, and word (a toy sketch of such a multi-level objective follows this entry).
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z)
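As a rough illustration of what contrastive learning at three granularities can look like, here is a toy sketch. The tensor shapes, the plain InfoNCE objective, and the unweighted sum are assumptions for illustration, not the authors' implementation.

```python
# Toy multi-level contrastive objective (assumed form, not the paper's code).
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    # anchor, positive: (batch, dim); the i-th rows form a positive pair and
    # the other batch items serve as in-batch negatives.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.T / tau  # (batch, batch) similarity matrix
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

def multi_level_loss(utt, utt_pos, slot, slot_pos, word, word_pos) -> torch.Tensor:
    # One contrastive term per level (utterance, slot, word); a real system
    # would construct level-specific positives/negatives and weight the terms.
    return info_nce(utt, utt_pos) + info_nce(slot, slot_pos) + info_nce(word, word_pos)
```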
- Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models [52.24001776263608]
This comprehensive survey delves into recent strides in hate speech (HS) moderation.
We highlight the burgeoning role of large language models (LLMs) and large multimodal models (LMMs).
We identify existing gaps in research, particularly in the context of underrepresented languages and cultures.
arXiv Detail & Related papers (2024-01-30T03:51:44Z)
- Multi-agent Reinforcement Learning: A Comprehensive Survey [10.186029242664931]
Multi-agent systems (MAS) are widely prevalent and crucially important in numerous real-world applications.
Despite their ubiquity, the development of intelligent decision-making agents in MAS poses several open challenges to their effective implementation.
This survey examines these challenges, placing an emphasis on studying seminal concepts from game theory (GT) and machine learning (ML).
arXiv Detail & Related papers (2023-12-15T23:16:54Z)
- A Survey on Interpretable Cross-modal Reasoning [64.37362731950843]
Cross-modal reasoning (CMR) has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
This survey delves into the realm of interpretable cross-modal reasoning (I-CMR) and presents a comprehensive overview of typical methods organized in a three-level taxonomy.
arXiv Detail & Related papers (2023-09-05T05:06:48Z)
- A Survey on Spoken Language Understanding: Recent Advances and New Frontiers [35.59678070422133]
Spoken Language Understanding (SLU) aims to extract the semantic frame of user queries.
With the rise of deep neural networks and the evolution of pre-trained language models, SLU research has achieved significant breakthroughs.
arXiv Detail & Related papers (2021-03-04T15:22:00Z)
- Multimodal Research in Vision and Language: A Review of Current and Emerging Trends [41.07256031348454]
We present a detailed overview of the latest trends in research pertaining to visual and language modalities.
We examine their task formulations and approaches to various problems in semantic perception and content generation.
We shed some light on multi-disciplinary patterns and insights that have emerged in the recent past, directing this field towards more modular and transparent intelligent systems.
arXiv Detail & Related papers (2020-10-19T13:55:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.