MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU
- URL: http://arxiv.org/abs/2408.08144v3
- Date: Fri, 30 May 2025 05:51:10 GMT
- Title: MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU
- Authors: Yan Li, So-Eon Kim, Seong-Bae Park, Soyeon Caren Han,
- Abstract summary: MIDAS is a novel approach leveraging multi-level intent, domain, and slot knowledge distillation for multi-turn NLU.<n>We construct distinct teachers for SI detection, WS filling, and conversation-level domain (CD) classification, each fine-tuned for specific knowledge.<n>Results demonstrate the efficacy of our model in improving multi-turn conversation understanding.
- Score: 9.047800457694656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although Large Language Models (LLMs) can generate coherent text, they often struggle to recognise user intent behind queries. In contrast, Natural Language Understanding (NLU) models interpret the purpose and key information of user input for responsive interactions. Existing NLU models typically map utterances to a dual-level semantic frame, involving sentence-level intent (SI) and word-level slot (WS) labels. However, real-life conversations primarily consist of multi-turn dialogues, requiring the interpretation of complex and extended exchanges. Researchers encounter challenges in addressing all facets of multi-turn dialogue using a unified NLU model. This paper introduces MIDAS, a novel approach leveraging multi-level intent, domain, and slot knowledge distillation for multi-turn NLU. We construct distinct teachers for SI detection, WS filling, and conversation-level domain (CD) classification, each fine-tuned for specific knowledge. A multi-teacher loss is proposed to facilitate the integration of these teachers, guiding a student model in multi-turn dialogue tasks. Results demonstrate the efficacy of our model in improving multi-turn conversation understanding, showcasing the potential for advancements in NLU through multi-level dialogue knowledge distillation. Our implementation is open-sourced on https://github.com/adlnlp/Midas.
Related papers
- Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning [32.95008932216176]
We introduce MMDiag, a multi-turn multimodal dialogue dataset.<n>We also present DiagNote, an MLLM equipped with multimodal grounding and reasoning capabilities.
arXiv Detail & Related papers (2025-03-10T07:32:53Z) - Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification [6.459396785817196]
Chain-of-Intent generates intent-driven conversations through self-play.
MINT-CL is a framework for multi-turn intent classification using multi-task contrastive learning.
We release MINT-E, a multilingual, intent-aware multi-turn e-commerce dialogue corpus.
arXiv Detail & Related papers (2024-11-21T15:59:29Z) - Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level MMCL framework to apply contrastive learning at three levels, including utterance level, slot level, and word level.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z) - DivTOD: Unleashing the Power of LLMs for Diversifying Task-Oriented Dialogue Representations [21.814490079113323]
Language models pre-trained on general text have achieved impressive results in diverse fields.
Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models.
We propose a novel dialogue pre-training model called DivTOD, which collaborates with LLMs to learn diverse task-oriented dialogue representations.
arXiv Detail & Related papers (2024-03-31T04:36:57Z) - DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
arXiv Detail & Related papers (2024-01-02T07:40:12Z) - LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language
Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs)
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z) - Self-Explanation Prompting Improves Dialogue Understanding in Large
Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs)
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z) - Joint Modelling of Spoken Language Understanding Tasks with Integrated
Dialog History [30.20353302347147]
We propose a novel model architecture that learns dialog context to jointly predict the intent, dialog act, speaker role, and emotion for the spoken utterance.
Our experiments show that our joint model achieves similar results to task-specific classifiers.
arXiv Detail & Related papers (2023-05-01T16:26:18Z) - A Mixture-of-Expert Approach to RL-based Dialogue Management [56.08449336469477]
We use reinforcement learning to develop a dialogue agent that avoids being short-sighted (outputting generic utterances) and maximizes overall user satisfaction.
Most existing RL approaches to DM train the agent at the word-level, and thus, have to deal with aly complex action space even for a medium-size vocabulary.
We develop a RL-based DM using a novel mixture of expert language model (MoE-LM) that consists of (i) a LM capable of learning diverse semantics for conversation histories, (ii) a number of specialized LMs (or experts) capable of generating utterances corresponding to a
arXiv Detail & Related papers (2022-05-31T19:00:41Z) - NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural
Language Understanding in Task-Oriented Dialogue [53.54788957697192]
NLU++ is a novel dataset for natural language understanding (NLU) in task-oriented dialogue (ToD) systems.
NLU++ is divided into two domains (BANKING and HOTELS) and brings several crucial improvements over current commonly used NLU datasets.
arXiv Detail & Related papers (2022-04-27T16:00:23Z) - Back to the Future: Bidirectional Information Decoupling Network for
Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z) - Knowledge Augmented BERT Mutual Network in Multi-turn Spoken Dialogues [6.4144180888492075]
We propose to equip a BERT-based joint model with a knowledge attention module to mutually leverage dialogue contexts between two SLU tasks.
A gating mechanism is further utilized to filter out irrelevant knowledge triples and to circumvent distracting comprehension.
Experimental results in two complicated multi-turn dialogue datasets have demonstrate by mutually modeling two SLU tasks with filtered knowledge and dialogue contexts.
arXiv Detail & Related papers (2022-02-23T04:03:35Z) - A Context-Aware Hierarchical BERT Fusion Network for Multi-turn Dialog
Act Detection [6.361198391681688]
CaBERT-SLU is a context-aware hierarchical BERT fusion Network (CaBERT-SLU)
Our approach reaches new state-of-the-art (SOTA) performances in two complicated multi-turn dialogue datasets.
arXiv Detail & Related papers (2021-09-03T02:00:03Z) - Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue
Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of diverse tasks such as question answering, act classification, dialogue summarization etc.
While dialogue corpora are abundantly available, labeled data, for specific learning tasks, can be highly scarce and expensive.
In this work, we investigate dialogue context representation learning with various types unsupervised pretraining tasks.
arXiv Detail & Related papers (2020-02-27T04:36:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.