Joint Multiple Intent Detection and Slot Filling via Self-distillation
- URL: http://arxiv.org/abs/2108.08042v1
- Date: Wed, 18 Aug 2021 08:45:03 GMT
- Title: Joint Multiple Intent Detection and Slot Filling via Self-distillation
- Authors: Lisong Chen, Peilin Zhou and Yuexian Zou
- Abstract summary: Intent detection and slot filling are two main tasks in natural language understanding (NLU) for identifying users' needs from their utterances.
Most previous works assume that each utterance only corresponds to one intent, ignoring the fact that a user utterance in many cases could include multiple intents.
We propose a novel Self-Distillation Joint NLU model (SDJN) for multi-intent NLU.
- Score: 29.17761742391222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intent detection and slot filling are two main tasks in natural language
understanding (NLU) for identifying users' needs from their utterances. These
two tasks are highly related and often trained jointly. However, most previous
works assume that each utterance only corresponds to one intent, ignoring the
fact that a user utterance in many cases could include multiple intents. In
this paper, we propose a novel Self-Distillation Joint NLU model (SDJN) for
multi-intent NLU. First, we formulate multiple intent detection as a weakly
supervised problem and approach it with multiple instance learning (MIL). Then, we
design an auxiliary loop via self-distillation with three orderly arranged
decoders: Initial Slot Decoder, MIL Intent Decoder, and Final Slot Decoder. The
output of each decoder will serve as auxiliary information for the next
decoder. With the auxiliary knowledge provided by the MIL Intent Decoder, we
set the Final Slot Decoder as the teacher model that imparts knowledge back to
the Initial Slot Decoder to complete the loop. The auxiliary loop enables intents
and slots to guide each other in depth and further boosts the overall NLU
performance. Experimental results on two public multi-intent datasets indicate
that our model achieves strong performance compared to others.
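To make the auxiliary loop concrete, here is a minimal sketch of an SDJN-style forward pass and distillation loss, assuming a BiLSTM encoder, linear decoders, and max-pooling as the MIL aggregator; all of these are illustrative assumptions rather than the authors' implementation.
```python
# Minimal SDJN-style sketch (assumptions: BiLSTM encoder, linear decoders,
# max-pool MIL aggregation); not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SDJNSketch(nn.Module):
    def __init__(self, hidden=256, n_slots=20, n_intents=10):
        super().__init__()
        self.encoder = nn.LSTM(hidden, hidden // 2, bidirectional=True,
                               batch_first=True)
        self.initial_slot = nn.Linear(hidden, n_slots)            # Initial Slot Decoder
        self.intent = nn.Linear(hidden + n_slots, n_intents)      # MIL Intent Decoder
        self.final_slot = nn.Linear(hidden + n_intents, n_slots)  # Final Slot Decoder

    def forward(self, x):                            # x: (B, T, hidden) embeddings
        h, _ = self.encoder(x)                       # (B, T, hidden)
        slot1 = self.initial_slot(h)                 # first slot pass (student)
        tok_intent = self.intent(torch.cat([h, slot1], dim=-1))
        # MIL: utterance-level intents from token-level logits (weak supervision)
        utt_intent = tok_intent.max(dim=1).values
        slot2 = self.final_slot(torch.cat([h, tok_intent], dim=-1))  # teacher pass
        return slot1, utt_intent, slot2

def self_distill_loss(slot1, slot2, T=1.0):
    # Final Slot Decoder teaches the Initial Slot Decoder via softened KL.
    teacher = F.softmax(slot2.detach() / T, dim=-1)
    student = F.log_softmax(slot1 / T, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * T * T
```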
Related papers
- Slot Induction via Pre-trained Language Model Probing and Multi-level Contrastive Learning [62.839109775887025]
The Slot Induction (SI) task aims to induce slot boundaries without explicit knowledge of token-level slot annotations.
We propose leveraging Unsupervised Pre-trained Language Model (PLM) Probing and a Contrastive Learning mechanism to exploit unsupervised semantic knowledge extracted from PLMs.
Our approach is shown to be effective on the SI task and capable of bridging the gap with token-level supervised models on two NLU benchmark datasets.
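As a generic illustration of the contrastive component, a standard InfoNCE loss with in-batch negatives is sketched below; the paper's actual multi-level positive/negative construction from PLM probing is not reproduced.
```python
# Generic InfoNCE with in-batch negatives; illustrative, not the paper's
# multi-level contrastive objective.
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.07):
    """anchor, positive: (B, D) span representations; row i of `positive`
    is the positive for row i of `anchor`, all other rows are negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                    # (B, B) similarities
    labels = torch.arange(a.size(0), device=a.device)   # diagonal = positives
    return F.cross_entropy(logits, labels)
```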
arXiv Detail & Related papers (2023-08-09T05:08:57Z)
- Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving [74.28510044056706]
Existing methods usually adopt the decoupled encoder-decoder paradigm.
In this work, we aim to alleviate the problem with two principles.
We first predict a coarse-grained future position and action based on the encoder features.
Then, conditioned on the position and action, the future scene is imagined to check the ramification if we drive accordingly.
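A minimal coarse-then-refine sketch of this two-stage idea, with placeholder linear modules standing in for the paper's driving networks.
```python
# Coarse prediction, imagined consequence, then residual refinement.
# Placeholder modules; not the paper's architecture.
import torch
import torch.nn as nn

class ThinkTwiceSketch(nn.Module):
    def __init__(self, feat=256, act=2):
        super().__init__()
        self.coarse = nn.Linear(feat, act)          # coarse future position/action
        self.imagine = nn.Linear(feat + act, feat)  # "imagine" the resulting scene
        self.refine = nn.Linear(feat, act)          # correction after the check

    def forward(self, enc_feat):                    # enc_feat: (B, feat)
        a0 = self.coarse(enc_feat)
        scene = torch.tanh(self.imagine(torch.cat([enc_feat, a0], dim=-1)))
        return a0 + self.refine(scene)              # refined action
```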
arXiv Detail & Related papers (2023-05-10T15:22:02Z)
- Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling [39.22929726787844]
Joint intent detection and slot filling is a key research topic in natural language understanding (NLU).
Existing joint intent and slot filling systems analyze and compute features collectively for all slot types.
We propose a novel approach that: (i) learns to generate additional slot-type-specific features in order to improve accuracy and (ii) provides explanations for slot filling decisions for the first time in a joint NLU model.
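One plausible reading of slot-type-specific features, sketched under the assumption of a learned query per slot type attending over token features; this is an illustration, not the paper's architecture.
```python
# One learned query per slot type attends over tokens; the per-type attention
# weights can double as explanations. Shapes and design are assumptions.
import torch
import torch.nn as nn

class SlotTypeAttention(nn.Module):
    def __init__(self, hidden=256, n_slot_types=20):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_slot_types, hidden))

    def forward(self, token_feats):                      # (B, T, H)
        scores = token_feats @ self.queries.t()          # (B, T, S)
        attn = scores.softmax(dim=1)                     # per-type weights over tokens
        type_feats = attn.transpose(1, 2) @ token_feats  # (B, S, H)
        return type_feats, attn                          # attn serves as explanation
```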
arXiv Detail & Related papers (2022-10-19T00:56:10Z)
- String-based Molecule Generation via Multi-decoder VAE [56.465033997245776]
We investigate the problem of string-based molecular generation via variational autoencoders (VAEs).
We propose a simple, yet effective idea to improve the performance of VAE for the task.
In our experiments, the proposed VAE model performs particularly well at generating samples from out-of-domain distributions.
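A minimal multi-decoder VAE sketch, assuming several decoders share one latent and their reconstruction losses are averaged; the string-generation specifics of the paper are omitted.
```python
# Multiple decoders over a shared latent; losses averaged. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDecoderVAE(nn.Module):
    def __init__(self, dim=128, z=32, n_decoders=3):
        super().__init__()
        self.enc_mu = nn.Linear(dim, z)
        self.enc_logvar = nn.Linear(dim, z)
        self.decoders = nn.ModuleList(nn.Linear(z, dim) for _ in range(n_decoders))

    def forward(self, x):                                 # x: (B, dim)
        mu, logvar = self.enc_mu(x), self.enc_logvar(x)
        zlat = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = sum(F.mse_loss(d(zlat), x) for d in self.decoders) / len(self.decoders)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon + kl                                 # ELBO-style training loss
```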
arXiv Detail & Related papers (2022-08-23T03:56:30Z)
- Safe Multi-Task Learning [3.508126539399186]
We propose a Safe Multi-Task Learning (SMTL) model, which consists of a public encoder shared by all the tasks, private encoders, gates, and private decoders.
To reduce the storage cost during the inference stage, a lite version of SMTL is proposed to allow the gate to choose either the public encoder or the corresponding private encoder.
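A hedged sketch of such a gate: a soft mixture of public and private encoders for the full model, a hard choice for the lite variant. Modules and dimensions are placeholders.
```python
# Per-task gate between a shared public encoder's features and a private
# encoder; placeholder modules, not the paper's architecture.
import torch
import torch.nn as nn

class SMTLTaskBranch(nn.Module):
    def __init__(self, dim=128, lite=False):
        super().__init__()
        self.private_enc = nn.Linear(dim, dim)
        self.gate = nn.Parameter(torch.zeros(1))   # learnable scalar gate per task
        self.decoder = nn.Linear(dim, dim)
        self.lite = lite

    def forward(self, x, public_feat):             # both (B, dim)
        g = torch.sigmoid(self.gate)
        if self.lite:   # lite SMTL: run only one encoder at inference
            feat = public_feat if g.item() >= 0.5 else self.private_enc(x)
        else:           # full SMTL: soft mixture of both encoders
            feat = g * public_feat + (1 - g) * self.private_enc(x)
        return self.decoder(feat)
```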
arXiv Detail & Related papers (2021-11-20T14:21:02Z)
- SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling [26.037061005620263]
Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems.
We propose a multi-intent NLU framework, called SLIM, to jointly learn multi-intent detection and slot filling based on BERT.
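A rough sketch of heads one might place on top of BERT token features for this setup, including an explicit slot-to-intent mapping; the head shapes and the mapping design are assumptions, not SLIM's code.
```python
# Joint heads over BERT token features: multi-label intents from [CLS],
# per-token slots, and a slot-to-intent mapping. Assumed design.
import torch
import torch.nn as nn

class SlimStyleHeads(nn.Module):
    def __init__(self, hidden=768, n_intents=10, n_slots=20):
        super().__init__()
        self.intent_head = nn.Linear(hidden, n_intents)  # sigmoid for multi-label
        self.slot_head = nn.Linear(hidden, n_slots)
        self.slot2intent = nn.Linear(hidden, n_intents)  # explicit slot-intent map

    def forward(self, token_feats):                      # (B, T, hidden)
        intents = self.intent_head(token_feats[:, 0])    # [CLS] representation
        slots = self.slot_head(token_feats)              # per-token slot logits
        mapping = self.slot2intent(token_feats).softmax(-1)  # intent each slot serves
        return intents, slots, mapping
```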
arXiv Detail & Related papers (2021-08-26T11:33:39Z)
- Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network [99.03895740754402]
We propose a two-stream decoupled design of the encoder-decoder structure, in which a decoupled cross-modal encoder and decoder are involved.
We further propose a primary scheduled sampling strategy that mitigates the train-inference discrepancy via pretraining the encoder-decoder in a two-pass manner.
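An illustrative two-pass training step: decode once with gold inputs, then re-decode with a mix of gold tokens and first-pass predictions. The `model` interface is assumed, and prediction/input alignment is simplified.
```python
# Two-pass scheduled sampling sketch; `model(src, tgt_in)` returning
# (B, T, V) logits is an assumed interface, and the shift between
# predictions and decoder inputs is glossed over for brevity.
import torch

def two_pass_step(model, src, tgt_in, mix_prob=0.25):
    with torch.no_grad():
        pred = model(src, tgt_in).argmax(dim=-1)  # pass 1: teacher forcing
    # Replace some gold input tokens with the model's own predictions to
    # shrink the train/inference discrepancy.
    mask = torch.rand_like(tgt_in, dtype=torch.float) < mix_prob
    mixed_in = torch.where(mask, pred, tgt_in)
    return model(src, mixed_in)                   # pass 2: train on mixed inputs
```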
arXiv Detail & Related papers (2021-01-27T17:36:57Z)
- Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders [13.999110725631672]
Two different encoders are designed to help each other in the representation learning process.
Our experiments confirm the advantages of having two encoders over one encoder.
arXiv Detail & Related papers (2020-10-08T09:10:55Z)
- Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
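A minimal gate-unit sketch in the spirit of these multilevel gates, modulating how much encoder context reaches the decoder; Fold-ASPP and the full GateNet are not reproduced.
```python
# A learned gate suppresses or passes encoder features on a skip connection.
# Placeholder channel sizes; not the paper's implementation.
import torch
import torch.nn as nn

class GatedSkip(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(ch * 2, ch, kernel_size=1),
                                  nn.Sigmoid())

    def forward(self, enc_feat, dec_feat):        # both (B, C, H, W)
        g = self.gate(torch.cat([enc_feat, dec_feat], dim=1))
        return dec_feat + g * enc_feat            # gated context transmission
```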
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
- Efficient Intent Detection with Dual Sentence Encoders [53.16532285820849]
We introduce intent detection methods backed by pretrained dual sentence encoders such as USE and ConveRT.
We demonstrate the usefulness and wide applicability of the proposed intent detectors, showing that they outperform intent detectors based on fine-tuning the full BERT-Large model.
We release our code, as well as a new challenging single-domain intent detection dataset.
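A sketch of the frozen-sentence-encoder recipe, where `encode()` is a stand-in for a USE/ConveRT-style embedding call rather than either library's real API; only the small classifier head is trained.
```python
# Frozen sentence embeddings + a small trainable intent classifier.
# `encode()` is a placeholder, not the real USE/ConveRT API.
import torch
import torch.nn as nn

def encode(utterances):                       # stand-in for USE/ConveRT
    return torch.randn(len(utterances), 512)  # (B, 512) sentence vectors

classifier = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

def intent_logits(utterances):
    with torch.no_grad():
        emb = encode(utterances)              # encoder stays frozen
    return classifier(emb)                    # only this head is trained
```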
arXiv Detail & Related papers (2020-03-10T15:33:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.