A Result based Portable Framework for Spoken Language Understanding
- URL: http://arxiv.org/abs/2103.06010v1
- Date: Wed, 10 Mar 2021 12:06:26 GMT
- Title: A Result based Portable Framework for Spoken Language Understanding
- Authors: Lizhi Cheng, Weijia Jia, Wenmian Yang
- Abstract summary: We propose a novel Result-based Portable Framework for Spoken Language Understanding (RPFSLU).
RPFSLU allows most existing single-turn SLU models to obtain contextual information from multi-turn dialogues and to take full advantage of predicted results in the dialogue history during the current prediction.
Experimental results on the public KVRET dataset show that all baseline SLU models are improved by RPFSLU on multi-turn SLU tasks.
- Score: 15.99246711701726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spoken language understanding (SLU), a core component of
task-oriented dialogue systems, has made substantial progress on single-turn
dialogue. However, performance on multi-turn dialogue remains unsatisfactory:
existing multi-turn SLU methods have low portability and poor compatibility
with other single-turn SLU models. Further, existing multi-turn SLU methods do
not exploit historical predicted results when predicting the current
utterance, which wastes helpful information. To bridge these gaps, in this
paper we propose a novel Result-based Portable Framework for SLU (RPFSLU).
RPFSLU allows most existing single-turn SLU models to obtain contextual
information from multi-turn dialogues and takes full advantage of predicted
results in the dialogue history during the current prediction. Experimental
results on the public dataset KVRET show that all baseline SLU models are
improved by RPFSLU on multi-turn SLU tasks.
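The abstract leaves the mechanism at a high level, but the core idea of feeding previously predicted results back into the current prediction can be sketched as a thin wrapper around any single-turn SLU model. The following Python sketch is an illustration only, not the authors' implementation: the `encode`/`predict` hooks on the wrapped model, the label embedding of past results, and the fusion layer are all assumptions.

```python
# Minimal sketch of a result-reuse wrapper for a single-turn SLU model
# (NOT the RPFSLU implementation; all interfaces here are assumptions).
import torch
import torch.nn as nn


class ResultReuseWrapper(nn.Module):
    """Lets a single-turn SLU model consume predicted results from past turns."""

    def __init__(self, slu_model: nn.Module, num_labels: int, hidden: int):
        super().__init__()
        self.slu = slu_model                                # any single-turn SLU model
        self.result_emb = nn.Embedding(num_labels, hidden)  # embeds past predicted labels
        self.fuse = nn.Linear(2 * hidden, hidden)           # fuses utterance + results

    def forward(self, tokens, history_results):
        # tokens: (batch, seq) token ids of the current utterance
        # history_results: (batch, turns) label ids predicted in earlier turns
        h_utt = self.slu.encode(tokens)                       # (batch, hidden); assumed hook
        h_res = self.result_emb(history_results).mean(dim=1)  # pooled past-result embedding
        fused = torch.tanh(self.fuse(torch.cat([h_utt, h_res], dim=-1)))
        return self.slu.predict(fused)                        # assumed prediction head
```

In a dialogue loop, each turn's predicted labels would be appended to `history_results` before the next call; feeding back results, rather than raw hidden states, is the behavior the abstract describes.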
Related papers
- CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding [40.75828713474074]
We present Cross-task Interactive Prompting (CroPrompt) for spoken language understanding (SLU).
CroPrompt enables the model to interactively leverage the information exchange across the correlated tasks in SLU.
We also introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection.
arXiv Detail & Related papers (2024-06-15T04:54:56Z)
- Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level MMCL framework to apply contrastive learning at three levels, including utterance level, slot level, and word level.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z)
- Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks [68.79880423713597]
We introduce a method that utilizes the ASR system's lattice output instead of relying solely on the top hypothesis.
Our in-context learning experiments, covering spoken question answering and intent classification, underline the LLM's resilience to noisy speech transcripts (a prompt-building sketch appears after this list).
arXiv Detail & Related papers (2024-01-05T17:58:10Z)
- Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding [29.971414483624823]
We propose a three-pass end-to-end (E2E) SLU system that effectively integrates ASR and LM subnetworks into the SLU formulation for sequence generation tasks.
Our proposed three-pass SLU system shows improved performance over cascaded and E2E SLU models on two benchmark SLU datasets.
arXiv Detail & Related papers (2023-07-20T16:34:40Z)
- Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target [58.59044226658916]
Spoken Language Understanding (SLU) is a task that aims to extract semantic information from spoken utterances.
We propose to use discrete units as intermediate guidance to improve textless SLU performance.
arXiv Detail & Related papers (2023-05-29T14:00:24Z)
- STOP: A dataset for Spoken Task Oriented Semantic Parsing [66.14615249745448]
End-to-end spoken language understanding (SLU) predicts intent directly from audio using a single model.
We release the Spoken Task-Oriented semantic Parsing (STOP) dataset, the largest and most complex SLU dataset to be publicly available.
In addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.
arXiv Detail & Related papers (2022-06-29T00:36:34Z)
- Capture Salient Historical Information: A Fast and Accurate Non-Autoregressive Model for Multi-turn Spoken Language Understanding [18.988599232838766]
Existing work increases inference speed by designing non-autoregressive models for single-turn Spoken Language Understanding tasks.
We propose a novel model for multi-turn SLU named Salient History Attention with Layer-Refined Transformer (SHA-LRT).
SHA captures historical information for the current dialogue from both historical utterances and results via a well-designed history-attention mechanism (see the generic attention sketch after this list).
arXiv Detail & Related papers (2022-06-24T10:45:32Z)
- Deliberation Model for On-Device Spoken Language Understanding [69.5587671262691]
We propose a novel deliberation-based approach to end-to-end (E2E) spoken language understanding (SLU).
We show that our approach can significantly reduce the degradation when moving from natural speech to synthetic speech training.
arXiv Detail & Related papers (2022-04-04T23:48:01Z)
- ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet [95.39817519115394]
ESPnet-SLU is a project inside the end-to-end speech processing toolkit ESPnet.
It is designed for quick development of spoken language understanding in a single framework.
arXiv Detail & Related papers (2021-11-29T17:05:49Z)
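Two mechanisms named in the entries above are concrete enough to sketch. First, for the word-confusion-network entry, supplying multiple ASR alternatives to an LLM instead of only the top hypothesis could be done by serializing an n-best list into the prompt. This is a hedged sketch: the paper itself works with lattices and word confusion networks, and the template and scoring format below are assumptions.

```python
# Hedged sketch: an in-context intent-classification prompt built from ASR
# n-best hypotheses instead of a single transcript. Template is an assumption.

def build_nbest_prompt(nbest, labels):
    """Serialize ASR alternatives into one classification prompt."""
    hyps = "\n".join(f"{i + 1}. {hyp} (log-prob {score:.2f})"
                     for i, (hyp, score) in enumerate(nbest))
    return (
        "Alternative ASR transcripts of one utterance, best first:\n"
        f"{hyps}\n"
        f"Choose the speaker's intent from: {', '.join(labels)}.\nIntent:"
    )


# Example: a noisy utterance where a lower-ranked hypothesis changes the intent.
print(build_nbest_prompt(
    nbest=[("play some jazz", -0.4), ("pay sam jazz", -1.7)],
    labels=["PlayMusic", "TransferMoney", "GetWeather"],
))
```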
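Second, the history-attention idea referenced in the SHA-LRT entry can be sketched as standard attention from the current turn over encoded history, covering past utterances and past results alike. The real SHA-LRT architecture is not specified in the summary above; the shapes, scaled dot-product scoring, and residual fusion here are assumptions.

```python
# Generic sketch of attention over dialogue history (NOT SHA-LRT itself).
import torch
import torch.nn as nn


class HistoryAttention(nn.Module):
    """Attends from the current turn over encoded history utterances/results."""

    def __init__(self, hidden: int):
        super().__init__()
        self.query = nn.Linear(hidden, hidden)
        self.key = nn.Linear(hidden, hidden)
        self.value = nn.Linear(hidden, hidden)

    def forward(self, current, history):
        # current: (batch, hidden) encoding of the current utterance
        # history: (batch, turns, hidden) encodings of past utterances/results
        q = self.query(current).unsqueeze(1)             # (batch, 1, hidden)
        k, v = self.key(history), self.value(history)    # (batch, turns, hidden)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        context = (attn @ v).squeeze(1)                  # (batch, hidden)
        return current + context                         # residual fusion (assumption)
```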
This list is automatically generated from the titles and abstracts of the papers on this site.