CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding
- URL: http://arxiv.org/abs/2406.10505v1
- Date: Sat, 15 Jun 2024 04:54:56 GMT
- Title: CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding
- Authors: Libo Qin, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che
- Abstract summary: We present Cross-task Interactive Prompting (CroPrompt) for spoken language understanding (SLU).
CroPrompt enables the model to interactively leverage the information exchange across the correlated tasks in SLU.
We also introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection.
- Score: 40.75828713474074
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this problem, we present the pioneering work of Cross-task Interactive Prompting (CroPrompt) for SLU, which enables the model to interactively leverage the information exchange across the correlated tasks in SLU. We further introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection. We conduct extensive experiments on the standard SLU benchmark and the results reveal that CroPrompt consistently outperforms the existing prompting approaches. In addition, the multi-task self-consistency mechanism can effectively ease the error propagation issue, thereby improving performance. We hope this work can inspire more research on cross-task prompting for SLU.
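The abstract does not include pseudocode, so the sketch below is only a rough illustration of the idea it describes: an intent prediction is injected into the slot-filling prompt (the cross-task interaction), and a majority vote over several sampled (intent, slots) pairs plays the role of multi-task self-consistency. The `query_llm` helper and the prompt wording are hypothetical placeholders, not the paper's actual prompts.

```python
from collections import Counter

def query_llm(prompt: str, temperature: float = 0.7) -> str:
    """Hypothetical LLM call; replace with any chat/completions API."""
    raise NotImplementedError

def cropromt_style_prediction(utterance: str, n_samples: int = 5) -> tuple[str, str]:
    """Minimal sketch, assuming a two-step prompt chain: sample several
    (intent, slots) pairs, conditioning each slot-filling prompt on that
    sample's predicted intent, then majority-vote over the joint pairs
    to damp error propagation from a bad intent sample."""
    samples = []
    for _ in range(n_samples):
        # Step 1: zero-shot intent detection.
        intent = query_llm(
            f'Utterance: "{utterance}"\nWhat is the speaker\'s intent?'
        ).strip()
        # Step 2: inject the predicted intent into the slot-filling
        # prompt -- the cross-task information exchange.
        slots = query_llm(
            f'Utterance: "{utterance}"\nThe intent is "{intent}".\n'
            "List the slot-value pairs."
        ).strip()
        samples.append((intent, slots))
    # Step 3: multi-task self-consistency via majority vote on the joint
    # (intent, slots) prediction, so one erroneous intent sample cannot
    # dominate the final answer.
    return Counter(samples).most_common(1)[0][0]
```

Voting on the joint pair, rather than on intent and slots separately, is one plausible reading of "multi-task" self-consistency under these assumptions.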
Related papers
- Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model [45.161909551392085]
We propose finding task-specific subnetworks within a multi-task spoken language understanding model via neural network pruning.
We show that pruned models were successful in adapting to additional ASR or IC data with minimal performance degradation on previously trained tasks.
arXiv Detail & Related papers (2024-06-18T06:39:41Z) - Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level multi-grained contrastive learning (MMCL) framework that applies contrastive learning at three levels: utterance level, slot level, and word level.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z) - Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks [68.79880423713597]
We introduce a method that utilizes the ASR system's lattice output instead of relying solely on the top hypothesis.
Our in-context learning experiments, covering spoken question answering and intent classification, underline the LLM's resilience to noisy speech transcripts.
arXiv Detail & Related papers (2024-01-05T17:58:10Z) - ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding [55.39105863825107]
We propose Mutual Learning and Large-Margin Contrastive Learning (ML-LMCL) to improve automatic speech recognition (ASR) robustness.
In fine-tuning, we apply mutual learning and train two SLU models on the manual transcripts and the ASR transcripts, respectively.
Experiments on three datasets show that ML-LMCL outperforms existing models and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2023-11-19T16:53:35Z) - SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z) - A Scope Sensitive and Result Attentive Model for Multi-Intent Spoken Language Understanding [18.988599232838766]
Multi-Intent Spoken Language Understanding (SLU) is attracting increasing attention.
Unlike traditional SLU, each intent in this scenario has its own specific scope; semantic information outside that scope can even hinder prediction.
We propose a novel Scope-Sensitive Result Attention Network (SSRAN) based on the Transformer, which contains a Scope Recognizer (SR) and a Result Attention Network (RAN).
arXiv Detail & Related papers (2022-11-22T12:24:22Z) - STOP: A dataset for Spoken Task Oriented Semantic Parsing [66.14615249745448]
End-to-end spoken language understanding (SLU) predicts intent directly from audio using a single model.
We release the Spoken Task-Oriented semantic Parsing (STOP) dataset, the largest and most complex publicly available SLU dataset.
In addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.
arXiv Detail & Related papers (2022-06-29T00:36:34Z) - Building Robust Spoken Language Understanding by Cross Attention between Phoneme Sequence and ASR Hypothesis [15.159439853075645]
This paper proposes a novel model with Cross Attention for SLU (denoted as CASLU).
The cross attention block is devised to capture the fine-grained interactions between phoneme and word embeddings, so that the joint representations capture the phonetic and semantic features of the input simultaneously.
Extensive experiments are conducted on three datasets, showing the effectiveness and competitiveness of our approach.
arXiv Detail & Related papers (2022-03-22T21:59:29Z) - A Result based Portable Framework for Spoken Language Understanding [15.99246711701726]
We propose a novel Result-based Portable Framework for Spoken Language Understanding (RPFSLU).
RPFSLU allows most existing single-turn SLU models to obtain the contextual information from multi-turn dialogues and takes full advantage of predicted results in the dialogue history during the current prediction.
Experimental results on the public KVRET dataset show that all baseline SLU models are enhanced by RPFSLU on multi-turn SLU tasks.
arXiv Detail & Related papers (2021-03-10T12:06:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.