Compositional Generalization in Spoken Language Understanding
- URL: http://arxiv.org/abs/2312.15815v1
- Date: Mon, 25 Dec 2023 21:46:06 GMT
- Title: Compositional Generalization in Spoken Language Understanding
- Authors: Avik Ray, Yilin Shen, Hongxia Jin
- Abstract summary: We study two types of compositionality: (a) novel slot combination, and (b) length generalization.
We show that our compositional SLU model significantly outperforms the state-of-the-art BERT SLU model.
- Score: 58.609624319953156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art spoken language understanding (SLU) models have shown
tremendous success on benchmark SLU datasets, yet they still fail in many
practical scenarios due to a lack of model compositionality when trained on
limited training data. In this paper, we study two types of compositionality:
(a) novel slot combination, and (b) length generalization. We first conduct
in-depth analysis, and find that state-of-the-art SLU models often learn
spurious slot correlations during training, which leads to poor performance in
both compositional cases. To mitigate these limitations, we create the first
compositional splits of benchmark SLU datasets and we propose the first
compositional SLU model, including compositional loss and paired training that
tackle each compositional case respectively. On both benchmark and
compositional splits in ATIS and SNIPS, we show that our compositional SLU
model significantly outperforms (by up to $5\%$ F1 score) the state-of-the-art
BERT SLU model.
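The abstract names two training ingredients, a compositional loss and paired training, but gives no implementation details here. Below is a minimal, hypothetical sketch of one plausible reading of paired training for length generalization: two (tokens, BIO slot tags) training examples are concatenated to synthesize longer utterances. The function, data format, and sampling scheme are illustrative assumptions, not the authors' code.

```python
import random

def pair_examples(dataset, num_pairs=1000, seed=0):
    """Synthesize longer training utterances by concatenating two existing
    (tokens, slot_labels) examples; a hypothetical reading of 'paired training'."""
    rng = random.Random(seed)
    paired = []
    for _ in range(num_pairs):
        (toks_a, slots_a), (toks_b, slots_b) = rng.sample(dataset, 2)
        paired.append((toks_a + toks_b, slots_a + slots_b))
    return paired

# Toy usage with ATIS-style BIO slot tags (illustrative data, not the benchmark split).
dataset = [
    (["flights", "to", "boston"], ["O", "O", "B-toloc.city_name"]),
    (["leaving", "monday", "morning"],
     ["O", "B-depart_date.day_name", "B-depart_time.period_of_day"]),
]
print(pair_examples(dataset, num_pairs=1))
```

Such synthesized pairs expose a slot tagger to slot combinations and utterance lengths beyond the original training distribution, which is the gap both compositional cases target.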
Related papers
- Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic Large Language Model scaling guideline.
Our work starts with a benchmarking of existing LLM scaling techniques, especially selective merging, and variants of mixture.
Our methodology involves the clustering of mergeable models and optimal merging strategy selection, and the integration of clusters through a model mixture.
arXiv Detail & Related papers (2024-10-07T15:55:55Z) - Cross-composition Feature Disentanglement for Compositional Zero-shot Learning [49.919635694894204]
Disentanglement of visual features of primitives (i.e., attributes and objects) has shown exceptional results in Compositional Zero-shot Learning (CZSL).
We propose the solution of cross-composition feature disentanglement, which takes multiple primitive-sharing compositions as inputs and constrains the disentangled primitive features to be general across these compositions.
arXiv Detail & Related papers (2024-08-19T08:23:09Z) - From Words to Worlds: Compositionality for Cognitive Architectures [45.254578970023196]
Large language models (LLMs) are highly performant connectionist systems, but do they also exhibit compositionality?
We present empirical analyses across four LLM families and three task categories, including a novel task introduced in the paper.
arXiv Detail & Related papers (2024-07-18T11:42:13Z) - Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level MMCL framework to apply contrastive learning at three levels, including utterance level, slot level, and word level.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z)
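The summary lists three contrastive levels but no loss formulation. As a hedged illustration only, here is a generic InfoNCE-style contrastive loss in PyTorch of the kind that could be applied at, say, the utterance level; the MMCL paper's actual objectives, margins, and level-specific pairing are not reproduced here.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss: each anchor should be most similar to its own
    positive among all positives in the batch (in-batch negatives)."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature      # (B, B) cosine-similarity matrix
    targets = torch.arange(a.size(0))     # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

# Toy usage: 4 utterance embeddings paired with slightly perturbed views.
anchors = torch.randn(4, 128)
positives = anchors + 0.05 * torch.randn(4, 128)
print(info_nce(anchors, positives).item())
```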
arXiv Detail & Related papers (2024-05-31T14:34:23Z) - Prompting Language-Informed Distribution for Compositional Zero-Shot Learning [73.49852821602057]
The compositional zero-shot learning (CZSL) task aims to recognize unseen compositional visual concepts.
We propose a model that prompts the language-informed distribution, PLID for short, for this task.
Experimental results on the MIT-States, UT-Zappos, and C-GQA datasets show that PLID outperforms prior art.
arXiv Detail & Related papers (2023-05-23T18:00:22Z) - On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of large language models on semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen, and Codex, on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z)
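For concreteness, below is a hedged sketch of an in-context evaluation harness that measures the ID/OOD gap on a semantic parsing split; the prompt template, `model_fn` callable, and exact-match metric are assumptions for illustration, not the paper's protocol.

```python
def build_prompt(exemplars, query):
    """Assemble a few-shot semantic-parsing prompt from (utterance, parse) pairs."""
    lines = [f"utterance: {u}\nparse: {p}" for u, p in exemplars]
    lines.append(f"utterance: {query}\nparse:")
    return "\n\n".join(lines)

def exact_match(model_fn, exemplars, split):
    """Exact-match accuracy of an in-context model on one evaluation split."""
    hits = sum(model_fn(build_prompt(exemplars, u)).strip() == p for u, p in split)
    return hits / len(split)

# The compositional generalization gap is then the ID minus OOD accuracy:
# gap = exact_match(model_fn, exemplars, id_split) - exact_match(model_fn, exemplars, ood_split)

# Toy usage with a trivial stand-in for a real LLM call.
dummy_model = lambda prompt: "( flight ( to boston ) )"
demo_split = [("show flights to boston", "( flight ( to boston ) )")]
print(exact_match(dummy_model, [], demo_split))  # -> 1.0
```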
arXiv Detail & Related papers (2022-11-15T19:56:37Z) - A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models
for Spoken Language Understanding [42.345266746904514]
We employ four types of pre-trained models and their combinations for spoken language understanding (SLU).
We leverage self-supervised speech and language models (LM) pre-trained on large quantities of unpaired data to extract strong speech and text representations.
We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora.
arXiv Detail & Related papers (2022-11-10T20:59:13Z) - Reference-Limited Compositional Zero-Shot Learning [19.10692212692771]
Compositional zero-shot learning (CZSL) refers to recognizing unseen compositions of known visual primitives.
We propose a novel Meta Compositional Graph Learner (MetaCGL) that can efficiently learn the compositionality from insufficient referential information.
arXiv Detail & Related papers (2022-08-22T03:58:02Z) - Meta learning to classify intent and slot labels with noisy few shot
examples [11.835266162072486]
Spoken language understanding (SLU) models are notorious for being data-hungry.
We propose a new SLU benchmarking task: few-shot robust SLU, where SLU comprises two core problems, intent classification (IC) and slot labeling (SL).
We show that the model consistently outperforms the conventional fine-tuning baseline and a popular meta-learning method, Model-Agnostic Meta-Learning (MAML), achieving higher IC accuracy and SL F1.
arXiv Detail & Related papers (2020-11-30T18:53:30Z)