Meta learning to classify intent and slot labels with noisy few shot examples
- URL: http://arxiv.org/abs/2012.07516v1
- Date: Mon, 30 Nov 2020 18:53:30 GMT
- Title: Meta learning to classify intent and slot labels with noisy few shot examples
- Authors: Shang-Wen Li, Jason Krone, Shuyan Dong, Yi Zhang, and Yaser Al-onaizan
- Abstract summary: Spoken language understanding (SLU) models are notorious for being data-hungry.
We propose a new SLU benchmarking task: few-shot robust SLU, where SLU comprises two core problems, intent classification (IC) and slot labeling (SL).
We show the model consistently outperforms the conventional fine-tuning baseline and another popular meta-learning method, Model-Agnostic Meta-Learning (MAML), in terms of achieving better IC accuracy and SL F1.
- Score: 11.835266162072486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently deep learning has dominated many machine learning areas, including
spoken language understanding (SLU). However, deep learning models are
notorious for being data-hungry, and the heavily optimized models are usually
sensitive to the quality of the training examples provided and the consistency
between training and inference conditions. To improve the performance of SLU
models on tasks with noisy and low training resources, we propose a new SLU
benchmarking task: few-shot robust SLU, where SLU comprises two core problems,
intent classification (IC) and slot labeling (SL). We establish the task by
defining few-shot splits on three public IC/SL datasets, ATIS, SNIPS, and TOP,
and adding two types of natural noises (adaptation example missing/replacing
and modality mismatch) to the splits. We further propose a novel noise-robust
few-shot SLU model based on prototypical networks. We show the model
consistently outperforms the conventional fine-tuning baseline and another
popular meta-learning method, Model-Agnostic Meta-Learning (MAML), in terms of
achieving better IC accuracy and SL F1, and yielding smaller performance
variation when noises are present.
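To make the prototypical-network approach above concrete, here is a minimal sketch of a single few-shot intent-classification episode: class prototypes are the mean embeddings of each intent's support utterances, and a query is assigned by a softmax over its negative squared distances to the prototypes. The encoder is a deterministic random-vector placeholder, and the episode layout, intent names, and the paper's noise-robust additions are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of prototypical-network intent classification for one
# few-shot episode (not the paper's exact model).
import numpy as np

def encode(utterances, dim=64):
    """Placeholder sentence encoder: a fixed pseudo-random vector per utterance.
    A real system would use a trained neural encoder (e.g. BERT)."""
    return np.stack([np.random.default_rng(sum(map(ord, u))).standard_normal(dim)
                     for u in utterances])

def prototypes(support_emb, support_labels):
    """Class prototype = mean embedding of that class's support examples."""
    classes = sorted(set(support_labels))
    labels = np.array(support_labels)
    return classes, np.stack([support_emb[labels == c].mean(axis=0) for c in classes])

def classify(query_emb, protos):
    """Softmax over negative squared Euclidean distance to each prototype."""
    logits = -((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

# Toy 2-way, 2-shot episode; intent names are illustrative.
support = ["book a flight to boston", "fly me to denver",
           "play some jazz", "put on my workout playlist"]
support_y = ["BookFlight", "BookFlight", "PlayMusic", "PlayMusic"]
queries = ["get me a ticket to seattle"]

classes, protos = prototypes(encode(support), support_y)
print(dict(zip(classes, classify(encode(queries), protos)[0].round(3))))
```

The same nearest-prototype idea can be applied to slot labeling by treating each token embedding as a query against per-slot-label prototypes.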
Related papers
- Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning [50.1035273069458]
Spoken language understanding (SLU) is a core task in task-oriented dialogue systems.
We propose a multi-level multi-grained contrastive learning (MMCL) framework that applies contrastive learning at three levels: utterance, slot, and word.
Our framework achieves new state-of-the-art results on two public multi-intent SLU datasets.
arXiv Detail & Related papers (2024-05-31T14:34:23Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate denoising model, our method reuses the LLM itself and offers significantly better efficiency and flexibility; a rough sketch of this denoise-then-predict loop follows this entry.
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
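The denoise-then-predict procedure described above can be pictured with a short sketch: randomly mask words in the input, let the model itself restore them, classify each restored copy, and take a majority vote. `llm_complete` is a hypothetical text-in/text-out callable standing in for a real LLM API, and the prompts, masking rate, and sentiment task are illustrative assumptions, not the paper's exact setup.

```python
# Rough sketch of self-denoised smoothing: perturb the input, have the model
# itself restore it, then aggregate predictions over the restored copies.
import random
from collections import Counter

def mask_words(text, rate, rng):
    """Randomly replace words with [MASK] to simulate/absorb perturbations."""
    return " ".join("[MASK]" if rng.random() < rate else w for w in text.split())

def smoothed_predict(text, llm_complete, labels=("positive", "negative"),
                     n_samples=5, mask_rate=0.3, seed=0):
    rng = random.Random(seed)
    votes = []
    for _ in range(n_samples):
        masked = mask_words(text, mask_rate, rng)
        denoised = llm_complete(f"Fill in the [MASK] tokens:\n{masked}")
        answer = llm_complete(f"Classify the sentence as {' or '.join(labels)}:\n{denoised}")
        votes.append(answer.strip().lower())
    return Counter(votes).most_common(1)[0][0]   # majority vote over denoised copies

def fake_llm(prompt):
    """Trivial stand-in 'LLM' so the sketch runs end to end (keyword classifier)."""
    text = prompt.splitlines()[-1]
    if prompt.startswith("Fill in"):
        return text.replace("[MASK]", "").strip()
    return "positive" if "fun" in text else "negative"

print(smoothed_predict("the movie was great and fun", fake_llm))
```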
- Compositional Generalization in Spoken Language Understanding [58.609624319953156]
We study two types of compositionality: (a) novel slot combination, and (b) length generalization.
We show that our compositional SLU model significantly outperforms a state-of-the-art BERT-based SLU model.
arXiv Detail & Related papers (2023-12-25T21:46:06Z)
- A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding [42.345266746904514]
We employ four types of pre-trained models and their combinations for spoken language understanding (SLU).
We leverage self-supervised speech and language models (LM) pre-trained on large quantities of unpaired data to extract strong speech and text representations.
We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora.
arXiv Detail & Related papers (2022-11-10T20:59:13Z)
- CLUES: Few-Shot Learning Evaluation in Natural Language Understanding [81.63968985419982]
We introduce CLUES, a benchmark for evaluating the few-shot learning capabilities of NLU models.
We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.
arXiv Detail & Related papers (2021-11-04T00:43:15Z)
- A Strong Baseline for Semi-Supervised Incremental Few-Shot Learning [54.617688468341704]
Few-shot learning aims to learn models that generalize to novel classes with limited training samples.
We propose a novel paradigm containing two parts: (1) a well-designed meta-training algorithm for mitigating the ambiguity between base and novel classes caused by unreliable pseudo labels, and (2) a model adaptation mechanism that learns discriminative features for novel classes while preserving base knowledge, using the few labeled examples and all of the unlabeled data.
arXiv Detail & Related papers (2021-10-21T13:25:52Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers; a rough sketch of the prototype-centered loss follows this entry.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
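Because the main paper above is also prototype-based, the prototype-centered loss is worth a quick sketch: the conventional query-centered objective softmaxes each query over the class prototypes, while the prototype-centered objective softmaxes each prototype over the episode's queries. The similarity function (negative squared distance), shapes, and the omission of PAL's attentive hybrid mechanism are simplifying assumptions, not the paper's exact formulation.

```python
# Sketch contrasting a query-centered prototypical loss with a
# prototype-centered one (illustrative, not PAL's exact objective).
import numpy as np

def neg_sq_dist(a, b):
    """Similarity matrix -||a_i - b_j||^2 with shape (len(a), len(b))."""
    return -((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def log_softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def query_centered_loss(queries, protos, labels):
    """Each query distributes probability over the prototypes."""
    logp = log_softmax(neg_sq_dist(queries, protos), axis=1)      # (Q, C)
    return -logp[np.arange(len(queries)), labels].mean()

def prototype_centered_loss(queries, protos, labels):
    """Each prototype distributes probability over the queries."""
    logp = log_softmax(neg_sq_dist(protos, queries), axis=1)      # (C, Q)
    per_class = [-logp[c, labels == c].mean() for c in range(len(protos))]
    return float(np.mean(per_class))

rng = np.random.default_rng(0)
protos = rng.standard_normal((3, 16))                             # 3 classes, dim 16
labels = np.repeat(np.arange(3), 4)                               # 4 queries per class
queries = protos[labels] + 0.1 * rng.standard_normal((12, 16))
print(query_centered_loss(queries, protos, labels),
      prototype_centered_loss(queries, protos, labels))
```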
- Adaptive Name Entity Recognition under Highly Unbalanced Data [5.575448433529451]
We present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (Bi-LSTM) layer for solving NER tasks.
We introduce an add-on classification model that splits sentences into two sets, Weak and Strong, and then design a pair of Bi-LSTM-CRF models to optimize performance on each set; a minimal sketch of the Bi-LSTM-CRF tagging step follows this list.
arXiv Detail & Related papers (2020-03-10T06:56:52Z)
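To illustrate the Bi-LSTM-CRF tagger from the last entry, here is a minimal sketch of the decoding step: per-token emission scores (which a trained Bi-LSTM would produce) are combined with a tag-transition matrix, and the best tag sequence is recovered with Viterbi decoding. The tag set, the random placeholder scores, and the omission of CRF training and of the Weak/Strong sentence split are illustrative assumptions.

```python
# Minimal sketch of the CRF decoding step on top of Bi-LSTM emission scores.
# Emission and transition scores are random placeholders; a real tagger learns
# both and maximizes the CRF log-likelihood during training.
import numpy as np

TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]   # illustrative tag set

def viterbi_decode(emissions, transitions):
    """emissions: (T, K) per-token tag scores; transitions: (K, K) tag-to-tag scores."""
    T, K = emissions.shape
    score = emissions[0].copy()                     # best path score ending in each tag
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t][None, :]   # (K, K)
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                   # follow back-pointers
        best.append(int(backptr[t, best[-1]]))
    return [TAGS[i] for i in reversed(best)]

rng = np.random.default_rng(0)
tokens = ["barack", "obama", "visited", "hanoi"]
emissions = rng.standard_normal((len(tokens), len(TAGS)))    # stand-in for Bi-LSTM outputs
transitions = rng.standard_normal((len(TAGS), len(TAGS)))    # learned by the CRF in practice
print(list(zip(tokens, viterbi_decode(emissions, transitions))))
```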
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.