Learning to Classify Intents and Slot Labels Given a Handful of Examples
- URL: http://arxiv.org/abs/2004.10793v1
- Date: Wed, 22 Apr 2020 18:54:38 GMT
- Title: Learning to Classify Intents and Slot Labels Given a Handful of Examples
- Authors: Jason Krone, Yi Zhang, Mona Diab
- Abstract summary: Intent classification (IC) and slot filling (SF) are core components in most goal-oriented dialogue systems.
We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra low resource scenarios.
We show that two popular few-shot learning algorithms, model agnostic meta learning (MAML) and prototypical networks, outperform a fine-tuning baseline on this benchmark.
- Score: 22.783338548129983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intent classification (IC) and slot filling (SF) are core components in most
goal-oriented dialogue systems. Current IC/SF models perform poorly when the
number of training examples per class is small. We propose a new few-shot
learning task, few-shot IC/SF, to study and improve the performance of IC and
SF models on classes not seen at training time in ultra low resource scenarios.
We establish a few-shot IC/SF benchmark by defining few-shot splits for three
public IC/SF datasets, ATIS, TOP, and Snips. We show that two popular few-shot
learning algorithms, model agnostic meta learning (MAML) and prototypical
networks, outperform a fine-tuning baseline on this benchmark. Prototypical
networks achieve significant gains in IC performance on the ATIS and TOP
datasets, while both prototypical networks and MAML outperform the baseline
with respect to SF on all three datasets. In addition, we demonstrate that
joint training as well as the use of pre-trained language models, ELMo and BERT
in our case, are complementary to these few-shot learning methods and yield
further gains.
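To make the prototypical-networks side of this comparison concrete, the following minimal PyTorch sketch implements one few-shot IC episode: each class prototype is the mean embedding of its support examples, and queries are scored by distance to the prototypes. The `encode` callable, tensor shapes, and squared-Euclidean metric are illustrative assumptions, not the authors' implementation.

```python
# Minimal prototypical-network episode for intent classification (IC).
# `encode` is a placeholder for any sentence encoder (e.g. an ELMo- or
# BERT-based model returning one vector per utterance); this illustrates
# the algorithm, not the authors' code.
import torch
import torch.nn.functional as F

def prototypical_episode_loss(encode, support_x, support_y, query_x, query_y,
                              n_classes):
    """One few-shot episode: build a prototype per intent class from the
    support set, then classify queries by distance to the prototypes."""
    support_emb = encode(support_x)          # (n_support, d)
    query_emb = encode(query_x)              # (n_query, d)

    # Class prototype = mean embedding of that class's support examples.
    prototypes = torch.stack([
        support_emb[support_y == c].mean(dim=0) for c in range(n_classes)
    ])                                       # (n_classes, d)

    # Negative squared Euclidean distance serves as the class logit.
    logits = -torch.cdist(query_emb, prototypes) ** 2
    return F.cross_entropy(logits, query_y)
```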
Related papers
- High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z)
- ILLUMINER: Instruction-tuned Large Language Models as Few-shot Intent Classifier and Slot Filler [1.9015367254988451]
This study evaluates instruction-tuned models (Instruct-LLMs) on popular benchmark datasets for intent classification (IC) and slot filling (SF).
We introduce ILLUMINER, an approach framing IC and SF as language generation tasks for Instruct-LLMs, with a more efficient SF-prompting method compared to prior work.
A comprehensive comparison with multiple baselines shows that our approach, using the FLAN-T5 11B model, outperforms the state-of-the-art joint IC+SF method and in-context learning with GPT3.5 (175B).
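To illustrate what framing IC and SF as language generation can look like in practice, here is a hedged sketch using the Hugging Face `transformers` pipeline with a small FLAN-T5 checkpoint; the prompt wording, label set, and per-slot questioning scheme are illustrative assumptions, not ILLUMINER's actual templates.

```python
# Hypothetical generation-style IC/SF prompting with a small FLAN-T5
# checkpoint; the prompts, label set, and slot question are assumptions,
# not ILLUMINER's templates (the paper reports results with FLAN-T5 11B).
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-large")

utterance = "book a flight from boston to denver on friday"
intents = ["BookFlight", "GetWeather", "PlayMusic"]  # assumed label set

# Intent classification as generation: ask the model to pick a label.
ic_prompt = (f"Utterance: {utterance}\n"
             f"Choose the intent from {intents}.\nIntent:")
intent = generator(ic_prompt, max_new_tokens=8)[0]["generated_text"].strip()

# Slot filling as generation: one question per slot type is one possible
# prompting scheme (an assumption, not necessarily the paper's).
sf_prompt = (f"Utterance: {utterance}\n"
             "Which span of the utterance is the departure city? Answer:")
from_city = generator(sf_prompt, max_new_tokens=8)[0]["generated_text"].strip()
print(intent, from_city)
```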
arXiv Detail & Related papers (2024-03-26T09:41:21Z)
- Constructing Sample-to-Class Graph for Few-Shot Class-Incremental Learning [10.111587226277647]
Few-shot class-incremental learning (FSCIL) aims to build a machine learning model that can continually learn new concepts from a few data samples.
In this paper, we propose a Sample-to-Class (S2C) graph learning method for FSCIL.
arXiv Detail & Related papers (2023-10-31T08:38:14Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of prompting techniques, we introduce a new pre-training method that boosts query-based end-to-end instance segmentation (QEIS) models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between the self-supervised learning (SSL) and dynamic computation (DC) paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- An Explicit-Joint and Supervised-Contrastive Learning Framework for Few-Shot Intent Classification and Slot Filling [12.85364483952161]
Intent classification (IC) and slot filling (SF) are critical building blocks in task-oriented dialogue systems.
However, few IC/SF models perform well when the number of training samples per class is quite small.
We propose a novel explicit-joint and supervised-contrastive learning framework for few-shot intent classification and slot filling.
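For intuition about the supervised-contrastive component, the sketch below implements a generic supervised contrastive loss (in the style of Khosla et al., 2020) over utterance embeddings; the batch construction, temperature, and how the term combines with the joint IC/SF objective are assumptions rather than this paper's exact formulation.

```python
# Generic supervised contrastive loss (Khosla et al., 2020) over utterance
# embeddings; temperature and batch construction are assumptions, not this
# paper's exact formulation.
import torch
import torch.nn.functional as F

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Pull same-label embeddings together, push different-label ones apart."""
    z = F.normalize(embeddings, dim=1)            # (n, d), unit norm
    sim = z @ z.t() / temperature                 # (n, n) scaled similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)

    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~eye
    sim = sim.masked_fill(eye, float("-inf"))     # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)

    # Mean log-probability of positives, averaged over anchors that have
    # at least one same-label partner in the batch.
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    per_anchor = -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts)
    return per_anchor[pos_mask.any(dim=1)].mean()
```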
arXiv Detail & Related papers (2021-10-26T13:28:28Z)
- Semi-Supervised Few-Shot Intent Classification and Slot Filling [3.602651625446309]
Intent classification (IC) and slot filling (SF) are two fundamental tasks in modern Natural Language Understanding (NLU) systems.
In this work, we investigate how contrastive learning and unsupervised data augmentation methods can benefit these existing supervised meta-learning pipelines.
arXiv Detail & Related papers (2021-09-17T20:26:23Z)
- Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot learning has been studied to mimic human visual capabilities and learn effective models without the need for exhaustive human annotation.
In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.
We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
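Reading the two-stage description above literally, the second stage might look like the following sketch, with a frozen partner encoder supplying soft-anchors and an alignment term added to the classification loss; both the cosine form of the alignment and the weighting `alpha` are guesses for illustration, not the paper's definition.

```python
# Second-stage sketch: a frozen partner encoder supplies soft-anchor
# features; the main encoder is trained to classify while staying aligned
# with those anchors. Cosine alignment and `alpha` are illustrative guesses.
import torch
import torch.nn.functional as F

def stage2_loss(main_enc, classifier, partner_enc, x, y, alpha=1.0):
    feats = main_enc(x)                       # main-encoder features (n, d)
    ce = F.cross_entropy(classifier(feats), y)

    with torch.no_grad():                     # partner encoder stays frozen
        anchors = partner_enc(x)              # soft-anchor features (n, d)

    # Encourage main features to point in the same direction as the anchors.
    align = 1.0 - F.cosine_similarity(feats, anchors, dim=1).mean()
    return ce + alpha * align
```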
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve model generalization ability in few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Meta learning to classify intent and slot labels with noisy few shot examples [11.835266162072486]
Spoken language understanding (SLU) models are notorious for being data-hungry.
We propose a new SLU benchmarking task: few-shot robust SLU, where SLU comprises two core problems, intent classification (IC) and slot labeling (SL).
We show that the proposed model consistently outperforms the conventional fine-tuning baseline and another popular meta-learning method, Model-Agnostic Meta-Learning (MAML), in terms of achieving better IC accuracy and SL F1.
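Since MAML is the meta-learning point of comparison both here and in the main paper above, the sketch below shows one first-order MAML meta-update for an IC/SL model; the single inner-loop step, first-order approximation, and `loss_fn` signature are simplifications for illustration.

```python
# One first-order MAML meta-update for an intent/slot model. The single
# inner step, first-order approximation, and `loss_fn` signature are
# simplifications for illustration, not the papers' exact setup.
import torch
from torch.func import functional_call

def fo_maml_step(model, loss_fn, support, query, meta_opt, inner_lr=1e-2):
    (sx, sy), (qx, qy) = support, query

    # Inner loop: one gradient step on this episode's support set.
    inner_loss = loss_fn(model(sx), sy)
    grads = torch.autograd.grad(inner_loss, model.parameters())
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(model.named_parameters(), grads)}

    # Outer loop: evaluate the adapted weights on the query set and update
    # the original parameters (no second-order derivatives are kept).
    query_loss = loss_fn(functional_call(model, adapted, (qx,)), qy)
    meta_opt.zero_grad()
    query_loss.backward()
    meta_opt.step()
    return query_loss.item()
```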
arXiv Detail & Related papers (2020-11-30T18:53:30Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
The cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by different datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)