The Devil is in the Details: On Models and Training Regimes for Few-Shot
Intent Classification
- URL: http://arxiv.org/abs/2210.06440v1
- Date: Wed, 12 Oct 2022 17:37:54 GMT
- Title: The Devil is in the Details: On Models and Training Regimes for Few-Shot
Intent Classification
- Authors: Mohsen Mesgar, Thy Thy Tran, Goran Glavas, Iryna Gurevych
- Abstract summary: Few-shot Intent Classification (FSIC) is one of the key challenges in modular task-oriented dialog systems.
We show that combining the cross-encoder architecture with episodic meta-learning consistently yields the best FSIC performance.
Our findings pave the way for conducting state-of-the-art research in FSIC.
- Score: 81.60168035505039
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Few-shot Intent Classification (FSIC) is one of the key challenges in modular
task-oriented dialog systems. While advanced FSIC methods are similar in using
pretrained language models to encode texts and nearest neighbour-based
inference for classification, these methods differ in details. They start from
different pretrained text encoders, use different encoding architectures with
varying similarity functions, and adopt different training regimes. The coupling
of these mostly independent design decisions and the lack of accompanying ablation
studies are major obstacles to identifying the factors that drive the reported FSIC
performance. We study these details across three key dimensions: (1) Encoding
architectures: Cross-Encoder vs Bi-Encoder; (2) Similarity functions:
Parameterized (i.e., trainable) functions vs non-parameterized functions; (3)
Training regimes: Episodic meta-learning vs straightforward (i.e.,
non-episodic) training. Our experimental results on seven FSIC benchmarks
reveal three important findings. First, the unexplored combination of the
cross-encoder architecture (with parameterized similarity scoring function) and
episodic meta-learning consistently yields the best FSIC performance. Second,
episodic training yields a more robust FSIC classifier than non-episodic training.
Third, in meta-learning methods, splitting an episode into support and query sets
is not necessary. Our findings pave the way for conducting state-of-the-art
research in FSIC and, more importantly, draw the community's attention to the
details of FSIC methods. We release our code and data publicly.
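The ingredients named in the abstract can be illustrated with a small sketch. This is not the authors' released code: `toy_encode` is a hypothetical bag-of-words stand-in for a pretrained language model, and `DATA`, `sample_episode`, and `predict` are illustrative names. The sketch shows a bi-encoder setup (each utterance is encoded independently), a non-parameterized similarity (cosine), nearest neighbour-based inference, and the sampling of one N-way K-shot episode that is not split into support and query sets. A cross-encoder would instead score each utterance pair jointly with a trainable function, which is not shown here.

```python
import math
import random
from collections import Counter, defaultdict


def toy_encode(text):
    """Hypothetical stand-in for a pretrained text encoder: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Non-parameterized similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sample_episode(examples, n_way, k_shot, rng):
    """Sample one episode: n_way intents with k_shot utterances each (no support/query split)."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)
    # Only intents with at least k_shot utterances can be drawn.
    eligible = sorted(lab for lab, texts in by_label.items() if len(texts) >= k_shot)
    chosen = rng.sample(eligible, n_way)
    return [(text, lab) for lab in chosen for text in rng.sample(by_label[lab], k_shot)]


def predict(query, labelled):
    """Nearest-neighbour inference: label of the most similar labelled utterance."""
    q = toy_encode(query)
    _, best_label = max(labelled, key=lambda pair: cosine(q, toy_encode(pair[0])))
    return best_label


# Toy data, invented for illustration only.
DATA = [
    ("play some jazz music", "play_music"),
    ("play the next song", "play_music"),
    ("what is the weather today", "get_weather"),
    ("will it rain tomorrow", "get_weather"),
    ("set an alarm for seven", "set_alarm"),
    ("wake me up at six", "set_alarm"),
]

episode = sample_episode(DATA, n_way=2, k_shot=2, rng=random.Random(0))
prediction = predict("play a song", DATA)
```

In an episodic regime, a model would be updated on many such sampled episodes rather than on batches drawn from the full label set; the toy encoder above has no trainable parameters, so only the sampling and inference steps are shown.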
Related papers
- Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
- E3CM: Epipolar-Constrained Cascade Correspondence Matching [19.650006628979355]
We introduce Epipolar-Constrained Cascade Correspondence (E3CM) as a novel explicit programming-based method.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural networks to match correspondence.
We extensively evaluate the performance of E3CM through comprehensive experiments and demonstrate its superiority over existing methods.
arXiv Detail & Related papers (2023-08-31T08:46:12Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- SimCLF: A Simple Contrastive Learning Framework for Function-level Binary Embeddings [2.1222884030559315]
We propose SimCLF: A Simple Contrastive Learning Framework for Function-level Binary Embeddings.
We take an unsupervised learning approach and formulate binary code similarity detection as instance discrimination.
SimCLF directly operates on disassembled binary functions and could be implemented with any encoder.
arXiv Detail & Related papers (2022-09-06T12:09:45Z)
- Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network [58.82343017711883]
This paper investigates how to learn directly from unpaired phone sequences and speech utterances.
GAN training is adopted in the first stage to find the mapping relationship between unpaired speech and phone sequence.
In the second stage, another HMM model is introduced to train from the generator's output, which boosts the performance.
arXiv Detail & Related papers (2022-07-29T09:29:28Z)
- Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference [74.80730361332711]
Few-shot learning is an important and topical problem in computer vision.
We show that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-15T02:55:58Z)
- A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark [33.86872697028233]
We present an in-depth study on few-shot video classification by making three contributions.
First, we perform a consistent comparative study on the existing metric-based methods to figure out their limitations in representation learning.
Second, we discover that there is a high correlation between the novel action class and the ImageNet object class, which is problematic in the few-shot recognition setting.
Third, we present a new benchmark with more base data to facilitate future few-shot video classification without pre-training.
arXiv Detail & Related papers (2021-10-24T06:01:46Z)
- Joint Inductive and Transductive Learning for Video Object Segmentation [107.32760625159301]
Semi-supervised object segmentation is a task of segmenting the target object in a video sequence given only a mask in the first frame.
Most previous best-performing methods adopt matching-based transductive reasoning or online inductive learning.
We propose to integrate transductive and inductive learning into a unified framework to exploit complement between them for accurate and robust video object segmentation.
arXiv Detail & Related papers (2021-08-08T16:25:48Z)
- Sharing Matters for Generalization in Deep Metric Learning [22.243744691711452]
This work investigates how to learn characteristics that separate between classes without the need for annotations or training data.
By formulating our approach as a novel triplet sampling strategy, it can be easily applied on top of recent ranking loss frameworks.
arXiv Detail & Related papers (2020-04-12T10:21:15Z)