Semi-Supervised Few-Shot Intent Classification and Slot Filling
- URL: http://arxiv.org/abs/2109.08754v1
- Date: Fri, 17 Sep 2021 20:26:23 GMT
- Title: Semi-Supervised Few-Shot Intent Classification and Slot Filling
- Authors: Samyadeep Basu, Karine Ip Kiun Chong, Amr Sharaf, Alex Fischer, Vishal
Rohra, Michael Amoake, Hazem El-Hammamy, Ehi Nosakhare, Vijay Ramani,
Benjamin Han
- Abstract summary: Intent classification (IC) and slot filling (SF) are two fundamental tasks in modern Natural Language Understanding (NLU) systems.
In this work, we investigate how contrastive learning and unsupervised data augmentation methods can benefit these existing supervised meta-learning pipelines.
- Score: 3.602651625446309
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Intent classification (IC) and slot filling (SF) are two fundamental tasks in
modern Natural Language Understanding (NLU) systems. Collecting and annotating
large amounts of data to train deep learning models for such systems is not
scalable. This problem can be addressed by learning from few examples using
fast supervised meta-learning techniques such as prototypical networks. In this
work, we systematically investigate how contrastive learning and unsupervised
data augmentation methods can benefit these existing supervised meta-learning
pipelines for jointly modelled IC/SF tasks. Through extensive experiments
across standard IC/SF benchmarks (SNIPS and ATIS), we show that our proposed
semi-supervised approaches outperform standard supervised meta-learning
methods: contrastive losses in conjunction with prototypical networks
consistently outperform the existing state-of-the-art for both IC and SF tasks,
while data augmentation strategies primarily improve few-shot IC by a
significant margin.
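The abstract's core mechanism, prototypical networks, classifies a query by its distance to per-class "prototypes" (mean embeddings of the few support examples). Below is a minimal sketch of one few-shot episode, assuming utterances are already embedded as fixed-size vectors; the function names, shapes, and toy data are illustrative, not taken from the paper.

```python
import numpy as np

def prototypes(support_emb, support_labels, n_classes):
    """Compute one prototype per class: the mean of that class's support embeddings."""
    return np.stack([support_emb[support_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_emb, protos):
    """Assign each query to the class of its nearest prototype (squared Euclidean)."""
    # dists[i, c] = ||query_i - proto_c||^2, via broadcasting
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Toy 1-shot episode: 2 intent classes in a 2-D embedding space.
support = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 1])
protos = prototypes(support, labels, n_classes=2)

queries = np.array([[1.0, 0.5], [9.0, 9.5]])
print(classify(queries, protos))  # -> [0 1]
```

In training, the negative squared distances would feed a softmax and cross-entropy loss over episodes; the paper's semi-supervised variants additionally apply contrastive losses and data augmentation on top of this episodic setup.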
Related papers
- Interactive Continual Learning: Fast and Slow Thinking [19.253164551254734]
This paper presents a novel Interactive Continual Learning framework, enabled by collaborative interactions among models of various sizes.
To improve memory retrieval in System1, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution.
Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods.
arXiv Detail & Related papers (2024-03-05T03:37:28Z)
- Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning [52.6706505729803]
We introduce Federated Learning (FL) to collaboratively train a decentralized shared model for Intrusion Detection Systems (IDS).
FLEKD enables a more flexible aggregation method than conventional model fusion techniques.
Experiment results show that the proposed approach outperforms local training and traditional FL in terms of both speed and performance.
arXiv Detail & Related papers (2024-01-22T14:16:37Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- NetSentry: A Deep Learning Approach to Detecting Incipient Large-scale Network Attacks [9.194664029847019]
We show how to use Machine Learning for Network Intrusion Detection (NID) in a principled way.
We propose NetSentry, perhaps the first NIDS of its kind, built on Bi-ALSTM, an original ensemble of sequential neural models.
We demonstrate F1 score gains above 33% over the state-of-the-art, as well as up to 3 times higher rates of detecting attacks such as XSS and web bruteforce.
arXiv Detail & Related papers (2022-02-20T17:41:02Z)
- Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture that hybridises Long Short-Term Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the difficulty of finding large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
- Robust Self-Ensembling Network for Hyperspectral Image Classification [38.84831094095329]
We propose a robust self-ensembling network (RSEN) to address this problem.
The proposed RSEN consists of two subnetworks: a base network and an ensemble network.
We show that the proposed algorithm can yield competitive performance compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-08T13:33:14Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for the text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z)
- A Transductive Multi-Head Model for Cross-Domain Few-Shot Learning [72.30054522048553]
We present a new method, Transductive Multi-Head Few-Shot learning (TMHFS), to address the Cross-Domain Few-Shot Learning challenge.
The proposed methods greatly outperform the strong fine-tuning baseline on four different target domains.
arXiv Detail & Related papers (2020-06-08T02:39:59Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- Learning to Classify Intents and Slot Labels Given a Handful of Examples [22.783338548129983]
Intent classification (IC) and slot filling (SF) are core components in most goal-oriented dialogue systems.
We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra low resource scenarios.
We show that two popular few-shot learning algorithms, model-agnostic meta-learning (MAML) and prototypical networks, outperform a fine-tuning baseline on this benchmark.
arXiv Detail & Related papers (2020-04-22T18:54:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.