A Simple Meta-learning Paradigm for Zero-shot Intent Classification with
Mixture Attention Mechanism
- URL: http://arxiv.org/abs/2206.02179v1
- Date: Sun, 5 Jun 2022 13:37:51 GMT
- Title: A Simple Meta-learning Paradigm for Zero-shot Intent Classification with
Mixture Attention Mechanism
- Authors: Han Liu, Siyang Zhao, Xiaotong Zhang, Feng Zhang, Junjie Sun, Hong Yu,
Xianchao Zhang
- Abstract summary: We propose a simple yet effective meta-learning paradigm for zero-shot intent classification.
To learn better semantic representations for utterances, we introduce a new mixture attention mechanism.
To strengthen the transfer ability of the model from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy.
- Score: 17.228616743739412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot intent classification is a vital and challenging task in dialogue
systems, which aims to handle numerous fast-emerging, previously unseen intents
without annotated training data. To obtain satisfactory performance, the
crucial points lie in two aspects: extracting better utterance features and
strengthening the model generalization ability. In this paper, we propose a
simple yet effective meta-learning paradigm for zero-shot intent
classification. To learn better semantic representations for utterances, we
introduce a new mixture attention mechanism, which encodes the pertinent word
occurrence patterns by leveraging the distributional signature attention and
multi-layer perceptron attention simultaneously. To strengthen the transfer
ability of the model from seen classes to unseen classes, we reformulate
zero-shot intent classification with a meta-learning strategy, which trains the
model by simulating multiple zero-shot classification tasks on seen categories,
and promotes the model's generalization ability with a meta-adapting procedure on
simulated unseen categories. Extensive experiments on two real-world dialogue
datasets in different languages show that our model outperforms other strong
baselines on both standard and generalized zero-shot intent classification
tasks.
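The mixture attention described in the abstract can be illustrated with a minimal sketch: per-word weights from a distributional-signature score (derived from corpus word statistics) are blended with weights from a small MLP, and the blend is used to pool word embeddings into an utterance representation. The function names, the inverse-frequency signature, and the convex blending with `lam` are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def distributional_signature_scores(word_freqs, eps=1e-3):
    # Rarer words get higher scores (an inverse-frequency signature),
    # a common choice for distributional-signature attention.
    return eps / (eps + np.asarray(word_freqs, dtype=float))

def mlp_scores(word_embs, W1, b1, w2, b2):
    # One-hidden-layer perceptron producing a scalar score per word.
    h = np.tanh(word_embs @ W1 + b1)
    return h @ w2 + b2

def mixture_attention_pool(word_embs, word_freqs, params, lam=0.5):
    # Blend the two attention distributions, then pool word embeddings
    # into a single utterance vector via the mixture weights.
    a_ds = softmax(distributional_signature_scores(word_freqs))
    a_mlp = softmax(mlp_scores(word_embs, *params))
    alpha = lam * a_ds + (1.0 - lam) * a_mlp  # mixture attention weights
    return alpha @ word_embs                  # weighted sum of word embeddings

# Toy usage: an utterance of 4 words with 5-dim embeddings.
rng = np.random.default_rng(0)
embs = rng.normal(size=(4, 5))
freqs = [0.10, 0.001, 0.05, 0.0005]  # relative corpus frequencies (assumed)
params = (rng.normal(size=(5, 8)), np.zeros(8), rng.normal(size=8), 0.0)
utt_vec = mixture_attention_pool(embs, freqs, params)
print(utt_vec.shape)
```

In a full meta-learning setup along the lines of the abstract, such pooled utterance vectors would be compared against class prototypes within episodes sampled from seen categories, with held-out mimic-unseen categories used for meta-adapting.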
Related papers
- Open-Vocabulary Temporal Action Localization using Multimodal Guidance [67.09635853019005]
OVTAL enables a model to recognize any desired action category in videos without the need to explicitly curate training data for all categories.
This flexibility poses significant challenges, as the model must recognize not only the action categories seen during training but also novel categories specified at inference.
We introduce OVFormer, a novel open-vocabulary framework extending ActionFormer with three key contributions.
arXiv Detail & Related papers (2024-06-21T18:00:05Z)
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- Text2Model: Text-based Model Induction for Zero-shot Image Classification [41.0122522912593]
We address the challenge of building task-agnostic classifiers using only text descriptions.
We train a hypernetwork that receives class descriptions and outputs a multi-class model.
We evaluate this approach in a series of zero-shot classification tasks, for image, point-cloud, and action recognition.
arXiv Detail & Related papers (2022-10-27T05:19:55Z)
- DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning [37.48292304239107]
We present a transformer-based end-to-end ZSL method named DUET.
We develop a cross-modal semantic grounding network to investigate the model's capability of disentangling semantic attributes from the images.
We find that DUET often achieves state-of-the-art performance, that its components are effective, and that its predictions are interpretable.
arXiv Detail & Related papers (2022-07-04T11:12:12Z)
- Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
- GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition [33.23662792742078]
We propose a two-stage deep neural network for zero-shot action recognition.
In the sampling stage, we utilize a generative adversarial network (GAN) trained on action features and word vectors of seen classes.
In the classification stage, we construct a knowledge graph based on the relationship between word vectors of action classes and related objects.
arXiv Detail & Related papers (2021-05-25T09:34:42Z)
- CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets.
arXiv Detail & Related papers (2021-01-18T12:46:24Z)
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method via optimizing and fast adapting the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
- Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning [79.25478727351604]
We explore a simple process: meta-learning over a whole-classification pre-trained model on its evaluation metric.
We observe this simple method achieves competitive performance to state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2020-03-09T20:06:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.