Knowledge Guided Metric Learning for Few-Shot Text Classification
- URL: http://arxiv.org/abs/2004.01907v1
- Date: Sat, 4 Apr 2020 10:56:26 GMT
- Title: Knowledge Guided Metric Learning for Few-Shot Text Classification
- Authors: Dianbo Sui, Yubo Chen, Binjie Mao, Delai Qiu, Kang Liu and Jun Zhao
- Abstract summary: Inspired by human intelligence, we propose to introduce external knowledge into few-shot learning, imitating the human ability to leverage knowledge from relevant tasks.
We demonstrate that our method outperforms the state-of-the-art few-shot text classification models.
- Score: 22.832467388279873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The training of deep-learning-based text classification models relies heavily
on huge amounts of annotated data, which are difficult to obtain. When labeled
data is scarce, models tend to struggle to achieve satisfactory performance.
Human beings, however, can distinguish new categories very efficiently from
only a few examples, largely because they can leverage knowledge obtained from
relevant tasks. Inspired by this, we propose to introduce external knowledge
into few-shot learning to imitate this human ability. To this end, we
investigate a novel parameter generator network that uses the external
knowledge to generate the parameters of a relation network. Equipped with these
generated parameters, metrics can be transferred among tasks, so that similar
tasks use similar metrics while different tasks use different metrics. Through
experiments, we demonstrate that our method outperforms state-of-the-art
few-shot text classification models.
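The core mechanism, a generator that maps a task's knowledge representation to the weights of a relation (metric) network, can be sketched as a small hypernetwork. The PyTorch snippet below is a minimal illustration, not the authors' implementation: the knowledge embedding, the one-layer relation network, and all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class KnowledgeGuidedRelationNet(nn.Module):
    """Sketch: a parameter generator maps a task-level knowledge embedding
    to the weights of a relation layer, so tasks with similar knowledge
    get similar metrics."""

    def __init__(self, know_dim: int = 50, feat_dim: int = 128):
        super().__init__()
        # Hypernetwork: knowledge embedding -> weight + bias of the relation layer.
        self.generator = nn.Linear(know_dim, 2 * feat_dim + 1)

    def forward(self, knowledge: torch.Tensor,
                prototypes: torch.Tensor,
                queries: torch.Tensor) -> torch.Tensor:
        # knowledge: (know_dim,), prototypes: (N, feat_dim), queries: (Q, feat_dim)
        params = self.generator(knowledge)      # (2*feat_dim + 1,)
        w, b = params[:-1], params[-1]          # generated metric parameters
        # Pair every query with every class prototype.
        pairs = torch.cat([
            queries.unsqueeze(1).expand(-1, prototypes.size(0), -1),
            prototypes.unsqueeze(0).expand(queries.size(0), -1, -1),
        ], dim=-1)                               # (Q, N, 2*feat_dim)
        return torch.sigmoid(pairs @ w + b)      # (Q, N) relation scores

# Toy usage: a 5-way task with 3 queries.
net = KnowledgeGuidedRelationNet()
scores = net(torch.randn(50), torch.randn(5, 128), torch.randn(3, 128))
print(scores.shape)  # torch.Size([3, 5])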
Related papers
- In-context Learning in Presence of Spurious Correlations [8.055478206164105]
We study the possibility of training an in-context learner for classification tasks involving spurious features.
We find that the conventional approach of training in-context learners is susceptible to spurious features.
We propose a novel technique to train such a learner for a given classification task.
arXiv Detail & Related papers (2024-10-04T04:26:36Z)
- CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition [16.987008461171065]
We explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition.
Our method adds intermediate layers called adapters for each task and efficiently distills knowledge from the previous model while learning the current task.
We attain state-of-the-art performance on English, Italian and Russian scripts, whilst adding only a few parameters per task.
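A common way to realize this adapter-plus-distillation recipe is a small bottleneck module with a residual connection, trained alongside a soft-target KL loss against the frozen previous-task model. The sketch below is illustrative only; the layer sizes, temperature, and names are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """Bottleneck adapter: a small per-task module with a residual connection."""
    def __init__(self, dim: int = 256, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(F.relu(self.down(h)))  # residual keeps base behavior

def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
    """Soft-target KL divergence against the previous model's outputs."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T

# Toy usage: adapt a frozen 256-d feature and distill 10-way logits.
adapted = Adapter()(torch.randn(4, 256))
loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10))
print(adapted.shape, loss.item())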
arXiv Detail & Related papers (2023-03-16T14:27:45Z)
- Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style-specific attributes from the PERSONALITY-CAPTIONS dataset.
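Updating only a fraction of a percent of the parameters typically means freezing the base model and training lightweight add-on modules. The sketch below shows that general pattern with a hypothetical base network and adapter, then counts the trainable fraction; it is not tied to the paper's actual architecture.

```python
import torch.nn as nn

# Hypothetical stand-ins: in practice the base would be a pretrained generator.
base = nn.Sequential(nn.Embedding(10000, 512), nn.Linear(512, 512), nn.Linear(512, 10000))
adapter = nn.Sequential(nn.Linear(512, 16), nn.ReLU(), nn.Linear(16, 512))

for p in base.parameters():
    p.requires_grad = False  # freeze the base model entirely

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in base.parameters())
print(f"trainable fraction: {100 * trainable / total:.2f}%")  # well under 1%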
arXiv Detail & Related papers (2022-10-07T00:09:22Z)
- Active Learning of Ordinal Embeddings: A User Study on Football Data [4.856635699699126]
Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function.
This work uses deep metric learning to learn these user-defined similarity functions from few annotations for a large football trajectory dataset.
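Learning a user-defined similarity from few annotations is commonly done with a triplet objective over an embedding network. The sketch below uses PyTorch's built-in triplet loss on toy trajectory features; the network shape and data are placeholders, not the study's setup.

```python
import torch
import torch.nn as nn

# Embedding network mapping trajectory features into a metric space.
embed = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
loss_fn = nn.TripletMarginLoss(margin=1.0)

# Each annotation "anchor is closer to positive than to negative" is one triplet.
anchor, positive, negative = (torch.randn(8, 64) for _ in range(3))
loss = loss_fn(embed(anchor), embed(positive), embed(negative))
loss.backward()  # gradients flow into the embedding network
print(loss.item())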
arXiv Detail & Related papers (2022-07-26T07:55:23Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
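A knowledge-sampling strategy of this kind can be as simple as drawing triples from the graph and verbalizing them into synthetic training text. The snippet below is a schematic with a made-up mini-graph and templates; real pipelines draw from large KGs such as ConceptNet with more careful verbalization.

```python
import random

# Hypothetical mini knowledge graph of (head, relation, tail) triples.
triples = [
    ("fork", "UsedFor", "eating"),
    ("oven", "AtLocation", "kitchen"),
    ("rain", "Causes", "wet streets"),
]
templates = {
    "UsedFor": "A {h} is used for {t}.",
    "AtLocation": "You are likely to find a {h} in the {t}.",
    "Causes": "{h} can cause {t}.",
}

def sample_synthetic(n: int) -> list[str]:
    """Uniformly sample n triples and verbalize them into sentences."""
    picks = random.choices(triples, k=n)
    return [templates[r].format(h=h, t=t) for h, r, t in picks]

print(sample_synthetic(2))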
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine.
We critically examine the most notable pretext tasks to extract features from image data.
We study the performance of various self-supervised techniques keeping all other parameters uniform.
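A typical pretext task generates labels mechanically from the data itself; rotation prediction is a standard example of the genre, though not necessarily one the paper evaluates. The sketch below builds rotated copies of an image batch together with machine-made labels; the tiny classifier is a stand-in for a lightweight backbone.

```python
import torch
import torch.nn as nn

def make_rotation_task(images: torch.Tensor):
    """Pretext data: rotate each image by 0/90/180/270 degrees;
    the rotation index is the machine-generated label."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return rotated, labels

# Toy batch of 8 RGB 32x32 images and a stand-in lightweight classifier.
x, y = make_rotation_task(torch.randn(8, 3, 32, 32))
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 4))
loss = nn.CrossEntropyLoss()(model(x), y)
print(x.shape, loss.item())  # torch.Size([32, 3, 32, 32]) ...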
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
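One simple building block behind such methods is input-gradient feature attribution, which scores how much each input feature drives a prediction; the paper combines this kind of signal with instance attribution, but the gradient part alone looks roughly like the sketch below. The model and data here are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder classifier over 300-d text features (e.g., pooled embeddings).
model = nn.Linear(300, 2)
x = torch.randn(1, 300, requires_grad=True)

# Feature attribution: gradient of the predicted-class logit w.r.t. inputs.
logits = model(x)
logits[0, logits.argmax()].backward()
saliency = (x.grad * x).detach().squeeze()  # gradient-times-input scores
top = saliency.abs().topk(5).indices        # most influential features
print(top.tolist())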
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
- Learning from Few Samples: A Survey [1.4146420810689422]
We study the existing few-shot meta-learning techniques in the computer vision domain based on their method and evaluation metrics.
We provide a taxonomy for the techniques and categorize them as data-augmentation-, embedding-, optimization-, and semantics-based learning for few-shot, one-shot and zero-shot settings.
arXiv Detail & Related papers (2020-07-30T14:28:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.