Adaptive Prototypical Networks with Label Words and Joint Representation
Learning for Few-Shot Relation Classification
- URL: http://arxiv.org/abs/2101.03526v1
- Date: Sun, 10 Jan 2021 11:25:42 GMT
- Authors: Yan Xiao, Yaochu Jin, and Kuangrong Hao
- Abstract summary: This work focuses on few-shot relation classification (FSRC)
We propose an adaptive mixture mechanism to add label words to the representation of the class prototype.
Experiments have been conducted on FewRel under different few-shot (FS) settings.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The relation classification (RC) task is one of the fundamental tasks of
information extraction, aiming to detect the relation between entity pairs in
unstructured natural language text and generate structured data in the form of
entity-relation triples. Although distant supervision methods can effectively
alleviate the lack of training data in supervised learning, they also introduce
noise into the data and still cannot fundamentally solve the long-tail
distribution problem of the training instances. To enable neural networks to
learn new knowledge from a few instances, as humans do, this work focuses on
few-shot relation classification (FSRC), where a classifier should generalize
to new classes that have not been seen in the training set, given only a small
number of samples for each class. To make full use of the existing information
and obtain a better feature representation for each instance, we propose to
encode each class prototype in an adaptive way from two aspects. First, based
on prototypical networks, we propose an adaptive mixture mechanism that adds
label words to the representation of the class prototype; to the best of our
knowledge, this is the first attempt to integrate label information into the
features of the support samples of each class so as to obtain more interactive
class prototypes. Second, to measure the distances between samples of each
category more reasonably, we introduce a loss function for joint representation
learning that encodes each support instance in an adaptive manner. Extensive
experiments have been conducted on FewRel under different few-shot (FS)
settings, and the results show that the proposed adaptive prototypical networks
with label words and joint representation learning not only achieve significant
improvements in accuracy but also increase the generalization ability of
few-shot RC models.
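The two ideas in the abstract can be sketched in a few lines: a class prototype is a blend of the mean support embedding and a label-word embedding, and a query is assigned to the nearest prototype. This is a minimal illustration under assumed interfaces, not the authors' implementation; `alpha` stands in for the adaptive mixture weight, which the paper learns rather than fixes, and the encoders that produce the embeddings are omitted.

```python
# Hypothetical sketch of label-word-augmented prototypes for a prototypical
# network. All names and the fixed `alpha` are illustrative assumptions.
from statistics import fmean

def label_word_prototype(support, label_word, alpha):
    """Blend the mean of the support embeddings with the label-word embedding."""
    dim = len(label_word)
    mean = [fmean(vec[i] for vec in support) for i in range(dim)]
    return [alpha * m + (1.0 - alpha) * w for m, w in zip(mean, label_word)]

def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest under
    squared Euclidean distance, as in standard prototypical networks."""
    def sqdist(proto):
        return sum((q - x) ** 2 for q, x in zip(query, proto))
    return min(range(len(prototypes)), key=lambda i: sqdist(prototypes[i]))
```

In the paper, the mixture weight and the per-instance encodings are trained jointly, so both the prototypes and the distance geometry adapt to each episode; the sketch fixes them only to keep the example self-contained.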
Related papers
- Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification [11.072083437769093]
We propose a novel model named SharpReCL for imbalanced text classification tasks.
Our model even outperforms popular large language models across several datasets.
arXiv Detail & Related papers (2024-05-19T11:33:49Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - Towards Open-World Feature Extrapolation: An Inductive Graph Learning
Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z) - GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot
Action Recognition [33.23662792742078]
We propose a two-stage deep neural network for zero-shot action recognition.
In the sampling stage, we utilize a generative adversarial network (GAN) trained on action features and word vectors of seen classes.
In the classification stage, we construct a knowledge graph based on the relationship between word vectors of action classes and related objects.
arXiv Detail & Related papers (2021-05-25T09:34:42Z) - Recognition and Processing of NATOM [0.0]
This paper shows how to process the NOTAM (Notice to Airmen) data of the field in civil aviation.
The original NOTAM data is a mixture of Chinese and English and is poorly structured.
GloVe word vectors are used to represent the data via a custom mapping vocabulary.
arXiv Detail & Related papers (2021-04-29T10:12:00Z) - Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose a Continually Evolved Classifier (CEC) that employs a graph model to propagate context information between classifiers for adaptation.
arXiv Detail & Related papers (2021-04-07T10:54:51Z) - Distribution Alignment: A Unified Framework for Long-tail Visual
Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - Adaptive Name Entity Recognition under Highly Unbalanced Data [5.575448433529451]
We present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (Bi-LSTM) layer for solving NER tasks.
We introduce an add-on classification model that splits sentences into two sets, Weak and Strong classes, and then design a pair of Bi-LSTM-CRF models to optimize performance on each set.
arXiv Detail & Related papers (2020-03-10T06:56:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.