Discriminative and Generative Transformer-based Models For Situation
Entity Classification
- URL: http://arxiv.org/abs/2109.07434v1
- Date: Wed, 15 Sep 2021 17:07:07 GMT
- Title: Discriminative and Generative Transformer-based Models For Situation
Entity Classification
- Authors: Mehdi Rezaee, Kasra Darvish, Gaoussou Youssouf Kebe, Francis Ferraro
- Abstract summary: We re-examine the situation entity (SE) classification task with varying amounts of available training data.
We exploit a Transformer-based variational autoencoder to encode sentences into a lower-dimensional latent space.
- Score: 8.029049649310211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We re-examine the situation entity (SE) classification task with varying
amounts of available training data. We exploit a Transformer-based variational
autoencoder to encode sentences into a lower-dimensional latent space, which is
used both to generate the text and to learn an SE classifier. Test set and cross-genre
evaluations show that when training data is plentiful, the proposed model can
improve over the previous discriminative state-of-the-art models. Our approach
performs disproportionately better with smaller amounts of training data, but
when faced with extremely small sets (4 instances per label), generative RNN
methods outperform transformers. Our work provides guidance for future efforts
on SE and semantic prediction tasks, and low-label training regimes.
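The abstract describes the architecture only at a high level, so the following is a minimal PyTorch sketch of the stated idea: a Transformer encoder pooled into a Gaussian latent code that feeds both a reconstruction head and an SE classifier. The dimensions, the unigram (bag-of-words) decoder, and all names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformerVAEClassifier(nn.Module):
    """Sketch: a Transformer VAE whose latent code also drives an SE classifier."""

    def __init__(self, vocab_size=10000, d_model=256, latent_dim=64, num_classes=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_mu = nn.Linear(d_model, latent_dim)       # latent mean
        self.to_logvar = nn.Linear(d_model, latent_dim)   # latent log-variance
        self.decoder = nn.Linear(latent_dim, vocab_size)  # toy bag-of-words decoder
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, tokens):
        h = self.encoder(self.embed(tokens)).mean(dim=1)         # pool the sentence
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.decoder(z), self.classifier(z), mu, logvar

def vae_loss(recon_logits, class_logits, tokens, labels, mu, logvar, beta=1.0):
    batch, seq_len = tokens.shape
    vocab = recon_logits.size(-1)
    # Unigram reconstruction: every token in the sentence is predicted from z.
    recon = F.cross_entropy(
        recon_logits.unsqueeze(1).expand(batch, seq_len, vocab).reshape(-1, vocab),
        tokens.reshape(-1))
    clf = F.cross_entropy(class_logits, labels)                  # SE classification
    kl = -0.5 * torch.mean((1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1))
    return recon + clf + beta * kl
```

Because the classifier reads the same latent code the decoder reconstructs from, the ELBO and the classification loss shape one shared latent space, which is what lets a single model both generate text and predict SE labels.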
Related papers
- Practical token pruning for foundation models in few-shot conversational virtual assistant systems [6.986560111427867]
We pretrain a transformer-based sentence embedding model with a contrastive learning objective and use the model's embeddings as features when training intent classification models.
Our approach achieves state-of-the-art results in few-shot scenarios and performs better than other commercial solutions on popular intent classification benchmarks.
arXiv Detail & Related papers (2024-08-21T17:42:17Z)
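The summary above specifies only that the sentence encoder is pretrained contrastively; a standard instantiation of such an objective is in-batch InfoNCE over positive sentence pairs, sketched below (the pairing scheme and temperature are assumptions, not details from the paper).

```python
import torch
import torch.nn.functional as F

def info_nce_loss(emb_a, emb_b, temperature=0.05):
    """In-batch contrastive loss: row i of `emb_a` should match row i of `emb_b`.

    `emb_a` and `emb_b` are (batch, dim) embeddings of two views (e.g.,
    paraphrases or same-intent utterances) of the same underlying sentences.
    """
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    logits = emb_a @ emb_b.t() / temperature     # cosine similarity matrix
    targets = torch.arange(emb_a.size(0), device=emb_a.device)
    return F.cross_entropy(logits, targets)      # diagonal entries are the positives
```

The frozen encoder's embeddings would then serve as fixed features for a lightweight intent classifier, as the summary describes.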
- Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks [3.464871689508835]
We propose a simple extension of the Prototypical Networks for few-shot text classification.
Our main idea is to replace the class prototypes by Gaussians and introduce a regularization term that encourages the examples to be clustered near the appropriate class centroids.
arXiv Detail & Related papers (2022-10-22T05:22:29Z)
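To make the "prototypes as Gaussians" idea concrete, here is a hedged sketch: each class is summarized by a diagonal Gaussian fit on the support set, queries are scored by variance-scaled distance, and a regularizer pulls support examples toward their class centroid. The exact distance and regularizer in the paper may differ.

```python
import torch

def gaussian_proto_logits(support, support_labels, query, num_classes, eps=1e-4):
    """Score queries against class-conditional Gaussians instead of point prototypes.

    Assumes every class appears in the support set (standard episodic setup).
    """
    logits = []
    for c in range(num_classes):
        feats = support[support_labels == c]          # (n_c, dim)
        mu = feats.mean(dim=0)
        var = feats.var(dim=0, unbiased=False) + eps  # diagonal covariance
        logits.append(-(((query - mu) ** 2) / var).sum(dim=-1))
    return torch.stack(logits, dim=-1)                # (n_query, num_classes)

def clustering_regularizer(support, support_labels, num_classes):
    """Encourage support examples to sit near their own class centroid."""
    loss = 0.0
    for c in range(num_classes):
        feats = support[support_labels == c]
        loss = loss + ((feats - feats.mean(dim=0)) ** 2).sum(dim=-1).mean()
    return loss / num_classes
```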
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
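As a rough illustration of self-distillation as a regularizer, the sketch below has a frozen snapshot of the model (the "teacher") constrain its own continued training through a KL term; the objective, temperature, and weighting are assumptions, and the paper applies this during a further pre-training stage rather than plain supervised training.

```python
import torch
import torch.nn.functional as F

def self_distillation_step(student, teacher, inputs, labels, optimizer,
                           alpha=1.0, tau=1.0):
    """One step where a frozen copy of the model regularizes its own training.

    `teacher` is a frozen snapshot of `student` from an earlier round
    (e.g., teacher = copy.deepcopy(student); teacher.eval()).
    """
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    student_logits = student(inputs)
    task_loss = F.cross_entropy(student_logits, labels)
    distill_loss = F.kl_div(                  # match the teacher's soft targets
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean") * tau * tau
    loss = task_loss + alpha * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```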
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) addresses distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
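The summary leaves the loss unspecified beyond "class-aware feature alignment"; one way to realize that at test time, sketched below under loud assumptions, is to pseudo-label each unlabeled test example and pull its features toward the matching source class statistics, so alignment stays class-discriminative.

```python
import torch

def class_aware_alignment_loss(features, logits, source_means, source_vars, eps=1e-4):
    """Hedged sketch of a class-aware alignment objective for test-time adaptation.

    `source_means`/`source_vars`: (num_classes, dim) per-class feature statistics
    precomputed on source data -- an assumption about the setup, not the paper's
    exact formulation.
    """
    pseudo = logits.argmax(dim=-1)        # pseudo-label each test example
    mu = source_means[pseudo]             # (batch, dim)
    var = source_vars[pseudo] + eps
    # Variance-scaled distance to the *matching* class keeps alignment discriminative.
    return (((features - mu) ** 2) / var).sum(dim=-1).mean()
```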
- Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model reports better accuracy scores than a reference system trained with, on average, 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- Few Is Enough: Task-Augmented Active Meta-Learning for Brain Cell Classification [8.998976678920236]
We propose a tAsk-auGmented actIve meta-LEarning (AGILE) method to efficiently adapt deep neural networks to new tasks.
AGILE combines a meta-learning algorithm with a novel task augmentation technique which we use to generate an initial adaptive model.
We show that the proposed task-augmented meta-learning framework can learn to classify new cell types after a single gradient step.
arXiv Detail & Related papers (2020-07-09T18:03:12Z)
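The summary's "classify new cell types after a single gradient step" is the signature of MAML-style meta-learning; the sketch below shows only that adaptation step, since AGILE's task augmentation and active-learning components are not described in the summary.

```python
import copy
import torch
import torch.nn.functional as F

def adapt_one_step(meta_model, support_x, support_y, inner_lr=0.01):
    """Adapt a meta-learned initialization to a new task with one SGD step."""
    model = copy.deepcopy(meta_model)         # leave the initialization intact
    loss = F.cross_entropy(model(support_x), support_y)
    grads = torch.autograd.grad(loss, model.parameters())
    with torch.no_grad():
        for param, grad in zip(model.parameters(), grads):
            param -= inner_lr * grad          # the single gradient step
    return model                              # ready to classify the new cell types
```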
- T-VSE: Transformer-Based Visual Semantic Embedding [5.317624228510748]
We show that transformer-based cross-modal embeddings outperform word-average and RNN-based embeddings by a large margin when trained on a large dataset of e-commerce product image-title pairs.
arXiv Detail & Related papers (2020-05-17T23:40:33Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
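The "full-text format" idea can be illustrated by scoring each candidate answer as the log-likelihood of the complete premise-plus-candidate text under a pretrained LM; using GPT-2's causal log-likelihood here is an illustrative substitute, not the paper's exact scoring method.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def full_text_score(text):
    """Plausibility of a candidate as its total token log-likelihood."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model(ids, labels=ids)                 # loss = mean token NLL
    return -out.loss.item() * (ids.size(1) - 1)  # undo the mean -> total log-prob

def rank_candidates(premise, candidates):
    """Pick the candidate whose full text reads as most plausible."""
    return max(candidates, key=lambda c: full_text_score(f"{premise} {c}"))

# rank_candidates("The man broke his toe because",
#                 ["he dropped a hammer on his foot.",
#                  "he got a hole in his sock."])
```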
- Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims [3.7543966923106438]
We introduce the first adversarially-regularized, transformer-based claim spotter model.
We propose a method to apply adversarial training to transformer models.
We obtain a 4.70-point F1-score improvement over current state-of-the-art models.
arXiv Detail & Related papers (2020-02-18T16:51:05Z)
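Since the perturbation scheme is not given in the summary, the sketch below uses a common gradient-based recipe for transformers (FGM-style perturbation of the input embeddings); the names and the L2 scaling are assumptions.

```python
import torch

def adversarial_training_loss(model, embeddings, labels, loss_fn, epsilon=1.0):
    """Clean loss plus loss on gradient-perturbed input embeddings.

    `model` maps embeddings -> logits, and `embeddings` must have
    requires_grad=True so the perturbation direction can be computed.
    """
    clean_loss = loss_fn(model(embeddings), labels)
    grad, = torch.autograd.grad(clean_loss, embeddings, retain_graph=True)
    delta = epsilon * grad / (grad.norm() + 1e-8)   # step along the loss gradient
    adv_loss = loss_fn(model(embeddings + delta.detach()), labels)
    return clean_loss + adv_loss
```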
This list is automatically generated from the titles and abstracts of the papers on this site.