Enhancing Few-shot NER with Prompt Ordering based Data Augmentation
- URL: http://arxiv.org/abs/2305.11791v1
- Date: Fri, 19 May 2023 16:25:43 GMT
- Title: Enhancing Few-shot NER with Prompt Ordering based Data Augmentation
- Authors: Huiming Wang, Liying Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing
- Abstract summary: We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
- Score: 59.69108119752584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, data augmentation (DA) methods have been proven to be effective for
pre-trained language models (PLMs) in low-resource settings, including few-shot
named entity recognition (NER). However, conventional NER DA methods are mostly
aimed at sequence labeling models, i.e., token-level classification, and few
are compatible with unified autoregressive generation frameworks, which can
handle a wider range of NER tasks, such as nested NER. Furthermore, these
generation frameworks have a strong assumption that the entities will appear in
the target sequence with the same left-to-right order as the source sequence.
In this paper, we claim that there is no need to keep this strict order, and
more diversified but reasonable target entity sequences can be provided during
the training stage as a novel DA method. Nevertheless, a naive mixture of
augmented data can confuse the model since one source sequence will then be
paired with different target sequences. Therefore, we propose a simple but
effective Prompt Ordering based Data Augmentation (PODA) method to improve the
training of unified autoregressive generation frameworks under few-shot NER
scenarios. Experimental results on three public NER datasets and further
analyses demonstrate the effectiveness of our approach.
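To make the idea concrete, here is a minimal sketch of prompt-ordering based augmentation (the prompt and target templates below are our own assumptions, not the paper's exact formats): each permutation of the gold entity list yields one target sequence, and an ordering prompt is prepended to the source so that a single input is never paired with conflicting outputs.

```python
from itertools import permutations

def build_augmented_pairs(sentence, entities, max_orders=3):
    """Create (source, target) training pairs with permuted entity orders.

    `entities` is a list of (mention, type) tuples. Each permutation of
    the entity list yields one target sequence; an ordering prompt is
    prepended to the source so different targets stay distinguishable.
    """
    pairs = []
    for order in list(permutations(entities))[:max_orders]:
        # Hypothetical prompt format: list the mentions in target order.
        prompt = "order: " + " ; ".join(m for m, _ in order)
        source = f"{prompt} | {sentence}"
        # Hypothetical target format: "mention is a TYPE" clauses.
        target = " ; ".join(f"{m} is a {t}" for m, t in order)
        pairs.append((source, target))
    return pairs

if __name__ == "__main__":
    pairs = build_augmented_pairs(
        "Barack Obama visited Paris .",
        [("Barack Obama", "PER"), ("Paris", "LOC")],
    )
    for src, tgt in pairs:
        print(src, "->", tgt)
```

All augmented pairs would then be mixed into the few-shot training set; the ordering prompt is what keeps each pair self-consistent despite the shared source sentence.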
Related papers
- Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data.
However, their performance deteriorates significantly when handling out-of-distribution (OoD) data.
We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
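The one-line summary above leaves the mechanism open, so the toy sketch below is only one plausible reading, with stand-in Gaussian "generators": OoD-like samples are synthesized by convexly mixing draws from generators trained on different domains.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for generative models trained on two different domains
# (here just Gaussians with different means; purely illustrative).
def sample_domain_a(n):
    return rng.normal(loc=-2.0, scale=1.0, size=(n, 2))

def sample_domain_b(n):
    return rng.normal(loc=+2.0, scale=1.0, size=(n, 2))

def generative_interpolation(n):
    """Synthesize samples between two domain generators by convex mixing."""
    xa, xb = sample_domain_a(n), sample_domain_b(n)
    lam = rng.uniform(0.0, 1.0, size=(n, 1))  # per-sample mixing weight
    return lam * xa + (1.0 - lam) * xb

print(generative_interpolation(5))
```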
- Sequential Ensembling for Semantic Segmentation [4.030520171276982]
We benchmark the popular ensembling approach of combining predictions of multiple, independently-trained, state-of-the-art models.
We propose a novel boosting-inspired method that sequentially ensembles networks and significantly outperforms the naive ensemble baseline.
arXiv Detail & Related papers (2022-10-08T22:13:59Z)
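As a schematic illustration of the boosting-inspired idea above (our own stub construction, not the authors' architecture): each stage receives the running ensemble prediction and contributes a correction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stub per-pixel segmentation "networks": each maps (image, current
# ensemble logits) to new logits. Real members would be trained CNNs.
def make_stub_net():
    w = rng.normal(size=())
    def net(image, prev_logits):
        return prev_logits + w * image  # toy additive correction
    return net

def sequential_ensemble(nets, image):
    """Fold networks in sequence, each refining the running prediction."""
    logits = np.zeros_like(image)
    for net in nets:
        logits = net(image, logits)
    return logits

image = rng.normal(size=(4, 4))          # toy single-channel "image"
nets = [make_stub_net() for _ in range(3)]
print(sequential_ensemble(nets, image))
```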
- Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition [32.92597650149752]
We propose a novel few-shot fine-tuning framework for NER, FFF-NER.
Specifically, we introduce three new types of tokens, "is-entity", "which-type" and bracket, so that NER fine-tuning can be formulated as (masked) token prediction or generation.
We observe significant improvements over existing fine-tuning strategies, including sequence labeling, prototype meta-learning, and prompt-based approaches.
arXiv Detail & Related papers (2022-05-24T05:36:13Z)
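The summary above names the new token types but not their templates, so the surface forms in this sketch are guesses meant only to show the reformulation of span classification as (masked) token prediction.

```python
MASK = "[MASK]"

def fff_ner_example(tokens, span, entity_type=None):
    """Recast a candidate span as a (masked) token-prediction example.

    `span` is (start, end) over `tokens`, end exclusive. Bracket tokens
    delimit the span; "is-entity" and "which-type" slots are predicted.
    """
    s, e = span
    marked = tokens[:s] + ["["] + tokens[s:e] + ["]"] + tokens[e:]
    source = " ".join(marked) + f" is-entity: {MASK} which-type: {MASK}"
    target = ("yes", entity_type) if entity_type else ("no", "none")
    return source, target

src, tgt = fff_ner_example(
    ["Barack", "Obama", "visited", "Paris", "."], (3, 4), "LOC"
)
print(src)
print(tgt)
```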
- Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential Recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data-sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for Sequential Recommendation (CoSeRec).
arXiv Detail & Related papers (2021-08-14T07:15:25Z)
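A minimal sketch of the augmentation side of the idea above, with generic crop/mask/reorder operators assumed by us; the paper's "robust" operators additionally exploit item-correlation information, which is omitted here. Two independently augmented views of the same interaction sequence would form a positive pair for the contrastive objective.

```python
import random

random.seed(0)
MASK_ID = 0  # assumed placeholder item id for masked positions

def crop(seq, ratio=0.6):
    n = max(1, int(len(seq) * ratio))
    start = random.randrange(len(seq) - n + 1)
    return seq[start:start + n]

def mask(seq, ratio=0.3):
    return [MASK_ID if random.random() < ratio else x for x in seq]

def reorder(seq, ratio=0.3):
    n = max(2, int(len(seq) * ratio))
    start = random.randrange(len(seq) - n + 1)
    sub = seq[start:start + n]
    random.shuffle(sub)
    return seq[:start] + sub + seq[start + n:]

def two_views(seq):
    """Two independently augmented views of one user sequence."""
    ops = [crop, mask, reorder]
    return random.choice(ops)(list(seq)), random.choice(ops)(list(seq))

print(two_views([3, 7, 7, 1, 9, 4, 2]))
```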
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- An EM Approach to Non-autoregressive Conditional Sequence Generation [49.11858479436565]
Autoregressive (AR) models have been the dominating approach to conditional sequence generation.
Non-autoregressive (NAR) models have been recently proposed to reduce the latency by generating all output tokens in parallel.
This paper proposes a new approach that jointly optimizes both AR and NAR models in a unified Expectation-Maximization framework.
arXiv Detail & Related papers (2020-06-29T20:58:57Z)
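The summary above names the framework but not its mechanics, so this skeleton is a schematic reading with stub models: an E-step-like phase where the AR teacher re-decodes targets, and an M-step-like phase where the NAR student is refit on them.

```python
# Schematic EM-style loop coupling an AR "teacher" and a NAR "student".
# Both models are stubs; real ones would be trained seq2seq networks.

def ar_decode(ar_model, src):
    # Stub: a real AR model would beam-search a target left-to-right.
    return ar_model.get(src, src.upper())

def train_nar(pairs):
    # Stub: a real NAR model would be fit by gradient descent.
    return dict(pairs)

def em_loop(sources, ar_model, rounds=2):
    nar_model = {}
    for _ in range(rounds):
        # E-step (schematic): the AR model produces fresh targets.
        pairs = [(s, ar_decode(ar_model, s)) for s in sources]
        # M-step (schematic): the NAR model is refit on those targets.
        nar_model = train_nar(pairs)
        # In the paper's setting the AR model would also be updated;
        # that direction is omitted in this stub.
    return nar_model

print(em_loop(["hello", "world"], {"hello": "BONJOUR"}))
```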
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
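A small sketch of the relevance-generation encoding described above; the "Query: ... Document: ... Relevant:" template and the "true"/"false" target words follow our recollection of this line of work and should be treated as an assumption.

```python
def to_seq2seq_example(query, document, relevant):
    """Encode a (query, document) pair for relevance generation.

    The model is trained to generate the single target word "true"
    or "false"; at inference, P("true") is used as the ranking score.
    """
    source = f"Query: {query} Document: {document} Relevant:"
    target = "true" if relevant else "false"
    return source, target

src, tgt = to_seq2seq_example(
    "what causes tides",
    "Tides are caused by the gravitational pull of the moon.",
    True,
)
print(src)
print(tgt)
```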