A Comparative Study of Pre-trained Encoders for Low-Resource Named
Entity Recognition
- URL: http://arxiv.org/abs/2204.04980v1
- Date: Mon, 11 Apr 2022 09:48:26 GMT
- Title: A Comparative Study of Pre-trained Encoders for Low-Resource Named
Entity Recognition
- Authors: Yuxuan Chen and Jonas Mikkelsen and Arne Binder and Christoph Alt and
Leonhard Hennig
- Abstract summary: We introduce an encoder evaluation framework, and use it to compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER.
We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning.
- Score: 10.0731894715001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (PLMs) are effective components of few-shot named
entity recognition (NER) approaches when augmented with continued pre-training
on task-specific out-of-domain data or fine-tuning on in-domain data. However,
their performance in low-resource scenarios, where such data is not available,
remains an open question. We introduce an encoder evaluation framework, and use
it to systematically compare the performance of state-of-the-art pre-trained
representations on the task of low-resource NER. We analyze a wide range of
encoders pre-trained with different strategies, model architectures,
intermediate-task fine-tuning, and contrastive learning. Our experimental
results across ten benchmark NER datasets in English and German show that
encoder performance varies significantly, suggesting that the choice of encoder
for a specific low-resource scenario needs to be carefully evaluated.
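The comparison described in the abstract can be illustrated with a minimal sketch: score each encoder's predicted tags against gold tags and rank the encoders. Everything here is an assumption for illustration, not the authors' framework. The abstract does not name a metric, so entity-level F1 over BIO tags is assumed, and the function names and the encoder labels `encoder_a` and `encoder_b` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (stray I- tags are ignored)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact-match entity spans (assumed metric)."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def rank_encoders(
    gold: List[List[str]], predictions: Dict[str, List[List[str]]]
) -> List[Tuple[str, float]]:
    """Sort encoders by F1, best first. Keys of `predictions` are hypothetical names."""
    scores = {name: entity_f1(gold, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: one sentence, two hypothetical encoders.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
predictions = {
    "encoder_a": [["B-PER", "I-PER", "O", "B-LOC"]],  # both entities correct
    "encoder_b": [["B-PER", "O", "O", "B-LOC"]],      # PER boundary is wrong
}
ranking = rank_encoders(gold, predictions)
print(ranking)
```

In a real low-resource setup the predictions would come from models trained on a small labeled sample per dataset, and the ranking would be repeated per dataset, since the abstract reports that encoder performance varies significantly across scenarios.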
Related papers
- Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian [75.94354349994576]
This paper explores the feasibility of employing smaller, domain-specific encoder LMs alongside prompting techniques to enhance performance in specialized contexts.
Our study concentrates on the Italian bureaucratic and legal language, experimenting with both general-purpose and further pre-trained encoder-only models.
The results indicate that while further pre-trained models may show diminished robustness in general knowledge, they exhibit superior adaptability for domain-specific tasks, even in a zero-shot setting.
arXiv Detail & Related papers (2024-07-30T08:50:16Z)
- Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances [55.37242480995541]
We propose to denoise noisy NER data with guidance from a small set of clean instances.
Along with the main NER model we train a discriminator model and use its outputs to recalibrate the sample weights.
Results on public crowdsourcing and distant supervision datasets show that the proposed method can consistently improve performance with a small guidance set.
arXiv Detail & Related papers (2023-10-25T17:23:37Z)
- Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation [7.056222499095849]
Beam search seeks the transcript with the greatest likelihood computed using the predicted distribution.
We show that recently proposed Self-Supervised Learning (SSL)-based ASR models tend to yield exceptionally confident predictions.
We propose a decoding procedure that improves the performance of fine-tuned ASR models.
arXiv Detail & Related papers (2022-12-27T06:42:26Z)
- Explaining Cross-Domain Recognition with Interpretable Deep Classifier [100.63114424262234]
Interpretable Deep Classifier (IDC) learns the nearest source samples of a target sample as evidence upon which the classifier makes the decision.
Our IDC leads to a more explainable model with almost no accuracy degradation and effectively calibrates classification for optimum reject options.
arXiv Detail & Related papers (2022-11-15T15:58:56Z)
- SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data [78.21197488065177]
Recent success in fine-tuning large models, that are pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes a new task-agnostic framework, SynBench, to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z)
- NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging [40.57720568571513]
We construct a massive NER corpus with a relatively high quality, and we pre-train a NER-BERT model based on the created dataset.
Experimental results show that our pre-trained model can significantly outperform BERT as well as other strong baselines in low-resource scenarios.
arXiv Detail & Related papers (2021-12-01T10:45:02Z)
- Learning from Language Description: Low-shot Named Entity Recognition via Decomposed Framework [23.501276952950366]
We propose a novel NER framework, namely SpanNER, which learns from natural language supervision and enables the identification of never-seen entity classes.
We perform extensive experiments on 5 benchmark datasets and evaluate the proposed method in the few-shot learning, domain transfer and zero-shot learning settings.
The experimental results show that the proposed method brings average improvements of 10%, 23% and 26% over the best baselines in the few-shot learning, domain transfer and zero-shot learning settings, respectively.
arXiv Detail & Related papers (2021-09-11T19:52:09Z)
- Boosting the Generalization Capability in Cross-Domain Few-shot Learning via Noise-enhanced Supervised Autoencoder [23.860842627883187]
We teach the model to capture broader variations of the feature distributions with a novel noise-enhanced supervised autoencoder (NSAE).
NSAE trains the model by jointly reconstructing inputs and predicting the labels of inputs as well as their reconstructed pairs.
We also take advantage of NSAE structure and propose a two-step fine-tuning procedure that achieves better adaption and improves classification performance in the target domain.
arXiv Detail & Related papers (2021-08-11T04:45:56Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.