Transformer-based Named Entity Recognition with Combined Data Representation
- URL: http://arxiv.org/abs/2406.17474v1
- Date: Tue, 25 Jun 2024 11:41:16 GMT
- Title: Transformer-based Named Entity Recognition with Combined Data Representation
- Authors: Michał Marcińczuk
- Abstract summary: The study investigates three data representation strategies: single, merged, and context, which use, respectively, one sentence per vector, multiple sentences per vector, and sentences joined with attention to the surrounding context.
Analysis shows that a model trained with only one strategy may perform poorly when evaluated on the other data representations.
To address this limitation, the study proposes a combined training procedure that utilizes all three strategies to improve model stability and adaptability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study examines transformer-based models and their effectiveness in named entity recognition tasks. The study investigates three data representation strategies: single, merged, and context, which use, respectively, one sentence per vector, multiple sentences per vector, and sentences joined with attention to the surrounding context. Analysis shows that training a model with a single strategy may lead to poor performance on the other data representations. To address this limitation, the study proposes a combined training procedure that utilizes all three strategies to improve model stability and adaptability. The results of this approach are presented and discussed for four languages (English, Polish, Czech, and German) across various datasets, demonstrating the effectiveness of the combined strategy.
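The abstract names the three representations but gives no implementation details, so the following is only a minimal sketch of how the single, merged, and context inputs and the combined mixing step might be built. The function names, the token-budget packing rule, the context window size, and the random mixing scheme are all illustrative assumptions, not the paper's actual code.

```python
import random
from typing import List, Tuple

Sentence = List[str]  # a sentence as a list of tokens

def single(doc: List[Sentence]) -> List[List[str]]:
    # Single strategy: one sentence per training example.
    return [list(sent) for sent in doc]

def merged(doc: List[Sentence], max_len: int = 128) -> List[List[str]]:
    # Merged strategy: pack consecutive sentences into one example
    # until a token budget is reached (the packing rule is an assumption).
    examples, current = [], []
    for sent in doc:
        if current and len(current) + len(sent) > max_len:
            examples.append(current)
            current = []
        current = current + list(sent)
    if current:
        examples.append(current)
    return examples

def context(doc: List[Sentence], window: int = 1) -> List[Tuple[List[str], Tuple[int, int]]]:
    # Context strategy: target sentence plus neighbouring sentences;
    # the returned span marks the target so only its tokens would be
    # labelled (a hypothetical detail, not specified in the abstract).
    examples = []
    for i, sent in enumerate(doc):
        left = [t for s in doc[max(0, i - window):i] for t in s]
        right = [t for s in doc[i + 1:i + 1 + window] for t in s]
        examples.append((left + list(sent) + right,
                         (len(left), len(left) + len(sent))))
    return examples

def combined(doc: List[Sentence], rng=random) -> List[List[str]]:
    # Combined procedure: draw one of the three representations at
    # random so the model sees all of them during training.
    choice = rng.choice(["single", "merged", "context"])
    if choice == "single":
        return single(doc)
    if choice == "merged":
        return merged(doc)
    return [tokens for tokens, _ in context(doc)]
```

Read this way, the abstract's claim is that a model fed the mixed stream should degrade less when inference-time input arrives in a representation other than the one it was trained on.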
Related papers
- Investigating the Impact of Data Selection Strategies on Language Model Performance [1.0013553984400492]
This study explores the effects of different data selection methods and feature types on model performance.
We evaluate whether selecting data subsets can influence downstream tasks, whether n-gram features improve alignment with target distributions, and whether embedding-based neural features provide complementary benefits.
arXiv Detail & Related papers (2025-01-07T14:38:49Z) - Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment [0.23020018305241333]
This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts.
The scope of the study encompasses enhancing model performance through innovative training techniques and data augmentation strategies.
arXiv Detail & Related papers (2024-07-01T20:25:20Z) - Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL).
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples (see the generic TopK sketch after this list).
arXiv Detail & Related papers (2024-01-22T16:25:27Z) - An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Exploiting Image Translations via Ensemble Self-Supervised Learning for Unsupervised Domain Adaptation [0.0]
We introduce an unsupervised domain adaptation (UDA) strategy that combines multiple image translations, ensemble learning, and self-supervised learning in one coherent approach.
We focus on one of the standard tasks of UDA in which a semantic segmentation model is trained on labeled synthetic data together with unlabeled real-world data.
arXiv Detail & Related papers (2021-07-13T16:43:02Z) - Federated Learning under Importance Sampling [49.17137296715029]
We study the effect of importance sampling and devise schemes for sampling agents and data non-uniformly guided by a performance measure.
We find that in schemes involving sampling without replacement, the performance of the resulting architecture is controlled by two factors related to data variability at each agent.
arXiv Detail & Related papers (2020-12-14T10:08:55Z) - Adversarial Augmentation Policy Search for Domain and Cross-Lingual Generalization in Reading Comprehension [96.62963688510035]
Reading comprehension models often overfit to nuances of training datasets and fail at adversarial evaluation.
We present several effective adversaries and automated data augmentation policy search methods with the goal of making reading comprehension models more robust to adversarial evaluation.
arXiv Detail & Related papers (2020-04-13T17:20:08Z) - Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
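The TopK + ConE entry above names the method and its guiding assumption but no implementation details. As referenced there, the sketch below shows only a generic nearest-neighbour TopK retrieval over demonstration embeddings; the embedding inputs are stand-ins, and the ConE component is omitted because the summary says nothing about how it is computed.

```python
import numpy as np

def topk_demonstrations(test_emb: np.ndarray,
                        pool_embs: np.ndarray,
                        k: int = 4) -> np.ndarray:
    # Indices of the k candidate demonstrations whose embeddings are
    # most cosine-similar to the test input (generic TopK retrieval,
    # not the paper's exact procedure).
    pool = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    query = test_emb / np.linalg.norm(test_emb)
    return np.argsort(-(pool @ query))[:k]
```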
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.