Technical Report: Impact of Position Bias on Language Models in Token Classification
- URL: http://arxiv.org/abs/2304.13567v4
- Date: Thu, 11 Apr 2024 08:10:11 GMT
- Title: Technical Report: Impact of Position Bias on Language Models in Token Classification
- Authors: Mehdi Ben Amor, Michael Granitzer, Jelena Mitrović
- Abstract summary: Downstream tasks such as Named Entity Recognition (NER) or Part-of-Speech (POS) tagging are known to suffer from data imbalance issues.
This paper investigates an often-overlooked issue of encoder models, specifically the position bias of positive examples in token classification tasks.
We show that LMs can suffer from this bias with an average drop ranging from 3% to 9% in their performance.
- Score: 0.6372911857214884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language Models (LMs) have shown state-of-the-art performance on Natural Language Processing (NLP) tasks. Downstream tasks such as Named Entity Recognition (NER) or Part-of-Speech (POS) tagging are known to suffer from data imbalance issues, particularly regarding the ratio of positive to negative examples and class disparities. This paper investigates an often-overlooked issue of encoder models, namely the position bias of positive examples in token classification tasks. For completeness, we also include decoders in the evaluation. We evaluate the impact of position bias using different position embedding techniques, focusing on BERT with Absolute Position Embedding (APE), Relative Position Embedding (RPE), and Rotary Position Embedding (RoPE). To this end, we conduct an in-depth evaluation of the impact of position bias on the performance of LMs when fine-tuned on token classification benchmarks. Our study covers CoNLL03 and OntoNotes 5.0 for NER, and the Universal Dependencies English treebank (UD_en) and TweeBank for POS tagging. We propose an evaluation approach to investigate position bias in transformer models. We show that LMs can suffer from this bias, with an average performance drop ranging from 3% to 9%. To mitigate this effect, we propose two methods, Random Position Shifting and Context Perturbation, which we apply to batches during the training process. The results show an improvement of approximately 2% in model performance on CoNLL03, UD_en, and TweeBank.
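The Random Position Shifting method named in the abstract can be sketched as follows. The abstract gives only the idea (shift positions per batch during training), so the offset range, the function name, and the per-sequence application are illustrative assumptions, not details from the paper:

```python
import random

def shifted_position_ids(seq_len, max_positions=512, rng=random):
    """Random Position Shifting (sketch, assumptions noted above):
    instead of always assigning positions 0..seq_len-1, shift the whole
    sequence by a random offset drawn per batch, so positive tokens are
    not tied to the same absolute positions throughout training."""
    # keep every id inside the model's position-embedding table
    max_shift = max_positions - seq_len
    offset = rng.randint(0, max_shift)
    return [offset + i for i in range(seq_len)]
```

During fine-tuning of an APE model these ids would be passed as explicit `position_ids` in place of the default 0-based ones; at inference the default positions are used unchanged.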
Related papers
- Eliminating Position Bias of Language Models: A Mechanistic Approach [119.34143323054143]
Position bias has proven to be a prevalent issue of modern language models (LMs)
Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings.
By eliminating position bias, models achieve better performance and reliability in downstream tasks, including LM-as-a-judge, retrieval-augmented QA, molecule generation, and math reasoning.
arXiv Detail & Related papers (2024-07-01T09:06:57Z)
- On the Noise Robustness of In-Context Learning for Text Generation [41.59602454113563]
In this work, we show that, on text generation tasks, noisy annotations significantly hurt the performance of in-context learning.
To circumvent the issue, we propose a simple and effective approach called Local Perplexity Ranking (LPR).
LPR replaces the "noisy" candidates with their nearest neighbors that are more likely to be clean.
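The LPR idea summarized above can be sketched roughly as follows. The data layout, the neighborhood size, and the use of per-candidate perplexity scores are assumptions for illustration only; the blurb states merely that noisy candidates are replaced by nearby, likely-clean ones:

```python
from math import dist

def local_perplexity_ranking(candidates, k=2):
    """Sketch of Local Perplexity Ranking (LPR) under the assumptions
    noted above. Each candidate is (embedding_vector, perplexity); a
    candidate is replaced by the lowest-perplexity member of its local
    neighborhood (itself plus its k nearest neighbors), on the premise
    that noisy annotations tend to score higher perplexity."""
    cleaned = []
    for i, (vec, _) in enumerate(candidates):
        # rank the other candidates by distance in embedding space
        others = sorted(
            (j for j in range(len(candidates)) if j != i),
            key=lambda j: dist(vec, candidates[j][0]),
        )
        neighborhood = [i] + others[:k]
        # keep the neighborhood member with the lowest perplexity
        best = min(neighborhood, key=lambda j: candidates[j][1])
        cleaned.append(candidates[best])
    return cleaned
```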
arXiv Detail & Related papers (2024-05-27T15:22:58Z)
- InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks [10.124267937114611]
In this paper, we introduce InfFeed, which uses influence functions to compute the influential instances for a target instance.
Using this approach, InfFeed outperforms the state-of-the-art baselines by a macro F1-score margin of up to almost 4% for hate speech classification.
We also show that manually re-annotating only those silver annotated data points in the extension set that have a negative influence can immensely improve the model performance.
arXiv Detail & Related papers (2024-02-22T16:59:09Z)
- The Curious Case of Absolute Position Embeddings [65.13827063579728]
Transformer language models encode the notion of word order using positional information.
In natural language, it is not absolute position that matters, but relative position, and the extent to which APEs can capture this type of information has not been investigated.
We observe that models trained with APE over-rely on positional information, to the point that they break down when subjected to sentences with shifted position information.
arXiv Detail & Related papers (2022-10-23T00:00:04Z)
- A Novel Dataset for Evaluating and Alleviating Domain Shift for Human Detection in Agricultural Fields [59.035813796601055]
We evaluate the impact of domain shift on human detection models trained on well known object detection datasets when deployed on data outside the distribution of the training set.
We introduce the OpenDR Humans in Field dataset, collected in the context of agricultural robotics applications, using the Robotti platform.
arXiv Detail & Related papers (2022-09-27T07:04:28Z)
- Binary Classification with Positive Labeling Sources [71.37692084951355]
We propose WEAPO, a simple yet competitive WS method for producing training labels without negative labeling sources.
We show WEAPO achieves the highest averaged performance on 10 benchmark datasets.
arXiv Detail & Related papers (2022-08-02T19:32:08Z)
- Meta-Mining Discriminative Samples for Kinship Verification [95.26341773545528]
Kinship verification databases are inherently unbalanced.
We propose a Discriminative Sample Meta-Mining (DSMM) approach in this paper.
Experimental results on the widely used KinFaceW-I, KinFaceW-II, TSKinFace, and Cornell Kinship datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2021-03-28T11:47:07Z)
- Mitigating the Position Bias of Transformer Models in Passage Re-Ranking [12.526786110360622]
Supervised machine learning models and their evaluation depend strongly on the quality of the underlying dataset.
We observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking.
We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on a biased and debiased dataset.
arXiv Detail & Related papers (2021-01-18T10:38:03Z)
- LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region.
arXiv Detail & Related papers (2020-10-06T16:42:51Z)
- Meta-Learning for One-Class Classification with Few Examples using Order-Equivariant Network [1.08890978642722]
This paper presents a framework for few-shots One-Class Classification (OCC) at test-time.
We consider a set of one-class classification objective-tasks, with only a small set of positive examples available for each task.
We propose an approach using order-equivariant networks to learn a 'meta' binary-classifier.
arXiv Detail & Related papers (2020-07-08T22:33:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.