Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark
- URL: http://arxiv.org/abs/2501.01349v1
- Date: Thu, 02 Jan 2025 17:01:06 GMT
- Title: Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark
- Authors: Liang He, Yougang Chu, Zhen Wu, Jianbing Zhang, Xinyu Dai, Jiajun Chen
- Abstract summary: Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions.
This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context.
We propose a debiased relation extraction benchmark DREB that breaks the pseudo-correlation between entity mentions and relation types through entity replacement.
To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
- Score: 53.876493664396506
- Abstract: Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions. However, biases within datasets can lead models to learn shortcut patterns, resulting in inaccurate assessments and hindering real-world applicability. This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context. We propose a debiased relation extraction benchmark DREB that breaks the pseudo-correlation between entity mentions and relation types through entity replacement. DREB utilizes Bias Evaluator and PPL Evaluator to ensure low bias and high naturalness, providing a reliable and accurate assessment of model generalization in entity bias scenarios. To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques. MixDebias effectively improves model performance on DREB while maintaining performance on the original dataset. Extensive experiments demonstrate the effectiveness and robustness of MixDebias compared to existing methods, highlighting its potential for improving the generalization ability of relation extraction models. We will release DREB and MixDebias publicly.
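To make the entity-replacement idea concrete, here is a minimal sketch of swapping entity mentions for same-typed substitutes while the context and gold relation stay fixed. The example sentence, substitute pool, and `replace_entities` helper are hypothetical illustrations only, not DREB's actual construction pipeline (which additionally applies the Bias Evaluator and PPL Evaluator described in the abstract).

```python
import random

# Hypothetical relation-extraction example: the model should label the relation
# from context, not from memorized entity mentions such as "Barack Obama".
example = {
    "tokens": ["Barack", "Obama", "was", "born", "in", "Honolulu", "."],
    "head": (0, 2),   # token span of the head entity
    "tail": (5, 6),   # token span of the tail entity
    "relation": "place_of_birth",
}

# Pool of same-type substitute mentions (person / location), also hypothetical.
substitutes = {"head": ["Maria Silva", "Kenji Tanaka"], "tail": ["Porto", "Osaka"]}

def replace_entities(ex, subs, rng=random):
    """Swap both entity mentions for randomly chosen same-type substitutes,
    keeping the surrounding context and the gold relation unchanged."""
    tokens = list(ex["tokens"])
    # Replace the later span first so the earlier span's offsets stay valid.
    for key in ("tail", "head"):
        start, end = ex[key]
        tokens[start:end] = rng.choice(subs[key]).split()
    # Note: entity spans are not recomputed in this minimal sketch.
    return {**ex, "tokens": tokens}

debiased = replace_entities(example, substitutes)
print(" ".join(debiased["tokens"]))  # e.g. "Kenji Tanaka was born in Porto ."
```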
Related papers
- Diffusing DeBias: a Recipe for Turning a Bug into a Feature [15.214861534330236]
This paper presents Diffusing DeBias (DDB), a novel approach acting as a plug-in for common methods in model debiasing.
Our approach leverages conditional diffusion models to generate synthetic bias-aligned images, used to train a bias amplifier model.
Our proposed method beats the current state of the art on multiple benchmark datasets by significant margins.
arXiv Detail & Related papers (2025-02-13T18:17:03Z) - RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting [16.633948320306832]
Biases prevalent in manually constructed datasets can introduce spurious correlations between tokens and labels.
Existing debiasing methods often rely on prior knowledge of specific dataset biases.
We propose RAZOR, a novel, unsupervised, and data-focused debiasing approach based on text rewriting for shortcut mitigation.
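RAZOR's unsupervised rewriting procedure is specified in the cited paper; as a rough illustration of the token-label shortcuts it targets, the sketch below simply flags tokens whose occurrences are dominated by a single label. The data and the 0.9 threshold are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical labelled sentences: every "spouse" example happens to contain
# "married", so a model could use that token as a shortcut instead of reading context.
data = [
    (["Alice", "married", "Bob", "in", "2010"], "spouse"),
    (["Carol", "married", "Dan", "last", "year"], "spouse"),
    (["Eve", "founded", "Acme", "Corp"], "founder_of"),
]

token_label = defaultdict(Counter)
for tokens, label in data:
    for tok in set(tokens):
        token_label[tok][label] += 1

# A token is a shortcut candidate when almost all of its occurrences share one label.
for tok, counts in token_label.items():
    total = sum(counts.values())
    top_label, top_count = counts.most_common(1)[0]
    if total >= 2 and top_count / total > 0.9:
        print(f"possible shortcut token: {tok!r} -> {top_label}")
```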
arXiv Detail & Related papers (2024-12-10T17:02:58Z) - MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
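The exact meet-in-the-middle objective is defined in the cited paper; the sketch below only illustrates the general flavor of energy-based test-time adaptation, using the standard energy score E(x) = -logsumexp f(x) and jointly updating a toy model and the test inputs. All components here are hypothetical stand-ins.

```python
import torch

# Toy classifier standing in for a pretrained model (hypothetical).
model = torch.nn.Linear(16, 4)

def energy(logits):
    # Standard energy score: E(x) = -logsumexp(f(x)); lower energy suggests
    # the input looks more "in-distribution" to the model.
    return -torch.logsumexp(logits, dim=-1)

x = torch.randn(8, 16, requires_grad=True)                 # test batch, adapted from the data side
opt_data = torch.optim.SGD([x], lr=0.1)
opt_model = torch.optim.SGD(model.parameters(), lr=1e-3)   # adapted from the model side

for _ in range(5):  # a few adaptation steps at test time
    loss = energy(model(x)).mean()
    opt_data.zero_grad(); opt_model.zero_grad()
    loss.backward()
    opt_data.step(); opt_model.step()
```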
arXiv Detail & Related papers (2024-10-12T07:02:33Z) - Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
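For reference, the core of BPR is the pairwise ranking objective -log sigmoid(x_ui - x_uj) over observed/unobserved item pairs plus L2 regularization; a minimal matrix-factorization sketch (hypothetical sizes and randomly sampled data) is shown below.

```python
import torch

n_users, n_items, dim = 100, 500, 32
P = torch.nn.Embedding(n_users, dim)   # user factors
Q = torch.nn.Embedding(n_items, dim)   # item factors
opt = torch.optim.Adam(list(P.parameters()) + list(Q.parameters()), lr=1e-2)

def bpr_loss(users, pos_items, neg_items, reg=1e-4):
    # BPR optimizes the pairwise ranking x_ui > x_uj for an observed item i
    # and a sampled unobserved item j: loss = -log sigmoid(x_ui - x_uj) + L2.
    pu, qi, qj = P(users), Q(pos_items), Q(neg_items)
    x_ui = (pu * qi).sum(-1)
    x_uj = (pu * qj).sum(-1)
    l2 = pu.pow(2).sum() + qi.pow(2).sum() + qj.pow(2).sum()
    return -torch.nn.functional.logsigmoid(x_ui - x_uj).mean() + reg * l2

users = torch.randint(0, n_users, (64,))
pos = torch.randint(0, n_items, (64,))
neg = torch.randint(0, n_items, (64,))
loss = bpr_loss(users, pos, neg)
opt.zero_grad(); loss.backward(); opt.step()
```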
arXiv Detail & Related papers (2024-09-21T18:39:53Z) - Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
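The bias-experts framework itself is detailed in the cited paper; many auxiliary-model debiasing methods in this family build on a product-of-experts combination of a frozen biased model and the main model. A minimal sketch of that general pattern, with hypothetical toy models, is shown below.

```python
import torch
import torch.nn.functional as F

# Toy main model and frozen biased auxiliary model (both hypothetical).
main_model = torch.nn.Linear(32, 3)
bias_model = torch.nn.Linear(32, 3)
for p in bias_model.parameters():
    p.requires_grad_(False)

def poe_loss(x, y):
    # Product-of-experts debiasing: train the main model on the combined
    # log-probabilities so it is not rewarded for what the biased model
    # already predicts from shortcut features.
    main_logp = F.log_softmax(main_model(x), dim=-1)
    bias_logp = F.log_softmax(bias_model(x), dim=-1)
    return F.nll_loss(F.log_softmax(main_logp + bias_logp, dim=-1), y)

x, y = torch.randn(16, 32), torch.randint(0, 3, (16,))
loss = poe_loss(x, y)
loss.backward()
```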
arXiv Detail & Related papers (2023-12-06T16:15:00Z) - Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
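DCT's bias-aware sampling strategy is described in the cited paper; as a rough illustration of feature-level contrastive learning in general, the snippet below implements a plain InfoNCE loss over feature vectors (all tensors and sizes are hypothetical).

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    # Generic contrastive (InfoNCE) loss on feature vectors: pull each anchor
    # towards its positive sample and push it away from the negatives.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature   # (B, 1)
    neg_sim = anchor @ negatives.t() / temperature                      # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long)              # positive sits at index 0
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 64), torch.randn(8, 64), torch.randn(16, 64))
```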
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
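The cross-sample estimator used by CSAD is specified in the cited paper; the sketch below shows a generic MINE-style neural mutual-information estimator (Donsker-Varadhan bound), which is the standard building block for this kind of dependence measurement. The network sizes and inputs are hypothetical.

```python
import torch

class MINEEstimator(torch.nn.Module):
    """MINE-style estimator of mutual information via the Donsker-Varadhan bound:
    I(X; Z) >= E_p(x,z)[T(x,z)] - log E_p(x)p(z)[exp(T(x,z))]."""
    def __init__(self, dim_x, dim_z, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim_x + dim_z, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        joint = self.net(torch.cat([x, z], dim=-1)).mean()   # paired samples ~ p(x, z)
        z_perm = z[torch.randperm(z.size(0))]                # shuffle pairing -> samples ~ p(x)p(z)
        marg = self.net(torch.cat([x, z_perm], dim=-1))
        log_mean_exp = torch.logsumexp(marg, dim=0) - torch.log(torch.tensor(float(marg.size(0))))
        return joint - log_mean_exp.squeeze()

est = MINEEstimator(8, 8)
mi_estimate = est(torch.randn(128, 8), torch.randn(128, 8))
```

In an adversarial setup of this kind, the estimator is trained to maximize the bound while the feature encoder is trained to minimize the estimated dependence between target and bias features.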
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - Regularizing Models via Pointwise Mutual Information for Named Entity Recognition [17.767466724342064]
We propose a Pointwise Mutual Information (PMI)-based regularization method to enhance generalization ability while also improving in-domain performance.
Our approach debiases highly correlated words and labels in the benchmark datasets.
For long-named and complex-structured entities, our method can still predict them correctly by debiasing conjunctions and special characters.
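The paper's PMI-based regularizer is defined in the cited work; the snippet below only computes the standard PMI statistic, PMI(w, y) = log [p(w, y) / (p(w) p(y))], on hypothetical token-label counts to show how word-label bias can be quantified.

```python
import math
from collections import Counter

# Hypothetical token-label co-occurrence counts from a NER training set.
pair_counts = Counter({("Street", "LOC"): 90, ("Street", "ORG"): 10, ("Inc.", "ORG"): 80})
total = sum(pair_counts.values())
word_counts, label_counts = Counter(), Counter()
for (w, y), c in pair_counts.items():
    word_counts[w] += c
    label_counts[y] += c

def pmi(word, label):
    # PMI(w, y) = log [ p(w, y) / (p(w) * p(y)) ]; large values indicate a word
    # that co-occurs with a label far more often than chance, i.e. a potential bias.
    p_wy = pair_counts[(word, label)] / total
    p_w = word_counts[word] / total
    p_y = label_counts[label] / total
    return math.log(p_wy / (p_w * p_y))

print(round(pmi("Street", "LOC"), 3))
```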
arXiv Detail & Related papers (2021-04-15T05:47:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.