On Robustness and Bias Analysis of BERT-based Relation Extraction
- URL: http://arxiv.org/abs/2009.06206v5
- Date: Sat, 25 Dec 2021 08:21:08 GMT
- Title: On Robustness and Bias Analysis of BERT-based Relation Extraction
- Authors: Luoqiu Li, Xiang Chen, Hongbin Ye, Zhen Bi, Shumin Deng, Ningyu Zhang,
Huajun Chen
- Abstract summary: We analyze a fine-tuned BERT model from different perspectives using relation extraction.
We find that BERT's robustness is a bottleneck, as exposed by randomization, adversarial, and counterfactual tests and by biases (selection and semantic).
- Score: 40.64969232497321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning pre-trained models has achieved impressive performance
on standard natural language processing benchmarks. However, the
generalizability of the resulting models remains poorly understood. We do not
know, for example, whether excellent benchmark performance implies that a model
generalizes well. In this study, we analyze a fine-tuned BERT model from
multiple perspectives using relation extraction, and we characterize how
different generalization techniques affect its behavior. From empirical
experimentation, we find that BERT's robustness is a bottleneck, as exposed by
randomization, adversarial, and counterfactual tests, as well as by biases
(i.e., selection bias and semantic bias). These findings highlight
opportunities for future improvement. Our open-sourced testbed DiagnoseRE is
available at \url{https://github.com/zjunlp/DiagnoseRE}.
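As a concrete illustration of the randomization tests mentioned in the abstract, the minimal probe below replaces entity mentions with random strings and checks whether a fine-tuned relation-extraction classifier's prediction survives. This is a generic sketch, not the DiagnoseRE testbed itself; the checkpoint path and the example sentence are hypothetical stand-ins.

```python
import random
import string

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "path/to/bert-re-checkpoint" is a hypothetical fine-tuned RE classifier;
# any sequence-classification checkpoint can stand in for it.
MODEL = "path/to/bert-re-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

def predict(sentence: str) -> int:
    """Return the argmax relation label for one sentence."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits.argmax(dim=-1).item()

def randomize_entities(sentence: str, head: str, tail: str) -> str:
    """Replace both entity mentions with random strings, leaving the
    relational context untouched."""
    rand = lambda: "".join(random.choices(string.ascii_lowercase, k=8))
    return sentence.replace(head, rand()).replace(tail, rand())

# If the prediction flips even though the context is unchanged, the model
# was relying on entity-name shortcuts rather than the relational context.
sent = "Steve Jobs founded Apple in 1976."
print(predict(sent) == predict(randomize_entities(sent, "Steve Jobs", "Apple")))
```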
Related papers
- On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation [47.95611203419802]
Foundation models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach.
We compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset.
We further developed a new Bayesian uncertainty estimation for frozen models and used it as an indicator to characterize the model's performance on out-of-distribution data.
arXiv Detail & Related papers (2023-11-18T14:52:10Z)
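One common recipe for the kind of Bayesian uncertainty signal described above (a generic sketch, not necessarily the paper's exact estimator) is to run several stochastic forward passes, e.g. MC dropout or an ensemble of lightweight heads over the frozen backbone, and use predictive entropy as the out-of-distribution indicator:

```python
import torch

def predictive_entropy(prob_samples: torch.Tensor) -> torch.Tensor:
    """Entropy of the mean predictive distribution over T stochastic passes.

    prob_samples: (T, N, C) softmax outputs from T forward passes
    (e.g., MC dropout or an ensemble of heads over frozen features).
    """
    mean_probs = prob_samples.mean(dim=0)  # (N, C)
    return -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)

# Higher entropy means the model is less certain; thresholding this score
# is one simple way to flag likely out-of-distribution inputs.
```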
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
Under our robustness metric, a model is judged robust only if its performance is consistently accurate across every example in a clique.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
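The clique-level criterion above can be made concrete with a small scoring function; `model`, `cliques`, and `gold` below are hypothetical stand-ins for the benchmark's actual interfaces:

```python
from typing import Callable, List

def clique_robust_accuracy(model: Callable[[str], str],
                           cliques: List[List[str]],
                           gold: List[str]) -> float:
    """Fraction of cliques on which the model is correct for EVERY variant.

    Each clique holds paraphrases or perturbations expressing the same
    knowledge; a model counts as robust on a clique only if all members are
    predicted correctly, a stricter bar than per-example accuracy.
    """
    robust = sum(
        all(model(x) == y for x in clique)
        for clique, y in zip(cliques, gold)
    )
    return robust / len(cliques)
```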
- Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities [17.082695183953486]
A popular workaround is to train a robust model by re-weighting training examples based on a secondary biased model.
Here, the underlying assumption is that the biased model resorts to shortcut features.
We introduce a fine-tuning strategy that incorporates the similarity between the main and biased model attribution scores in a Product of Experts loss function.
arXiv Detail & Related papers (2023-02-06T15:21:41Z)
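The generic Product-of-Experts debiasing loss this method builds on can be sketched as follows; `bias_weight` is a hypothetical scalar knob standing in for the paper's attribution-similarity controller, not its actual formulation:

```python
import torch.nn.functional as F

def poe_loss(main_logits, biased_logits, labels, bias_weight=1.0):
    """Product-of-Experts debiasing loss (generic form).

    The ensemble prediction is the log-product of the main and biased
    experts; cross-entropy through the combined logits down-weights
    examples the biased model already solves via shortcut features.
    """
    combined = (F.log_softmax(main_logits, dim=-1)
                + bias_weight * F.log_softmax(biased_logits, dim=-1).detach())
    return F.cross_entropy(combined, labels)
```

Detaching the biased expert is a common design choice so that only the main model receives gradients through the combined prediction.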
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
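The column-wise iterative scheme can be illustrated with a plain sketch (generic iterative imputation, not the HyperImpute API; HyperImpute additionally searches over candidate learners per column):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def iterative_impute(X: np.ndarray, n_iter: int = 5) -> np.ndarray:
    """Generic column-wise iterative imputation: repeatedly regress each
    incomplete column on the others and overwrite its missing entries
    with the predictions."""
    X = X.copy()
    missing = np.isnan(X)
    # Warm start: fill missing cells with their column means.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue
            other = np.delete(X, j, axis=1)
            model = RandomForestRegressor(n_estimators=50)
            model.fit(other[~rows], X[~rows, j])
            X[rows, j] = model.predict(other[rows])
    return X
```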
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model both with task-specific biased models built from prior knowledge and with self-ensemble biased models that require no prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Posterior Differential Regularization with f-divergence for Improving Model Robustness [95.05725916287376]
We focus on methods that regularize the model posterior difference between clean and noisy inputs.
We generalize the posterior differential regularization to the family of $f$-divergences.
Our experiments show that regularizing the posterior differential with $f$-divergence substantially improves model robustness.
arXiv Detail & Related papers (2020-10-23T19:58:01Z)
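For the KL member of the f-divergence family, the regularizer can be sketched as below; the symmetric form and the assumption that noisy logits come from a perturbed copy of the input are illustrative choices, not necessarily the paper's exact setup:

```python
import torch.nn.functional as F

def posterior_kl_regularizer(clean_logits, noisy_logits):
    """Posterior differential regularization with KL, one member of the
    f-divergence family: penalize the divergence between the model's
    predictive distributions on clean and perturbed inputs."""
    p = F.log_softmax(clean_logits, dim=-1)
    q = F.log_softmax(noisy_logits, dim=-1)
    # Symmetric KL, averaged over the batch.
    kl_pq = F.kl_div(q, p, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(p, q, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)

# Total training loss (sketch):
#   loss = task_loss + reg_strength * posterior_kl_regularizer(clean, noisy)
```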
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines [31.807628937487927]
Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks.
Previous literature identified two potential causes of the observed instability: catastrophic forgetting and the small size of fine-tuning datasets.
We show that both hypotheses fail to explain the fine-tuning instability.
arXiv Detail & Related papers (2020-06-08T19:06:24Z)