Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering
- URL: http://arxiv.org/abs/2210.14541v1
- Date: Wed, 26 Oct 2022 08:01:38 GMT
- Title: Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering
- Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa
- Abstract summary: Extractive question answering (QA) models tend to exploit spurious correlations to make predictions.
The relative position of an answer can be exploited by QA models as a superficial cue for making predictions.
We propose an ensemble-based debiasing method that does not require prior knowledge about the distribution of relative positions.
- Score: 38.36299280464046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extractive question answering (QA) models tend to exploit spurious
correlations to make predictions when a training set has unintended biases.
This tendency results in models not being generalizable to examples where the
correlations do not hold. Determining the spurious correlations QA models can
exploit is crucial in building generalizable QA models in real-world
applications; moreover, a method needs to be developed that prevents these
models from learning the spurious correlations even when a training set is
biased. In this study, we discovered that the relative position of an answer,
which is defined as the relative distance from an answer span to the closest
question-context overlap word, can be exploited by QA models as superficial
cues for making predictions. Specifically, we find that when the relative
positions in a training set are biased, the performance on examples with
relative positions unseen during training is significantly degraded. To
mitigate the performance degradation for unseen relative positions, we propose
an ensemble-based debiasing method that does not require prior knowledge about
the distribution of relative positions. We demonstrate that the proposed method
mitigates the models' reliance on relative positions using the biased and full
SQuAD dataset. We hope that this study can help enhance the generalization
ability of QA models in real-world applications.
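The abstract defines the relative position of an answer as the relative distance from the answer span to the closest question-context overlap word. The sketch below shows one way this quantity could be computed at the token level; the whitespace tokenization, sign convention, and function names are illustrative assumptions, not the authors' implementation.

```python
from typing import Optional

def relative_position(question: str, context: str,
                      answer_start: int, answer_end: int) -> Optional[int]:
    """Signed token distance from the answer span to the closest word that
    appears in both the question and the context (a rough sketch of the
    definition quoted in the abstract; tokenization and sign are assumptions)."""
    q_tokens = set(question.lower().split())
    c_tokens = context.lower().split()

    # Positions of context tokens that also occur in the question.
    overlap_positions = [i for i, tok in enumerate(c_tokens) if tok in q_tokens]
    if not overlap_positions:
        return None  # no question-context overlap word exists

    def signed_distance(i: int) -> int:
        # Positive: the answer lies to the right of the overlap word.
        # Negative: it lies to the left. Zero: the word falls inside the span.
        if i < answer_start:
            return answer_start - i
        if i > answer_end:
            return answer_end - i
        return 0

    # Relative position = distance to the closest overlap word.
    return min((signed_distance(i) for i in overlap_positions), key=abs)


# Hypothetical usage: the answer "Athens" (token 13) lies four tokens to the
# right of the nearest overlapping word ("the" at token 9), so this prints 4.
question = "Where did the ceremony take place ?"
context = "The opening ceremony of the games took place in the main stadium of Athens ."
print(relative_position(question, context, answer_start=13, answer_end=13))
```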
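The abstract also mentions an ensemble-based debiasing method but does not spell out the combination rule. One common way such ensemble debiasing is realized in extractive QA, shown here only as a hedged illustration rather than the authors' exact formulation, is product-of-experts training: a bias-only model that sees nothing but the suspected cue (here, relative position) is combined with the main model during training, so the main model gains little from predicting what the cue already explains.

```python
import torch
import torch.nn.functional as F

def poe_span_loss(main_logits: torch.Tensor,
                  bias_logits: torch.Tensor,
                  gold_positions: torch.Tensor) -> torch.Tensor:
    """Product-of-experts loss for one answer-boundary distribution (start or end).

    main_logits:    [batch, context_len] logits from the main QA model.
    bias_logits:    [batch, context_len] logits from a bias-only model that sees
                    only the superficial cue (e.g., relative position); detached
                    so gradients do not update it through this loss.
    gold_positions: [batch] gold start (or end) token indices.
    """
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits.detach(), dim=-1)
    # cross_entropy re-normalizes `combined`, which yields the product-of-experts
    # distribution proportional to p_main * p_bias over answer positions.
    return F.cross_entropy(combined, gold_positions)

# At inference time only the main model is used, so its predictions no longer
# benefit from the cue the bias-only model captured during training.
```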
Related papers
- Mitigating Spurious Correlations via Disagreement Probability [4.8884049398279705]
Models trained with empirical risk minimization (ERM) are prone to be biased towards spurious correlations between target labels and bias attributes.
We introduce a training objective designed to robustly enhance model performance across all data samples.
We then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels; a rough resampling sketch based on this summary appears after this list.
arXiv Detail & Related papers (2024-11-04T02:44:04Z)
- Towards Robust Text Classification: Mitigating Spurious Correlations with Causal Learning [2.7813683000222653]
We propose the Causally Calibrated Robust (CCR) method to reduce models' reliance on spurious correlations.
CCR integrates a causal feature selection method based on counterfactual reasoning, along with an inverse propensity weighting (IPW) loss function.
We show that CCR achieves state-of-the-art performance among methods that do not use group labels, and in some cases it can compete with models that utilize group labels.
arXiv Detail & Related papers (2024-11-01T21:29:07Z)
- Eliminating Position Bias of Language Models: A Mechanistic Approach [119.34143323054143]
Position bias has proven to be a prevalent issue in modern language models (LMs).
Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings.
By eliminating position bias, models achieve better performance and reliability in downstream tasks, including LM-as-a-judge, retrieval-augmented QA, molecule generation, and math reasoning.
arXiv Detail & Related papers (2024-07-01T09:06:57Z)
- Mitigating Bias for Question Answering Models by Tracking Bias Influence [84.66462028537475]
We propose BMBI, an approach to mitigate the bias of multiple-choice QA models.
Based on the intuition that a model would tend to be more biased if it learns from a biased example, we measure the bias level of a query instance.
We show that our method could be applied to multiple QA formulations across multiple bias categories.
arXiv Detail & Related papers (2023-10-13T00:49:09Z)
- Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring a scale of models' reliance on any identified spurious feature.
We assess the robustness towards a large set of known and newly found prediction biases for various pre-trained models and debiasing methods in Question Answering (QA).
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by the mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z)
- Realistic Conversational Question Answering with Answer Selection based on Calibrated Confidence and Uncertainty Measurement [54.55643652781891]
Conversational Question Answering (ConvQA) models aim at answering a question given its relevant paragraph and the question-answer pairs from previous turns of the conversation.
We propose to filter out inaccurate answers in the conversation history based on their estimated confidences and uncertainties from the ConvQA model.
We validate our model, Answer Selection-based realistic Conversational Question Answering, on two standard ConvQA datasets.
arXiv Detail & Related papers (2023-02-10T09:42:07Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations? [13.127549105535623]
It is often more useful to estimate predictive correlations between the function values at different input locations.
We first consider a downstream task which depends on posterior predictive correlations: transductive active learning (TAL).
Since TAL is too expensive and indirect to guide development of algorithms, we introduce two metrics which more directly evaluate the predictive correlations.
arXiv Detail & Related papers (2020-11-06T03:48:59Z)
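The DPR entry above describes resampling driven by disagreement probability without bias labels. The sketch below is a loose reading of that one-line summary, assuming a plain ERM model (presumed to have absorbed the dataset bias) whose probability of disagreeing with the gold label is reused as a sampling weight; the weighting scheme and names are illustrative assumptions, not the paper's definition.

```python
import torch
from torch.utils.data import WeightedRandomSampler

def disagreement_weights(erm_probs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Sampling weights from an ERM model's disagreement with the gold labels.

    erm_probs: [num_examples, num_classes] class probabilities from a model
               trained with plain ERM (assumed to have learned the bias).
    labels:    [num_examples] gold class indices.

    Examples to which the biased ERM model assigns low probability on the gold
    label are likely bias-conflicting, so they receive larger sampling weights.
    """
    p_correct = erm_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    return 1.0 - p_correct  # disagreement probability


# Hypothetical usage: oversample likely bias-conflicting examples when training
# the debiased model.
# weights = disagreement_weights(erm_probs, labels)
# sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
```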