Look to the Right: Mitigating Relative Position Bias in Extractive
Question Answering
- URL: http://arxiv.org/abs/2210.14541v1
- Date: Wed, 26 Oct 2022 08:01:38 GMT
- Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa
- Abstract summary: Extractive question answering (QA) models tend to exploit spurious correlations to make predictions.
The relative position of an answer can be exploited by QA models as a superficial cue for making predictions.
We propose an ensemble-based debiasing method that does not require prior knowledge about the distribution of relative positions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extractive question answering (QA) models tend to exploit spurious
correlations to make predictions when a training set has unintended biases.
This tendency results in models not being generalizable to examples where the
correlations do not hold. Determining the spurious correlations QA models can
exploit is crucial in building generalizable QA models in real-world
applications; moreover, a method needs to be developed that prevents these
models from learning the spurious correlations even when a training set is
biased. In this study, we discovered that the relative position of an answer,
which is defined as the relative distance from an answer span to the closest
question-context overlap word, can be exploited by QA models as superficial
cues for making predictions. Specifically, we find that when the relative
positions in a training set are biased, the performance on examples with
relative positions unseen during training is significantly degraded. To
mitigate the performance degradation for unseen relative positions, we propose
an ensemble-based debiasing method that does not require prior knowledge about
the distribution of relative positions. We demonstrate that the proposed method
mitigates the models' reliance on relative positions using the biased and full
SQuAD dataset. We hope that this study can help enhance the generalization
ability of QA models in real-world applications.
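The paper's key quantity, the relative position of an answer, is defined above as the distance from the answer span to the closest question-context overlap word. A minimal sketch of that definition (assuming simple whitespace tokenization, which may differ from the paper's exact setup):

```python
def relative_position(context_tokens, question_tokens, answer_start):
    """Signed token distance from the answer span's start to the closest
    context token that also appears in the question (hypothetical sketch)."""
    overlap = {t.lower() for t in question_tokens}
    # distances from each question-context overlap word to the answer start
    dists = [answer_start - i
             for i, t in enumerate(context_tokens) if t.lower() in overlap]
    if not dists:
        return None  # no overlap word in the context
    # closest overlap word; positive means the answer lies to its right
    return min(dists, key=abs)

context = "the cat sat on the mat".split()
question = "where did the cat sit".split()
print(relative_position(context, question, 5))  # answer "mat" starts at index 5; → 1
```

When a training set over-represents one relative position (e.g. answers always just to the right of an overlap word), this is the quantity a model can latch onto as a shortcut.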
Related papers
- Eliminating Position Bias of Language Models: A Mechanistic Approach [119.34143323054143]
Position bias has proven to be a prevalent issue in modern language models (LMs).
We find that causal attention generally causes models to favor distant content, while relative positional encodings like RoPE prefer nearby ones.
We propose to eliminate position bias caused by different input segment orders (e.g., options in LM-as-a-judge, retrieved documents in QA) in a training-free, zero-shot manner.
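The cited paper eliminates this bias mechanistically inside the model; for intuition only, a naive training-free baseline with the same goal is to average a scoring function over all segment orders (exponential cost, and not the paper's approach):

```python
from itertools import permutations

def order_invariant_score(score_fn, segments):
    """Naive position-bias baseline: average a scoring function over every
    ordering of the input segments, so no single order is privileged.
    (Illustrative only; the cited paper edits the attention/positional
    mechanism in a single pass instead of enumerating orders.)"""
    perms = list(permutations(segments))
    return sum(score_fn(list(p)) for p in perms) / len(perms)

# the average position of "x" over all orderings is the middle index
print(order_invariant_score(lambda s: s.index("x"), ["x", "y", "z"]))  # → 1.0
```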
arXiv Detail & Related papers (2024-07-01T09:06:57Z)
- Mitigating Bias for Question Answering Models by Tracking Bias Influence [84.66462028537475]
We propose BMBI, an approach to mitigate the bias of multiple-choice QA models.
Based on the intuition that a model would lean to be more biased if it learns from a biased example, we measure the bias level of a query instance.
We show that our method could be applied to multiple QA formulations across multiple bias categories.
arXiv Detail & Related papers (2023-10-13T00:49:09Z)
- Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring the scale of a model's reliance on any identified spurious feature.
We assess the robustness of various pre-trained models and debiasing methods in Question Answering (QA) towards a large set of known and newly found prediction biases.
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z)
- Realistic Conversational Question Answering with Answer Selection based on Calibrated Confidence and Uncertainty Measurement [54.55643652781891]
Conversational Question Answering (ConvQA) models aim to answer a question using its relevant paragraph and the question-answer pairs that occurred earlier in a multi-turn conversation.
We propose to filter out inaccurate answers in the conversation history based on their estimated confidences and uncertainties from the ConvQA model.
We validate our model, Answer Selection-based realistic Conversational Question Answering, on two standard ConvQA datasets.
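The filtering step can be pictured as thresholding each past QA pair by its estimated confidence and uncertainty. A hypothetical sketch (the function name and fixed thresholds are illustrative; the paper calibrates these estimates rather than hard-coding them):

```python
def filter_history(history, confidences, uncertainties,
                   conf_threshold=0.5, unc_threshold=0.5):
    """Keep only past QA pairs the model is confident about and not
    too uncertain about (hypothetical thresholds for illustration)."""
    return [qa for qa, c, u in zip(history, confidences, uncertainties)
            if c >= conf_threshold and u <= unc_threshold]

history = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
# q2 is dropped for low confidence, q3 for high uncertainty
kept = filter_history(history, [0.9, 0.3, 0.8], [0.1, 0.2, 0.9])
print(kept)  # → [('q1', 'a1')]
```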
arXiv Detail & Related papers (2023-02-10T09:42:07Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
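A common instantiation of such ensemble debiasing is a product of experts, where the biased model's log-probabilities are added to the base model's before the loss is computed, so bias-predictable examples yield smaller gradients for the base model. A generic sketch of that combination (not GGD's exact algorithm):

```python
import numpy as np

def log_softmax(logits):
    logits = logits - logits.max()  # shift for numerical stability
    return logits - np.log(np.exp(logits).sum())

def poe_log_probs(base_logits, bias_logits):
    """Product-of-experts ensemble: log p ∝ log p_base + log p_bias.
    In training, the ensemble's cross-entropy loss typically updates only
    the base model, discouraging it from re-learning the bias."""
    return log_softmax(log_softmax(base_logits) + log_softmax(bias_logits))

base = np.array([2.0, 1.0, 0.1])
bias = np.zeros(3)  # a uniform (uninformative) biased model
# with a uniform bias, the ensemble reduces to the base distribution
assert np.allclose(poe_log_probs(base, bias), log_softmax(base))
```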
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations? [13.127549105535623]
Beyond marginal uncertainties, it is often more useful to estimate predictive correlations between the function values at different input locations.
We first consider a downstream task that depends on posterior predictive correlations: transductive active learning (TAL).
Since TAL is too expensive and indirect to guide development of algorithms, we introduce two metrics which more directly evaluate the predictive correlations.
arXiv Detail & Related papers (2020-11-06T03:48:59Z)
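A posterior predictive correlation of the kind these metrics evaluate can be estimated directly from paired posterior samples. A generic Monte Carlo sketch (not the paper's proposed metrics):

```python
import numpy as np

def predictive_correlation(samples_a, samples_b):
    """Monte Carlo estimate of the posterior predictive correlation between
    function values f(x_a) and f(x_b), from paired posterior samples."""
    cov = np.cov(samples_a, samples_b)  # 2x2 sample covariance matrix
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

rng = np.random.default_rng(0)
f_a = rng.normal(size=1000)
f_b = 0.8 * f_a + 0.6 * rng.normal(size=1000)  # correlated posterior draws
print(predictive_correlation(f_a, f_b))
```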
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.