Comparing Humans and Models on a Similar Scale: Towards Cognitive Gender
Bias Evaluation in Coreference Resolution
- URL: http://arxiv.org/abs/2305.15389v1
- Date: Wed, 24 May 2023 17:51:44 GMT
- Title: Comparing Humans and Models on a Similar Scale: Towards Cognitive Gender
Bias Evaluation in Coreference Resolution
- Authors: Gili Lior and Gabriel Stanovsky
- Abstract summary: Can we quantify the extent to which model biases reflect human behaviour?
We make several observations from two crowdsourcing experiments of gender bias in coreference resolution.
On real-world data humans make $\sim$3\% more gender-biased decisions compared to models, while on synthetic data models are $\sim$12\% more biased.
- Score: 11.711298780873468
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Spurious correlations were found to be an important factor explaining model
performance in various NLP tasks (e.g., gender or racial artifacts), often
considered to be "shortcuts" to the actual task. However, humans tend to
similarly make quick (and sometimes wrong) predictions based on societal and
cognitive presuppositions. In this work we address the question: can we
quantify the extent to which model biases reflect human behaviour? Answering
this question will help shed light on model performance and provide meaningful
comparisons against humans. We approach this question through the lens of the
dual-process theory for human decision-making. This theory differentiates
between an automatic unconscious (and sometimes biased) "fast system" and a
"slow system", which when triggered may revisit earlier automatic reactions.
We make several observations from two crowdsourcing experiments of gender bias
in coreference resolution, using self-paced reading to study the "fast"
system, and question answering to study the "slow" system under a constrained
time setting. On real-world data humans make $\sim$3\% more gender-biased
decisions compared to models, while on synthetic data models are $\sim$12\%
more biased.
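To make the headline numbers concrete, the following is a minimal sketch of how such a bias-rate gap could be computed, under one assumed scoring convention: a decision counts as gender-biased when, on an item whose context supports the anti-stereotypical antecedent, the decider nevertheless resolves the pronoun to the stereotype-congruent one. The `Decision` fields and toy data are illustrative assumptions, not the authors' data or code.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    gold: str         # antecedent supported by the context
    chosen: str       # antecedent the human or model actually picked
    stereotyped: str  # antecedent a gender stereotype would favor

def biased_rate(decisions: list[Decision]) -> float:
    """Share of anti-stereotypical items resolved toward the stereotype."""
    anti = [d for d in decisions if d.gold != d.stereotyped]
    return sum(d.chosen == d.stereotyped for d in anti) / len(anti)

# Toy judgments only; the paper uses crowdsourced human decisions and model predictions.
humans = [Decision("nurse", "doctor", "doctor"),
          Decision("nurse", "doctor", "doctor"),
          Decision("mechanic", "mechanic", "engineer")]
models = [Decision("nurse", "doctor", "doctor"),
          Decision("nurse", "nurse", "doctor"),
          Decision("mechanic", "mechanic", "engineer")]
print(f"bias gap (human - model): {biased_rate(humans) - biased_rate(models):+.1%}")
```

On this toy sample the gap is +33.3%; the paper's reported gaps ($\sim$3\% on real-world data, $\sim$12\% in the other direction on synthetic data) come from the two crowdsourcing experiments described above.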
Related papers
- Longer Fixations, More Computation: Gaze-Guided Recurrent Neural
Networks [12.57650361978445]
Humans read texts at a varying pace, while machine learning models treat each token in the same way.
In this paper, we convert this intuition into a set of novel models with fixation-guided parallel RNNs or layers.
We find that, interestingly, the fixation durations predicted by neural networks bear some resemblance to human fixations.
arXiv Detail & Related papers (2023-10-31T21:32:11Z)
- Towards Understanding Sycophancy in Language Models [49.99654432561934]
We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback.
We show that five state-of-the-art AI assistants consistently exhibit sycophancy across four varied free-form text-generation tasks.
Our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
arXiv Detail & Related papers (2023-10-20T14:46:48Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses [45.720065723998225]
We introduce a new task, "less likely brainstorming," that asks a model to generate outputs that humans think are relevant but less likely to happen.
We find that a baseline approach of training with less likely hypotheses as targets generates outputs that humans evaluate as either likely or irrelevant nearly half of the time.
We propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans.
arXiv Detail & Related papers (2023-05-30T18:05:34Z)
- Spuriosity Rankings: Sorting Data to Measure and Mitigate Biases [62.54519787811138]
We present a simple but effective method to measure and mitigate model biases caused by reliance on spurious cues.
We rank images within their classes based on spuriosity, proxied via deep neural features of an interpretable network.
Our results suggest that model bias due to spurious feature reliance is influenced far more by what the model is trained on than how it is trained.
arXiv Detail & Related papers (2022-12-05T23:15:43Z)
- Does Debiasing Inevitably Degrade the Model Performance [8.20550078248207]
We propose a theoretical framework explaining the three candidate mechanisms of the language model's gender bias.
We also discover a pathway through which debiasing will not degrade the model performance.
arXiv Detail & Related papers (2022-11-14T13:46:13Z)
- Learning Sample Importance for Cross-Scenario Video Temporal Grounding [30.82619216537177]
The paper investigates some superficial biases specific to the temporal grounding task.
We propose a novel method called Debiased Temporal Language Localizer (DebiasTLL) to prevent the model from naively memorizing the biases.
We evaluate the proposed model in cross-scenario temporal grounding, where the train / test data are heterogeneously sourced.
arXiv Detail & Related papers (2022-01-08T15:41:38Z)
- Indecision Modeling [50.00689136829134]
It is important that AI systems act in ways which align with human values.
People are often indecisive, and especially so when their decision has moral implications.
arXiv Detail & Related papers (2020-12-15T18:32:37Z)
- UnQovering Stereotyping Biases via Underspecified Questions [68.81749777034409]
We present UNQOVER, a framework to probe and quantify biases through underspecified questions.
We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors.
We use this metric to analyze four important classes of stereotypes: gender, nationality, ethnicity, and religion. (A schematic sketch of this style of probe appears after this list.)
arXiv Detail & Related papers (2020-10-06T01:49:52Z)
- Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction [49.254162397086006]
We study explanations based on visual saliency in an image-based age prediction task.
We find that presenting model predictions improves human accuracy.
However, explanations of various kinds fail to significantly alter human accuracy or trust in the model.
arXiv Detail & Related papers (2020-07-23T20:39:40Z)
- PsychFM: Predicting your next gamble [0.0]
Much of human behavior can be modeled as a choice prediction problem.
Since the behavior is person dependent, there is a need to build a model that predicts choices on a per-person basis.
A novel hybrid model, the psychological factorisation machine (PsychFM), is proposed, combining concepts from machine learning and psychological theories.
arXiv Detail & Related papers (2020-07-03T17:41:14Z)
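As noted in the UnQovering entry above, the following is a schematic sketch of a bias probe built on underspecified questions. It is not the paper's exact scoring function: the `score` interface and the specific averaging scheme are assumptions, meant only to illustrate how averaging over subject orderings and subtracting a negated question can cancel the kinds of reasoning errors the summary mentions (positional preference and attribute-independent subject preference).

```python
# Schematic probe in the spirit of UNQOVER, not its exact metric.
# `score(order, answer, question)` is an assumed QA-model interface returning the
# probability of `answer` given a context that mentions the subjects in `order`.
def bias_toward(score, subj_a: str, subj_b: str,
                attr: str, neg_attr: str) -> float:
    """Positive values: the model associates subj_a with `attr` more than subj_b,
    after averaging out subject-order and question-negation artifacts."""
    def preference(question: str) -> float:
        # Average the a-vs-b gap over both orderings to cancel positional bias.
        return 0.5 * sum(
            score(order, subj_a, question) - score(order, subj_b, question)
            for order in [(subj_a, subj_b), (subj_b, subj_a)]
        )
    # Subtracting the negated question removes any preference for subj_a that is
    # independent of the attribute being asked about.
    return 0.5 * (preference(attr) - preference(neg_attr))

# Toy usage with a dummy scorer that mildly prefers one answer for the attribute.
dummy = lambda order, ans, q: 0.6 if (ans == "John" and q == "senator") else 0.5
print(bias_toward(dummy, "John", "Mary", "senator", "not a senator"))  # ~0.05 (> 0)
```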
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.