Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
- URL: http://arxiv.org/abs/2303.05670v1
- Date: Fri, 10 Mar 2023 02:52:13 GMT
- Title: Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
- Authors: Hongyin Luo, James Glass
- Abstract summary: We describe several kinds of stereotypes concerning different communities that are present in popular sentence representation models.
By comparing strong pretrained models based on text similarity with textual entailment learning, we conclude that explicit logic learning with textual entailment can significantly reduce bias.
- Score: 8.990338162517086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to their similarity-based learning objectives, pretrained sentence
encoders often internalize stereotypical assumptions that reflect the social
biases that exist within their training corpora. In this paper, we describe
several kinds of stereotypes concerning different communities that are present
in popular sentence representation models, including pretrained next sentence
prediction and contrastive sentence representation models. We compare such
models to textual entailment models that learn language logic for a variety of
downstream language understanding tasks. By comparing strong pretrained models
based on text similarity with textual entailment models, we conclude that
explicit logic learning with textual entailment can significantly reduce bias
and improve the recognition of social communities, without an explicit
de-biasing process.
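The paper's released code is not reproduced here; the snippet below is only a minimal sketch, assuming the public `sentence-transformers` and HuggingFace `transformers` APIs, illustrative checkpoint names (`all-MiniLM-L6-v2`, `roberta-large-mnli`), and a hypothetical nurse/woman sentence pair, of the comparison the abstract describes: a similarity-based encoder scores the pair by cosine similarity, while an entailment model asks whether the first sentence logically supports the second.

```python
# Sketch: contrast a similarity-based sentence encoder with a textual
# entailment (NLI) model on the same sentence pair. Checkpoint names and
# the example pair are illustrative, not taken from the paper.
import torch
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForSequenceClassification, AutoTokenizer

premise = "My neighbor is a nurse."
hypothesis = "My neighbor is a woman."  # a stereotypical association to probe

# 1) Similarity-based view: a high cosine similarity can encode the
#    stereotype as "relatedness" even though neither sentence implies the other.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode([premise, hypothesis], convert_to_tensor=True)
print("cosine similarity:", util.cos_sim(emb[0], emb[1]).item())

# 2) Entailment-based view: an NLI model judges whether the premise
#    logically entails the hypothesis, which it should not here.
tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
with torch.no_grad():
    logits = nli(**tok(premise, hypothesis, return_tensors="pt")).logits
probs = logits.softmax(dim=-1).squeeze()
# Label names come from the checkpoint config; verify via nli.config.id2label.
for i, p in enumerate(probs.tolist()):
    print(nli.config.id2label[i], round(p, 3))
```

The contrast mirrors the abstract's claim: a purely similarity-based objective treats the stereotyped pair as close, whereas the entailment head is free to label it neutral rather than entailed.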
Related papers
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings.
We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z)
- Anti-stereotypical Predictive Text Suggestions Do Not Reliably Yield Anti-stereotypical Writing [28.26615961488287]
We consider how "debiasing" a language model impacts stories that people write using that language model in a predictive text scenario.
We find that, in certain scenarios, language model suggestions that align with common social stereotypes are more likely to be accepted by human authors.
arXiv Detail & Related papers (2024-09-30T15:21:25Z)
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually grounded text perturbations such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Debiasing Multimodal Sarcasm Detection with Contrastive Learning [5.43710908542843]
We propose a novel debiasing multimodal sarcasm detection framework with contrastive learning.
In particular, we first design counterfactual data augmentation to construct the positive samples with dissimilar word biases.
We devise an adapted debiasing contrastive learning mechanism to empower the model to learn robust task-relevant features.
arXiv Detail & Related papers (2023-12-16T16:14:50Z)
- UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations [62.71847873326847]
We investigate the ability of models to reason about unusual, unexpected, and unlikely situations.
Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate an explanation.
We release a new English language corpus called UNcommonsense.
arXiv Detail & Related papers (2023-11-14T19:00:55Z)
- On The Role of Reasoning in the Identification of Subtle Stereotypes in Natural Language [0.03749861135832073]
Large language models (LLMs) are trained on vast, uncurated datasets that contain various forms of biases and language reinforcing harmful stereotypes.
It is essential to examine and address biases in language models, integrating fairness into their development to ensure that these models do not perpetuate social biases.
This work firmly establishes reasoning as a critical component in automatic stereotype detection and is a first step towards stronger stereotype mitigation pipelines for LLMs.
arXiv Detail & Related papers (2023-07-24T15:12:13Z)
- Evaluating Biased Attitude Associations of Language Models in an Intersectional Context [2.891314299138311]
Language models are trained on large-scale corpora that embed implicit biases documented in psychology.
We study biases related to age, education, gender, height, intelligence, literacy, race, religion, sex, sexual orientation, social class, and weight.
We find that language models exhibit the most biased attitudes against gender identity, social class, and sexual orientation signals in language.
arXiv Detail & Related papers (2023-07-07T03:01:56Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Fair Hate Speech Detection through Evaluation of Social Group Counterfactuals [21.375422346539004]
Approaches for mitigating bias in supervised models are designed to reduce models' dependence on specific sensitive features of the input data.
In the case of hate speech detection, it is not always desirable to equalize the effects of social groups.
Counterfactual token fairness for a mentioned social group evaluates whether the model's predictions are the same for (a) the actual sentence and (b) a counterfactual instance in which that group's token is substituted.
Our approach ensures robust model predictions for counterfactuals whose meaning stays close to that of the actual sentence (see the sketch after this list).
arXiv Detail & Related papers (2020-10-24T04:51:47Z)
- Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
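None of the listed papers' code is reproduced here; the following is a minimal sketch of the counterfactual token check described in the Fair Hate Speech Detection entry above, assuming a generic HuggingFace text-classification pipeline, an illustrative checkpoint (`unitary/toxic-bert`) with a `toxic` label, and hypothetical group terms and example sentence.

```python
# Sketch: counterfactual token fairness check for a hate-speech classifier.
# The checkpoint name, its "toxic" label, the group terms, and the example
# sentence are illustrative assumptions, not taken from the cited paper.
from transformers import pipeline

clf = pipeline("text-classification", model="unitary/toxic-bert")

GROUP_TERMS = ["women", "men", "immigrants", "christians", "muslims"]

def toxic_score(sentence: str) -> float:
    """Probability the classifier assigns to its 'toxic' label."""
    scores = clf(sentence, top_k=None)  # all labels for this single input
    return next(s["score"] for s in scores if s["label"] == "toxic")

def counterfactual_gap(sentence: str, mentioned: str) -> float:
    """Largest change in toxicity score when the mentioned group term is
    swapped for each alternative group term (smaller = more counterfactually
    fair on this sentence)."""
    base = toxic_score(sentence)
    gaps = [abs(toxic_score(sentence.replace(mentioned, term)) - base)
            for term in GROUP_TERMS if term != mentioned]
    return max(gaps)

print(counterfactual_gap("I really do not trust immigrants.", "immigrants"))
```

As that entry notes, a zero gap is not always the goal: the check is usually restricted to counterfactual substitutions that keep the sentence's meaning close to the original, since some group mentions legitimately change what the sentence asserts.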
This list is automatically generated from the titles and abstracts of the papers in this site.