Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis
- URL: http://arxiv.org/abs/2310.00347v3
- Date: Wed, 17 Apr 2024 11:48:11 GMT
- Title: Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis
- Authors: Shaina Raza, Oluwanifemi Bamgbose, Veronica Chatrath, Shardul Ghuge, Yan Sidyakin, Abdullah Y Muaad
- Abstract summary: We present the Contextualized Bi-Directional Dual Transformer (CBDT) classifier.
We have prepared a dataset specifically for training these models to identify and locate biases in texts.
Our evaluations across various datasets demonstrate CBDT's effectiveness in distinguishing biased narratives from neutral ones and identifying specific biased terms.
- Score: 1.8692054990918079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bias detection in text is crucial for combating the spread of negative stereotypes, misinformation, and biased decision-making. Traditional language models frequently face challenges in generalizing beyond their training data and are typically designed for a single task, often focusing on bias detection at the sentence level. To address this, we present the Contextualized Bi-Directional Dual Transformer (CBDT) classifier. This model combines two complementary transformer networks: the Context Transformer and the Entity Transformer, with a focus on improving bias detection capabilities. We have prepared a dataset specifically for training these models to identify and locate biases in texts. Our evaluations across various datasets demonstrate CBDT's effectiveness in distinguishing biased narratives from neutral ones and identifying specific biased terms. This work paves the way for applying the CBDT model in various linguistic and cultural contexts, enhancing its utility in bias detection efforts. We also make the annotated dataset available for research purposes.
Related papers
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z)
- Current Topological and Machine Learning Applications for Bias Detection in Text [4.799066966918178]
This study utilizes the RedditBias database to analyze textual biases.
Four transformer models, including BERT and RoBERTa variants, were explored.
Findings suggest BERT, particularly mini BERT, excels in bias classification, while multilingual models lag.
arXiv Detail & Related papers (2023-11-22T16:12:42Z)
- NBIAS: A Natural Language Processing Framework for Bias Identification in Text [9.486702261615166]
Bias in textual data can lead to skewed interpretations and outcomes when the data is used.
An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people.
We develop a comprehensive framework NBIAS that consists of four main layers: data, corpus construction, model development and an evaluation layer.
arXiv Detail & Related papers (2023-08-03T10:48:30Z)
- Transformer Language Models Handle Word Frequency in Prediction Head [31.145866381881625]
This study investigates the inner workings of the prediction head, specifically focusing on bias parameters.
Our experiments with BERT and GPT-2 models reveal that the biases in their word prediction heads play a significant role in the models' ability to reflect word frequency in a corpus.
arXiv Detail & Related papers (2023-05-29T17:59:15Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- A Domain-adaptive Pre-training Approach for Language Bias Detection in News [3.7238620986236373]
We present DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain.
We also train DA-BERT and DA-BART, two more transformer models adapted to the bias domain.
Our proposed domain-adapted models outperform prior bias detection approaches on the same data.
arXiv Detail & Related papers (2022-05-22T08:18:19Z)
- Paragraph-based Transformer Pre-training for Multi-Sentence Inference [99.59693674455582]
We show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.
We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
arXiv Detail & Related papers (2022-05-02T21:41:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- Mitigating the Position Bias of Transformer Models in Passage Re-Ranking [12.526786110360622]
Supervised machine learning models and their evaluation strongly depend on the quality of the underlying dataset.
We observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking.
We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on a biased and debiased dataset.
arXiv Detail & Related papers (2021-01-18T10:38:03Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)