Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis
- URL: http://arxiv.org/abs/2310.00347v3
- Date: Wed, 17 Apr 2024 11:48:11 GMT
- Title: Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis
- Authors: Shaina Raza, Oluwanifemi Bamgbose, Veronica Chatrath, Shardul Ghuge, Yan Sidyakin, Abdullah Y Muaad
- Abstract summary: We present the Contextualized Bi-Directional Dual Transformer (CBDT) classifier.
We have prepared a dataset specifically for training these models to identify and locate biases in texts.
Our evaluations across various datasets demonstrate CBDT's effectiveness in distinguishing biased narratives from neutral ones and identifying specific biased terms.
- Score: 1.8692054990918079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bias detection in text is crucial for combating the spread of negative stereotypes, misinformation, and biased decision-making. Traditional language models frequently face challenges in generalizing beyond their training data and are typically designed for a single task, often focusing on bias detection at the sentence level. To address this, we present the Contextualized Bi-Directional Dual Transformer (CBDT) classifier. This model combines two complementary transformer networks: the Context Transformer and the Entity Transformer, with a focus on improving bias detection capabilities. We have prepared a dataset specifically for training these models to identify and locate biases in texts. Our evaluations across various datasets demonstrate CBDT's effectiveness in distinguishing biased narratives from neutral ones and identifying specific biased terms. This work paves the way for applying the CBDT model in various linguistic and cultural contexts, enhancing its utility in bias detection efforts. We also make the annotated dataset available for research purposes.
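The dual-network idea — one transformer encoding the full context, another focusing on entity mentions, with the two views fused for classification — can be sketched at a high level. The encoder functions, dimensions, and fusion below are illustrative placeholders, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # illustrative embedding size

def context_encoder(tokens):
    # Placeholder for a transformer encoding the whole sentence.
    return rng.standard_normal(DIM)

def entity_encoder(entities):
    # Placeholder for a transformer encoding the entity mentions.
    return rng.standard_normal(DIM)

def cbdt_style_classify(tokens, entities, W, b):
    # Fuse the two views by concatenation, then apply a linear head
    # with a sigmoid to score the text as biased (1) vs. neutral (0).
    fused = np.concatenate([context_encoder(tokens), entity_encoder(entities)])
    return 1.0 / (1.0 + np.exp(-(W @ fused + b)))

W = rng.standard_normal(2 * DIM) * 0.1
score = cbdt_style_classify(["the", "group", "is", "lazy"], ["group"], W, 0.0)
# score is a probability in (0, 1)
```

In a real system the two encoders would be trained jointly, and token-level outputs would additionally locate the specific biased terms.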
Related papers
- COBIAS: Contextual Reliability in Bias Assessment [14.594920595573038]
Large Language Models (LLMs) are trained on extensive web corpora, which enable them to understand and generate human-like text.
The biases these models absorb arise from the diverse and often uncurated nature of web data, which contains various stereotypes and prejudices.
We propose understanding the context of inputs by considering the diverse situations in which they may arise.
arXiv Detail & Related papers (2024-02-22T10:46:11Z)
- Current Topological and Machine Learning Applications for Bias Detection in Text [4.799066966918178]
This study utilizes the RedditBias database to analyze textual biases.
Four transformer models, including BERT and RoBERTa variants, were explored.
Findings suggest BERT, particularly mini BERT, excels in bias classification, while multilingual models lag.
arXiv Detail & Related papers (2023-11-22T16:12:42Z)
- NBIAS: A Natural Language Processing Framework for Bias Identification in Text [9.486702261615166]
Bias in textual data can lead to skewed interpretations and outcomes when the data is used.
An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people.
We develop a comprehensive framework NBIAS that consists of four main layers: data, corpus construction, model development and an evaluation layer.
arXiv Detail & Related papers (2023-08-03T10:48:30Z)
- Transformer Language Models Handle Word Frequency in Prediction Head [31.145866381881625]
This study investigates the inner workings of the prediction head, specifically focusing on bias parameters.
Our experiments with BERT and GPT-2 models reveal that the biases in their word prediction heads play a significant role in the models' ability to reflect word frequency in a corpus.
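The finding — that the additive bias term in a word prediction head tracks corpus word frequency — can be illustrated with a toy softmax head. The three-word vocabulary and its counts are made up; setting the bias to log-frequency is one simple way such a correlation can manifest:

```python
import numpy as np

# Toy vocabulary with unigram counts; the prediction-head bias is set
# to the log of the corpus unigram distribution.
counts = np.array([1000.0, 100.0, 10.0])
bias = np.log(counts / counts.sum())

hidden = np.zeros(4)         # uninformative context representation
W = np.zeros((3, 4))         # toy output embedding matrix
logits = W @ hidden + bias   # with no contextual signal, bias dominates
probs = np.exp(logits) / np.exp(logits).sum()
# probs reproduces the corpus unigram distribution
```

With an uninformative context, the head's predictions collapse onto corpus frequencies — the effect the study attributes to these bias parameters.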
arXiv Detail & Related papers (2023-05-29T17:59:15Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
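Projecting out biased directions from a text embedding corresponds to an orthogonal projection. Below is a minimal numpy sketch; the embedding and bias direction are toy values, and the paper's calibrated projection matrix (estimated from biased prompts) is not reproduced here:

```python
import numpy as np

def project_out(embeddings, bias_dirs):
    """Remove the span of bias_dirs from each row of embeddings."""
    # Orthonormalize the bias directions, then build P = I - B B^T,
    # the projector onto the subspace orthogonal to them.
    B, _ = np.linalg.qr(np.asarray(bias_dirs, dtype=float).T)
    P = np.eye(B.shape[0]) - B @ B.T
    return embeddings @ P  # P is symmetric

emb = np.array([[1.0, 2.0, 3.0]])
bias = [[0.0, 0.0, 1.0]]   # toy bias direction along the third axis
debiased = project_out(emb, bias)
# the component along the bias direction is removed: [[1., 2., 0.]]
```

Classifying or generating from `debiased` instead of `emb` is the spirit of the approach: only the text side is modified.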
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- A Domain-adaptive Pre-training Approach for Language Bias Detection in News [3.7238620986236373]
We present DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain.
We also train DA-BERT and DA-BART, two more transformer models adapted to the bias domain.
Our proposed domain-adapted models outperform prior bias detection approaches on the same data.
arXiv Detail & Related papers (2022-05-22T08:18:19Z)
- Paragraph-based Transformer Pre-training for Multi-Sentence Inference [99.59693674455582]
We show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.
We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
arXiv Detail & Related papers (2022-05-02T21:41:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
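The augmentation step can be sketched as simple input rewriting: the sentence is extended with its predicate-argument structures before training. The marker tokens and SRL tuple format below are hypothetical; a real pipeline would obtain the structures from a semantic role labeler:

```python
def augment_with_pas(sentence, pas):
    """Append predicate-argument structures to a training sentence.

    pas: list of (predicate, {role: span}) pairs, e.g. produced by an
    SRL tagger (hypothetical format for this sketch).
    """
    parts = [sentence]
    for predicate, args in pas:
        roles = " ".join(f"{role}={span}" for role, span in args.items())
        parts.append(f"[PRED] {predicate} {roles}")
    return " [SEP] ".join(parts)

augmented = augment_with_pas(
    "The committee approved the budget.",
    [("approved", {"ARG0": "The committee", "ARG1": "the budget"})],
)
# "The committee approved the budget. [SEP] [PRED] approved ARG0=The committee ARG1=the budget"
```

The transformer is then fine-tuned on the augmented strings, exposing it to who-did-what-to-whom structure without targeting any specific bias.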
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
- Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is instead used to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.