Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
- URL: http://arxiv.org/abs/2504.14212v1
- Date: Sat, 19 Apr 2025 07:36:02 GMT
- Title: Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
- Authors: Takuma Udagawa, Yang Zhao, Hiroshi Kanayama, Bishwaranjan Bhattacharjee
- Abstract summary: We propose an efficient yet effective annotation pipeline to investigate social biases in the pretraining corpora. Our pipeline consists of protected attribute detection to identify diverse demographics, followed by regard classification to analyze the language polarity towards each attribute.
- Score: 13.267017023273919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) acquire general linguistic knowledge from massive-scale pretraining. However, pretraining data, mainly composed of web-crawled texts, contain undesirable social biases that can be perpetuated or even amplified by LLMs. In this study, we propose an efficient yet effective annotation pipeline to investigate social biases in the pretraining corpora. Our pipeline consists of protected attribute detection to identify diverse demographics, followed by regard classification to analyze the language polarity towards each attribute. Through our experiments, we demonstrate the effect of our bias analysis and mitigation measures, focusing on Common Crawl as the most representative pretraining corpus.
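As a rough illustration of how such a two-stage annotation pipeline could be wired together, here is a minimal sketch; the model identifiers are placeholders, not the classifiers actually used in the paper:

```python
# Minimal sketch of a two-stage bias-annotation pipeline:
# (1) detect mentions of protected attributes, (2) classify the regard
# (language polarity) toward each mention. Model names are hypothetical.
from transformers import pipeline

attribute_detector = pipeline(
    "token-classification",
    model="placeholder/protected-attribute-detector",  # hypothetical model
    aggregation_strategy="simple",
)
regard_classifier = pipeline(
    "text-classification",
    model="placeholder/regard-classifier",  # hypothetical 3-way classifier
)

def annotate(document: str) -> list:
    """Return one annotation per detected protected-attribute mention."""
    annotations = []
    regard = regard_classifier(document)[0]  # coarse: document-level polarity
    for mention in attribute_detector(document):
        annotations.append({
            "attribute": mention["word"],      # e.g. a gender/race/religion term
            "group": mention["entity_group"],  # demographic category
            "regard": regard["label"],         # e.g. positive / neutral / negative
            "score": regard["score"],
        })
    return annotations

# Documents whose protected-attribute mentions attract strongly negative
# regard could then be filtered or down-weighted during pretraining.
```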
Related papers
- Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis [20.503153899462323]
We propose a framework for semi-supervised sentiment analysis. We introduce two prompting strategies to semantically enhance unlabeled text. Experiments show our method achieves remarkable performance over prior semi-supervised methods.
arXiv Detail & Related papers (2025-01-29T12:03:11Z)
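A generic consistency-regularization sketch in the spirit of the entry above (the paper's exact prompting strategies and loss are not specified here): the classifier is penalized when its predictions on two LLM-enhanced views of the same unlabeled text disagree.

```python
# Generic consistency-regularization sketch (not the paper's exact method):
# penalize divergence between the classifier's predictions on two
# LLM-enhanced views of the same unlabeled sentence.
import torch.nn.functional as F

def consistency_loss(classifier, view_a_inputs, view_b_inputs):
    """Symmetric KL divergence between predictions on two augmented views."""
    log_p_a = F.log_softmax(classifier(**view_a_inputs).logits, dim=-1)
    log_p_b = F.log_softmax(classifier(**view_b_inputs).logits, dim=-1)
    kl_ab = F.kl_div(log_p_a, log_p_b.exp(), reduction="batchmean")
    kl_ba = F.kl_div(log_p_b, log_p_a.exp(), reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)

# Combined objective on a batch would look like:
#   loss = supervised_cross_entropy + lambda_u * consistency_loss(...)
```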
- Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts. We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z)
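A rough sketch of the contrastive idea behind the entry above; the paper's exact scoring function may differ, and `gpt2` is just a stand-in causal LM:

```python
# Rough sketch of contrastive membership inference (the paper's exact
# score may differ): compare how the target text's log-likelihood shifts
# when prefixed with known member vs known non-member text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder causal LM
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def target_log_likelihood(prefix: str, target: str) -> float:
    """Sum of log p(target tokens | prefix + preceding target tokens)."""
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    target_ids = tok(target, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, target_ids], dim=1)
    logits = lm(input_ids).logits
    # Prediction for input position i+1 sits at index i of logits[:, :-1].
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    start = prefix_ids.shape[1] - 1  # first target token's prediction index
    token_lp = log_probs[0, start:, :].gather(
        1, target_ids[0].unsqueeze(1)).squeeze(1)
    return token_lp.sum().item()

def contrastive_score(target: str, member_ctx: str, nonmember_ctx: str) -> float:
    # Member and non-member prefixes shift the likelihood of member text
    # asymmetrically; the gap between the two is the membership signal.
    return (target_log_likelihood(nonmember_ctx, target)
            - target_log_likelihood(member_ctx, target))
```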
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of learning paradigms such as in-context learning (ICL) and prompt-based fine-tuning varies with the type of OOD shift.
Specifically, while ICL excels for domain shifts, prompt-based fine-tuning is superior for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring [13.817385516193445]
Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features.
Deep neural networks are commonly trained to map fluency-related features to human scores.
We introduce a self-supervised learning (SSL) approach that takes into account phonetic and prosody awareness for fluency scoring.
arXiv Detail & Related papers (2023-05-19T05:39:41Z)
- Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models [3.5278693565908137]
"Affective Bias" is biased association of emotions towards a particular gender, race, and religion.
We show the existence of statistically significant affective bias in the PLM based emotion detection systems.
arXiv Detail & Related papers (2023-01-21T20:23:09Z)
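One common way to probe for such affective bias (a hedged illustration, not the paper's exact protocol) is to fill identical templates with different group terms and test whether an emotion classifier's predictions differ significantly across groups; the model name below is a placeholder:

```python
# Template-based affective-bias probe (illustrative, not the paper's exact
# protocol). In practice one would use many templates per group.
from collections import Counter
from scipy.stats import chi2_contingency
from transformers import pipeline

emotion = pipeline("text-classification",
                   model="placeholder/emotion-classifier")  # hypothetical model

templates = ["The {} person started talking.", "I met a {} neighbor today."]
groups = {"group_a": "Black", "group_b": "White"}  # illustrative terms

counts = {}
for name, term in groups.items():
    preds = [emotion(t.format(term))[0]["label"] for t in templates]
    counts[name] = Counter(preds)

# Contingency table: groups x predicted emotion labels.
labels = sorted(set().union(*counts.values()))
table = [[counts[g].get(l, 0) for l in labels] for g in counts]
chi2, p_value, _, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")  # small p suggests affective bias
```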
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo-label-based semi-supervised training strategy that uses a language model within an end-to-end speech sentiment analysis approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
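A minimal sketch of generic pseudo-labeling as evoked by the entry above (the paper's exact strategy may differ): a text sentiment model labels unlabeled ASR transcripts, and only high-confidence predictions are kept as extra training data. The `text_model` callable is assumed to behave like a Hugging Face text-classification pipeline:

```python
# Generic pseudo-labeling sketch (details of the paper's strategy may differ).
def pseudo_label(text_model, transcripts: dict, threshold: float = 0.9):
    """Keep only high-confidence sentiment predictions as pseudo-labels."""
    extra_data = []
    for utt_id, text in transcripts.items():
        pred = text_model(text)[0]  # e.g. {"label": "positive", "score": 0.97}
        if pred["score"] >= threshold:
            extra_data.append((utt_id, pred["label"]))
    # The pseudo-labeled utterances are then added to the training set
    # of the end-to-end speech sentiment model.
    return extra_data
```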
- SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis [20.026835809227283]
We introduce "typicality", a new formulation of evaluation rooted in information theory.
We show how these decomposed dimensions of semantics and fluency provide greater system-level insight into captioner differences.
Our proposed metrics, along with their combination, SMURF, achieve state-of-the-art correlation with human judgment when compared with other rule-based evaluation metrics.
arXiv Detail & Related papers (2021-06-02T19:58:20Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)
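In the spirit of the confusion-reduction criterion above (an approximation, not the paper's exact acquisition function), here is a sketch that scores sentences by how nearly tied the tagger's probabilities are for a confusable tag pair:

```python
# Sketch in the spirit of confusion-reducing active learning (an
# approximation, not the paper's exact criterion): prefer sentences where
# the tagger's probabilities for a confusable tag pair are nearly tied.
import torch

def confusion_score(tag_log_probs: torch.Tensor, pair=None) -> float:
    """tag_log_probs: (num_tokens, num_tags) per-token log distribution."""
    probs = tag_log_probs.exp()
    if pair is not None:
        i, j = pair  # indices of a confusable pair, e.g. NOUN vs VERB
        margins = (probs[:, i] - probs[:, j]).abs()
    else:
        top2 = probs.topk(2, dim=-1).values
        margins = top2[:, 0] - top2[:, 1]
    # Smaller margins mean more confusion; higher score = better AL candidate.
    return (1.0 - margins).mean().item()

# Select the unlabeled sentences with the highest confusion_score for annotation.
```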
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
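A hedged sketch of the joint-topic idea from the last entry, assuming pre-trained word vectors are available (the paper's actual training objective is more involved): each ⟨sentiment, aspect⟩ pair is represented by the mean embedding of a few seed words, and reviews are scored by cosine similarity:

```python
# Hedged sketch of <sentiment, aspect> joint topics via seed-word embeddings
# (the paper's actual objective is more involved). `word_vecs` maps
# word -> numpy vector, e.g. from pre-trained word embeddings.
import numpy as np

def topic_embedding(seed_words, word_vecs):
    """Mean of the available seed-word vectors."""
    return np.mean([word_vecs[w] for w in seed_words if w in word_vecs], axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def score_review(tokens, topics, word_vecs):
    """Similarity of the review's mean embedding to each joint topic."""
    review_vec = np.mean([word_vecs[w] for w in tokens if w in word_vecs], axis=0)
    return {pair: cosine(review_vec, emb) for pair, emb in topics.items()}

# topics = {("positive", "food"): topic_embedding(["tasty", "delicious"], wv), ...}
```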