Related papers: GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes

GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes

URL: http://arxiv.org/abs/2410.08388v2
Date: Thu, 17 Oct 2024 20:33:28 GMT
Title: GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes
Authors: Maximus Powers, Umang Mavani, Harshitha Reddy Jonala, Ansh Tiwari, Hua Wei,
Abstract summary: This paper introduces GUS-Net, an innovative approach to bias detection. GUS-Net focuses on three key types of biases: (G)eneralizations, (U)nfairness, and (S)tereotypes. Our methodology enhances traditional bias detection methods by incorporating the contextual encodings of pre-trained models.
Score: 2.2162879952427343
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The detection of bias in natural language processing (NLP) is a critical challenge, particularly with the increasing use of large language models (LLMs) in various domains. This paper introduces GUS-Net, an innovative approach to bias detection that focuses on three key types of biases: (G)eneralizations, (U)nfairness, and (S)tereotypes. GUS-Net leverages generative AI and automated agents to create a comprehensive synthetic dataset, enabling robust multi-label token classification. Our methodology enhances traditional bias detection methods by incorporating the contextual encodings of pre-trained models, resulting in improved accuracy and depth in identifying biased entities. Through extensive experiments, we demonstrate that GUS-Net outperforms state-of-the-art techniques, achieving superior performance in terms of accuracy, F1-score, and Hamming Loss. The findings highlight GUS-Net's effectiveness in capturing a wide range of biases across diverse contexts, making it a valuable tool for social bias detection in text. This study contributes to the ongoing efforts in NLP to address implicit bias, providing a pathway for future research and applications in various fields. The Jupyter notebooks used to create the dataset and model are available at: https://github.com/Ethical-Spectacle/fair-ly/tree/main/resources. Warning: This paper contains examples of harmful language, and reader discretion is recommended.

Related papers

Multimodal Approaches to Fair Image Classification: An Ethical Perspective [0.0]
This thesis explores the intersection of technology and ethics in the development of fair image classification models. I focus on improving fairness and methods of using multiple modalities to combat harmful demographic bias. The study critically examines existing biases in image datasets and classification algorithms, proposes innovative methods for mitigating these biases, and evaluates the ethical implications of deploying such systems in real-world scenarios.
arXiv Detail & Related papers (2024-12-11T19:58:31Z)
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [68.62012304574012]
multimodal generative models have sparked critical discussions on their reliability, fairness and potential for misuse.<n>We propose an evaluation framework to assess model reliability by analyzing responses to global and local perturbations in the embedding space.<n>Our method lays the groundwork for detecting unreliable, bias-injected models and tracing the provenance of embedded biases.
arXiv Detail & Related papers (2024-11-21T09:46:55Z)
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation [3.328297368052458]
We tackle bias detection in medical curricula using NLP models, including LLMs. We evaluate them on a gold standard dataset containing 4,105 excerpts annotated by medical experts for bias from a large corpus.
arXiv Detail & Related papers (2024-09-11T17:10:20Z)
Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings [5.257719744958367]
This thesis explores three challenging settings in text classification by leveraging the intrinsic knowledge of pretrained language models (PLMs) We develop models that utilize features based on contextualized word representations from PLMs, achieving performance that rivals or surpasses human accuracy. Lastly, we tackle the sensitivity of large language models to in-context learning prompts by selecting effective demonstrations.
arXiv Detail & Related papers (2024-08-28T09:07:30Z)
Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods. We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results. We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
Generative Multi-modal Models are Good Class-Incremental Learners [51.5648732517187]
We propose a novel generative multi-modal model (GMM) framework for class-incremental learning. Our approach directly generates labels for images using an adapted generative model. Under the Few-shot CIL setting, we have improved by at least 14% accuracy over all the current state-of-the-art methods with significantly less forgetting.
arXiv Detail & Related papers (2024-03-27T09:21:07Z)
Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning. We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL) Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes of the generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z)
Leveraging Biases in Large Language Models: "bias-kNN'' for Effective Few-Shot Learning [36.739829839357995]
This study introduces a novel methodology named bias-kNN'' This approach capitalizes on the biased outputs, harnessing them as primary features for kNN and supplementing with gold labels. Our comprehensive evaluations, spanning diverse domain text classification datasets and different GPT-2 model sizes, indicate the adaptability and efficacy of the bias-kNN'' method.
arXiv Detail & Related papers (2024-01-18T08:05:45Z)
Social Bias Probing: Fairness Benchmarking for Language Models [38.180696489079985]
This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment. We curate SoFa, a large-scale benchmark designed to address the limitations of existing fairness collections. We show that biases within language models are more nuanced than acknowledged, indicating a broader scope of encoded biases than previously recognized.
arXiv Detail & Related papers (2023-11-15T16:35:59Z)
Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Pretraining (CLIP) We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy. Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs) Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations. Our approach CIE not only significantly enhances the performance of GNNs but outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs) We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing. We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
NBIAS: A Natural Language Processing Framework for Bias Identification in Text [9.486702261615166]
Bias in textual data can lead to skewed interpretations and outcomes when the data is used. An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people. We develop a comprehensive framework NBIAS that consists of four main layers: data, corpus construction, model development and an evaluation layer.
arXiv Detail & Related papers (2023-08-03T10:48:30Z)
CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models [52.25049362267279]
We present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models. The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control. Extensive experiments demonstrate the effectiveness of the dataset in detecting model bias, with all 10 publicly available Chinese large language models exhibiting strong bias in certain categories.
arXiv Detail & Related papers (2023-06-28T14:14:44Z)
Soft-prompt Tuning for Large Language Models to Evaluate Bias [0.03141085922386211]
Using soft-prompts to evaluate bias gives us the extra advantage of avoiding the human-bias injection. We check the model biases on different sensitive attributes using the group fairness (bias) and find interesting bias patterns.
arXiv Detail & Related papers (2023-06-07T19:11:25Z)
DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations. Our model jointly optimize for two fairness criteria - group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding. We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias [2.6304695993930594]
We present a survey to comprehend bias in large pre-trained language models, analyze the stages at which they occur, and various ways in which these biases could be quantified and mitigated. Considering wide applicability of textual affective computing based downstream tasks in real-world systems such as business, healthcare, education, etc., we give a special emphasis on investigating bias in the context of affect (emotion) i.e., Affective Bias. We present a summary of various bias evaluation corpora that help to aid future research and discuss challenges in the research on bias in pre-trained language models.
arXiv Detail & Related papers (2022-04-21T18:51:19Z)
General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space. GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.