Challenges in Annotating Datasets to Quantify Bias in Under-represented
Society
- URL: http://arxiv.org/abs/2309.08624v1
- Date: Mon, 11 Sep 2023 22:24:39 GMT
- Title: Challenges in Annotating Datasets to Quantify Bias in Under-represented
Society
- Authors: Vithya Yogarajan, Gillian Dobbie, Timothy Pistotti, Joshua Bensemann,
Kobe Knowles
- Abstract summary: Benchmark bias datasets have been developed for binary gender classification and ethical/racial considerations.
Motivated by the lack of annotated datasets for quantifying bias in under-represented societies, we created benchmark datasets for the New Zealand (NZ) population.
This research outlines the manual annotation process, provides an overview of the challenges we encountered and lessons learnt, and presents recommendations for future research.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in artificial intelligence, including the development of
highly sophisticated large language models (LLMs), have proven beneficial in
many real-world applications. However, evidence of inherent bias encoded in
these LLMs has raised concerns about equity. In response, there has been an
increase in research dealing with bias, including studies focusing on
quantifying bias and developing debiasing techniques. Benchmark bias datasets
have also been developed for binary gender classification and ethical/racial
considerations, focusing predominantly on American demographics. However, there
is minimal research in understanding and quantifying bias related to
under-represented societies. Motivated by the lack of annotated datasets for
quantifying bias in under-represented societies, we endeavoured to create
benchmark datasets for the New Zealand (NZ) population. We faced many
challenges in this process, despite the availability of three annotators. This
research outlines the manual annotation process, provides an overview of the
challenges we encountered and lessons learnt, and presents recommendations for
future research.
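With three annotators, a standard way to characterise disagreement during such a manual annotation process is an inter-annotator agreement statistic such as Fleiss' kappa. The sketch below is illustrative only: the binary biased/not-biased scheme and the example labels are hypothetical, not data from the paper.

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of
    annotators. `ratings` is a list of per-item label lists."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({c for item in ratings for c in item})

    # Per-item observed agreement P_i and per-category proportions p_j.
    counts = [Counter(item) for item in ratings]
    P_i = [
        (sum(c * c for c in cnt.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_j = {
        cat: sum(cnt[cat] for cnt in counts) / (n_items * n_raters)
        for cat in categories
    }

    P_bar = sum(P_i) / n_items               # mean observed agreement
    P_e = sum(p * p for p in p_j.values())   # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical labels from three annotators on five sentences
# (1 = biased, 0 = not biased); not from the paper's dataset.
labels = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0]]
print(round(fleiss_kappa(labels), 3))  # → 0.444
```

Values near 0 indicate chance-level agreement, so a low kappa is one concrete way the annotation challenges the paper describes would surface.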
Related papers
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
Unlike existing forgery detection datasets, the self-constructed FairFD dataset contains a balanced racial ratio and diverse forged images with the largest-scale subjects.
We design novel metrics, including the Approach Averaged Metric and the Utility Regularized Metric, which can avoid deceptive results.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- Fairness and Bias in Multimodal AI: A Survey [0.20971479389679337]
We fill a gap with regard to the minimal study of fairness and bias in Large Multimodal Models.
We identify a new category of bias quantification (preuse).
We critically discuss the various ways researchers are addressing these challenges.
arXiv Detail & Related papers (2024-06-27T11:26:17Z)
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs).
We construct a dataset encompassing nine distinct categories of social bias, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, plus two intersectional bias categories (race x gender and race x socioeconomic status).
We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing new insights into the biases revealed by these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- Tackling Bias in Pre-trained Language Models: Current Trends and Under-represented Societies [6.831519625084861]
This research presents a survey synthesising the current trends and limitations in techniques used for identifying and mitigating bias in language models.
We argue that current practices tackling the bias problem cannot simply be 'plugged in' to address the needs of under-represented societies.
arXiv Detail & Related papers (2023-12-03T21:25:10Z)
- Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL.
The findings and recommendations presented here can benefit the ML community, fostering the development of fairer, less biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models [52.25049362267279]
We present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models.
The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control.
Extensive experiments demonstrate the effectiveness of the dataset in detecting model bias, with all 10 publicly available Chinese large language models exhibiting strong bias in certain categories.
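The template-to-instance derivation described above can be sketched roughly as follows; the template, slot names, and demographic terms are hypothetical placeholders for illustration, not CBBQ's actual templates or generation pipeline.

```python
from itertools import product

# Hypothetical template expansion: each manually authored template is
# instantiated with every combination of demographic term and judgement,
# yielding many test instances from few templates.
templates = [
    "The {group} applicant was {judgement} for the job.",
]
groups = ["older", "younger"]
judgements = ["qualified", "unqualified"]

instances = [
    t.format(group=g, judgement=j)
    for t, g, j in product(templates, groups, judgements)
]
for sentence in instances:
    print(sentence)
```

This combinatorial blow-up is how a few thousand templates can yield over 100K testing instances, while quality control effort stays concentrated on the templates themselves.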
arXiv Detail & Related papers (2023-06-28T14:14:44Z)
- Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing [98.07536837448293]
Large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics.
We introduce a list of desiderata for robustly measuring biases in generative language models.
We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
- Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition [1.5340540198612824]
Two metrics focus on the representational and stereotypical bias of the dataset, and the third one on the residual bias of the trained model.
We demonstrate the usefulness of the metrics by applying them to a FER problem based on the popular Affectnet dataset.
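As an illustration of what a representational-bias style measure can look like (this is not the metric defined in that paper), one can score how far a dataset's demographic counts deviate from a uniform reference; the group names and counts below are invented.

```python
import math

def representational_bias(counts):
    """KL divergence of observed group frequencies from a uniform
    target, scaled by its maximum (log k) into [0, 1]; 0 means
    perfectly balanced representation. Illustrative sketch only."""
    total = sum(counts.values())
    k = len(counts)
    kl = sum(
        (c / total) * math.log((c / total) * k)
        for c in counts.values()
        if c > 0
    )
    return kl / math.log(k)

# Hypothetical demographic counts in a facial-expression dataset.
counts = {"group_a": 700, "group_b": 200, "group_c": 100}
print(round(representational_bias(counts), 3))
```

A dataset-level score like this is computable before any model is trained, which is what distinguishes dataset bias metrics from the residual bias measured on the trained model.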
arXiv Detail & Related papers (2022-03-22T16:30:22Z)
- Representation Bias in Data: A Survey on Identification and Resolution Techniques [26.142021257838564]
Data-driven algorithms are only as good as the data they work with, yet data sets, especially social data, often fail to represent minorities adequately.
Representation Bias in data can happen due to various reasons ranging from historical discrimination to selection and sampling biases in the data acquisition and preparation methods.
This paper reviews the literature on identifying and resolving representation bias as a feature of a data set, independent of how it is consumed later.
arXiv Detail & Related papers (2022-03-22T16:30:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.