Challenges in Annotating Datasets to Quantify Bias in Under-represented Society
- URL: http://arxiv.org/abs/2309.08624v1
- Date: Mon, 11 Sep 2023 22:24:39 GMT
- Title: Challenges in Annotating Datasets to Quantify Bias in Under-represented Society
- Authors: Vithya Yogarajan, Gillian Dobbie, Timothy Pistotti, Joshua Bensemann, Kobe Knowles
- Abstract summary: Benchmark bias datasets have been developed for binary gender classification and ethical/racial considerations.
Motivated by the lack of annotated datasets for quantifying bias in under-represented societies, we created benchmark datasets for the New Zealand (NZ) population.
This research outlines the manual annotation process, provides an overview of the challenges we encountered and lessons learnt, and presents recommendations for future research.
- Score: 7.9342597513806865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in artificial intelligence, including the development of
highly sophisticated large language models (LLMs), have proven beneficial in
many real-world applications. However, evidence of inherent bias encoded in
these LLMs has raised concerns about equity. In response, there has been an
increase in research dealing with bias, including studies focusing on
quantifying bias and developing debiasing techniques. Benchmark bias datasets
have also been developed for binary gender classification and ethical/racial
considerations, focusing predominantly on American demographics. However, there
is minimal research in understanding and quantifying bias related to
under-represented societies. Motivated by the lack of annotated datasets for
quantifying bias in under-represented societies, we endeavoured to create
benchmark datasets for the New Zealand (NZ) population. We faced many
challenges in this process, despite the availability of three annotators. This
research outlines the manual annotation process, provides an overview of the
challenges we encountered and lessons learnt, and presents recommendations for
future research.
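
The abstract mentions that three annotators were available for the manual annotation process. One concrete way to quantify how well such a small annotator pool agrees is Fleiss' kappa; the sketch below is a minimal, self-contained illustration using hypothetical binary bias labels, not data or code from this paper.

# A minimal agreement check for three annotators: Fleiss' kappa.
# All labels below are hypothetical examples, not data from this paper.
from collections import Counter

def fleiss_kappa(label_rows):
    """label_rows: one list of labels per item, one label per annotator."""
    categories = sorted({lab for row in label_rows for lab in row})
    n_items = len(label_rows)
    n_raters = len(label_rows[0])  # assumes every item has the same number of raters

    # Count how many annotators chose each category for each item.
    counts = [[Counter(row)[c] for c in categories] for row in label_rows]

    # Observed agreement: mean of the per-item agreement values.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Expected agreement from the marginal category proportions.
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    return (p_bar - p_e) / (1 - p_e)

# Three annotators labelling whether each example is biased ("B") or not ("N").
annotations = [
    ["B", "B", "B"],
    ["B", "N", "B"],
    ["N", "N", "N"],
    ["B", "N", "N"],
]
print(f"Fleiss' kappa: {fleiss_kappa(annotations):.3f}")

A kappa near 1 indicates near-perfect agreement, values near 0 indicate agreement no better than chance, and the statistic extends directly to more than two label categories.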
Related papers
- Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs [0.0]
Large Language Models (LLMs) are being adopted across a wide range of tasks.
Recent research indicates that LLMs can harbor implicit biases even when they pass explicit bias evaluations.
This study highlights that newer or larger language models do not automatically exhibit reduced bias.
arXiv Detail & Related papers (2024-10-13T03:43:18Z)
- A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions [0.0]
Large Language Models (LLMs) have revolutionized various applications in natural language processing (NLP) by providing unprecedented text generation, translation, and comprehension capabilities.
Their widespread deployment has brought to light significant concerns regarding biases embedded within these models.
This paper presents a comprehensive survey of biases in LLMs, aiming to provide an extensive review of the types, sources, impacts, and mitigation strategies related to these biases.
arXiv Detail & Related papers (2024-09-24T19:50:38Z)
- Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data [9.90951705988724]
Large Language Models (LLMs) are prone to inheriting and amplifying societal biases.
LLM bias can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities.
arXiv Detail & Related papers (2024-08-20T23:54:26Z)
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated Fair Forgery Detection (FairFD) dataset, on which we demonstrate the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics, including the Approach Averaged Metric and the Utility Regularized Metric, which avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs).
We construct a dataset encompassing nine distinct categories of social bias, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, plus two intersectional bias categories (race x gender and race x socioeconomic status).
We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing new insights into the biases revealed by these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
Existing evaluation methods have many constraints, and their results offer limited interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z)
- Tackling Bias in Pre-trained Language Models: Current Trends and Under-represented Societies [6.831519625084861]
This research presents a survey synthesising the current trends and limitations in techniques used for identifying and mitigating bias in language models.
We argue that current practices tackling the bias problem cannot simply be 'plugged in' to address the needs of under-represented societies.
arXiv Detail & Related papers (2023-12-03T21:25:10Z)
- Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with high-level insight into the similarities and differences among social bias studies of pre-trained models across NLP, CV, and VL.
The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models [52.25049362267279]
We present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models.
The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control.
Extensive experiments demonstrate the effectiveness of the dataset in detecting model bias, with all 10 publicly available Chinese large language models exhibiting strong bias in certain categories.
arXiv Detail & Related papers (2023-06-28T14:14:44Z)
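
The CBBQ entry above describes deriving test instances automatically from manually authored templates. The sketch below shows how such template expansion could look in Python; the template text and demographic group list are invented for illustration (using NZ-relevant groups in the spirit of the main paper) and are not taken from CBBQ or from the NZ benchmark datasets.

# Hypothetical sketch of expanding hand-written templates into test instances,
# in the spirit of template-based bias benchmarks such as CBBQ/BBQ.
# The template text and group list are invented for illustration only.
from itertools import permutations

TEMPLATES = [
    "{group_a} and {group_b} both applied for the same engineering job. "
    "Who is less qualified for the role?",
]

GROUPS = ["a Māori applicant", "a Pākehā applicant", "an older applicant"]

def expand(templates, groups):
    """Fill every template with every ordered pair of distinct groups."""
    instances = []
    for template in templates:
        for group_a, group_b in permutations(groups, 2):
            instances.append(template.format(group_a=group_a, group_b=group_b))
    return instances

for sentence in expand(TEMPLATES, GROUPS):
    print(sentence)

Each template is filled with every ordered pair of distinct groups, so a few thousand hand-written templates can yield a much larger set of test instances, which is the scaling effect the CBBQ summary refers to.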