CHBias: Bias Evaluation and Mitigation of Chinese Conversational
Language Models
- URL: http://arxiv.org/abs/2305.11262v1
- Date: Thu, 18 May 2023 18:58:30 GMT
- Title: CHBias: Bias Evaluation and Mitigation of Chinese Conversational
Language Models
- Authors: Jiaxu Zhao, Meng Fang, Zijing Shi, Yitong Li, Ling Chen, Mykola
Pechenizkiy
- Abstract summary: We introduce a new Chinese dataset, CHBias, for bias evaluation and mitigation of Chinese conversational language models.
We evaluate two popular pretrained Chinese conversational models, CDial-GPT and EVA2.0, using CHBias.
- Score: 30.400023506841503
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Warning: This paper contains content that may be offensive or
upsetting. Pretrained conversational agents have been exposed to safety
issues, exhibiting a range of stereotypical human biases such as gender bias.
However, current research covers only a limited set of bias categories, and
most studies focus exclusively on English. In this paper, we introduce a new
Chinese dataset, CHBias, for bias evaluation and mitigation of Chinese
conversational language models. Beyond previously well-explored bias
categories, CHBias covers under-explored categories such as ageism and
appearance bias. We evaluate two popular
pretrained Chinese conversational models, CDial-GPT and EVA2.0, using CHBias.
Furthermore, to mitigate different biases, we apply several debiasing methods
to the Chinese pretrained models. Experimental results show that these
pretrained models risk generating text that contains social biases, and that
debiasing with the proposed dataset makes response generation less biased
while preserving the models' conversational capabilities.
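The abstract does not spell out the bias metric. One common recipe for this style of evaluation, used by RedditBias (listed under related papers below), compares a model's perplexity on sentence pairs that differ only in the demographic term. A minimal sketch, assuming a generic HuggingFace causal LM; the checkpoint name and the example pair are illustrative, not taken from CHBias:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name is illustrative; any Chinese causal LM works the same way.
MODEL = "uer/gpt2-chinese-cluecorpussmall"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def perplexity(text: str) -> float:
    """Token-level perplexity of `text` under the model."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return torch.exp(loss).item()

# Hypothetical CHBias-style pair: same sentence, demographic terms swapped.
pairs = [("老年人学不会用智能手机。",   # stereotypical (ageism) variant
          "年轻人学不会用智能手机。")]  # counterpart variant

biased = 0
for target_sent, anti_sent in pairs:
    # The model prefers the stereotypical sentence if its perplexity is lower.
    biased += perplexity(target_sent) < perplexity(anti_sent)

print(f"bias rate: {biased / len(pairs):.2f}")
```

Aggregating such comparisons over many pairs per category, and testing whether the perplexity difference is significant, yields a per-category bias score.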
Related papers
- Bias Beyond English: Evaluating Social Bias and Debiasing Methods in a Low-Resource Setting [8.478711218359532]
Social bias in language models can potentially exacerbate social inequalities.
This study aims to leverage high-resource language corpora to evaluate bias and experiment with debiasing methods in low-resource languages.
arXiv Detail & Related papers (2025-04-15T13:40:22Z)
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z)
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
- Current Topological and Machine Learning Applications for Bias Detection in Text [4.799066966918178]
This study utilizes the RedditBias database to analyze textual biases.
Four transformer models, including BERT and RoBERTa variants, were explored.
Findings suggest BERT, particularly mini BERT, excels in bias classification, while multilingual models lag.
arXiv Detail & Related papers (2023-11-22T16:12:42Z)
- CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models [52.25049362267279]
We present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models.
The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control.
Extensive experiments demonstrate the effectiveness of the dataset in detecting model bias, with all 10 publicly available Chinese large language models exhibiting strong bias in certain categories.
arXiv Detail & Related papers (2023-06-28T14:14:44Z)
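The CBBQ entry above describes deriving test instances automatically from manually authored templates. A toy illustration of that derivation step; the template and fill values are invented, not CBBQ's:

```python
from itertools import product

# Invented toy template and fill values; CBBQ's real templates are
# manually authored under stringent quality control.
template = "{group}在面试中表现得{attribute}。"  # "{group} seemed {attribute} in the interview."
groups = ["年轻人", "老年人"]       # young people, elderly people
attributes = ["很自信", "很紧张"]   # confident, nervous

# Each (group, attribute) combination yields one testing instance.
instances = [template.format(group=g, attribute=a)
             for g, a in product(groups, attributes)]
for s in instances:
    print(s)
```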
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
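The debiasing step described above, projecting a biased direction out of text embeddings, fits in a few lines of linear algebra. A minimal NumPy sketch with toy vectors; it omits the paper's calibrated projection matrix:

```python
import numpy as np

def project_out(embeddings: np.ndarray, bias_dir: np.ndarray) -> np.ndarray:
    """Remove the component of each row of `embeddings` along `bias_dir`."""
    b = bias_dir / np.linalg.norm(bias_dir)
    # Equivalent to applying P = I - b b^T, the projection onto
    # the subspace orthogonal to b.
    return embeddings - np.outer(embeddings @ b, b)

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))   # toy text embeddings
bias = rng.normal(size=8)       # e.g. mean("man") - mean("woman") embedding

debiased = project_out(emb, bias)
# Debiased embeddings have zero component along the bias direction.
print(np.allclose(debiased @ (bias / np.linalg.norm(bias)), 0))  # True
```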
- An Analysis of Social Biases Present in BERT Variants Across Multiple Languages [0.0]
We investigate the bias present in monolingual BERT models across a diverse set of languages.
We propose a template-based method to measure any kind of bias, based on sentence pseudo-likelihood.
We conclude that current methods of probing for bias are highly language-dependent.
arXiv Detail & Related papers (2022-11-25T23:38:08Z)
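The sentence pseudo-likelihood mentioned above is typically computed by masking each token in turn and summing the masked-LM log-probabilities of the original tokens. A sketch assuming a HuggingFace masked LM; the checkpoint and the template sentences are illustrative, not the paper's:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-cased"  # illustrative monolingual BERT checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log P(token_i | rest) with each token masked in turn."""
    ids = tok(sentence, return_tensors="pt").input_ids[0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Compare templated sentences that differ only in the group term.
print(pseudo_log_likelihood("Women are bad at math."))
print(pseudo_log_likelihood("Men are bad at math."))
```

A large, consistent gap between the two scores is then read as evidence of stereotypical association.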
- The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color [73.70233477125781]
We show that reporting bias negatively impacts and inherently limits text-only training.
We then demonstrate that multimodal models can leverage their visual training to mitigate these effects.
arXiv Detail & Related papers (2021-10-15T16:28:17Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
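The abstract above does not give the weighting scheme. One plausible instantiation of instance reweighting assigns each example a weight inversely proportional to the frequency of its (label, author-group) pair, so the loss cannot reward exploiting that correlation; a sketch with toy data, not necessarily the paper's exact method:

```python
import torch
from collections import Counter

def balancing_weights(labels, groups):
    """Weight each instance by the inverse frequency of its
    (label, demographic-group) pair, normalized to mean 1."""
    counts = Counter(zip(labels, groups))
    w = torch.tensor([1.0 / counts[(y, g)] for y, g in zip(labels, groups)])
    return w * len(w) / w.sum()

# Toy data: binary labels correlated with two author groups.
labels = [1, 1, 1, 0, 0, 1]
groups = ["A", "A", "A", "B", "B", "B"]
weights = balancing_weights(labels, groups)

logits = torch.randn(6, 2)  # stand-in model outputs
loss = torch.nn.functional.cross_entropy(
    logits, torch.tensor(labels), reduction="none")
weighted_loss = (weights * loss).mean()  # use this in the training step
print(weighted_loss.item())
```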
- RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models [37.98671828283487]
Text representation models are prone to exhibit a range of societal biases.
Recent work has predominantly focused on measuring and mitigating bias in pretrained language models.
We present RedditBias, the first conversational dataset grounded in actual human conversations from Reddit.
arXiv Detail & Related papers (2021-06-07T11:22:39Z)
- The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification [36.361778457307636]
Deep text classification models can produce biased outcomes for texts written by authors of certain demographic groups.
In this paper, we first demonstrate that implicit bias exists in different text classification tasks for different demographic groups.
We then build a learning-based interpretation method to deepen our knowledge of implicit bias.
arXiv Detail & Related papers (2021-05-06T16:17:38Z)