Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
- URL: http://arxiv.org/abs/2202.08011v1
- Date: Wed, 16 Feb 2022 11:59:29 GMT
- Title: Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
- Authors: Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng
- Abstract summary: In this paper, we focus our investigation on social bias detection, a key dialog safety problem.
We first propose a novel Dial-Bias Frame for pragmatically analyzing social bias in conversations.
We introduce the CDial-Bias Dataset, the first well-annotated Chinese social bias dialog dataset.
- Score: 95.29345070102045
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Research on open-domain dialog systems has been greatly advanced by neural models trained on large-scale corpora; however, such corpora often introduce various safety problems (e.g., offensive language, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice. Among these safety issues, social bias is especially complex to address because its negative impact on marginalized populations is usually expressed implicitly, thus requiring normative reasoning and rigorous analysis. In this paper, we focus our investigation on social bias detection, a key dialog safety problem. We first propose a novel Dial-Bias Frame for pragmatically analyzing social bias in conversations, which supports more comprehensive bias-related analyses than simple dichotomous annotations. Based on the proposed framework, we further introduce the CDial-Bias Dataset, which is, to our knowledge, the first well-annotated Chinese social bias dialog dataset. In addition, we establish several dialog bias detection benchmarks at different label granularities and input types (utterance-level and context-level). We show that the in-depth analyses and benchmarks enabled by our Dial-Bias Frame are essential to bias detection tasks and can benefit the building of safe dialog systems in practice.
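To make the two benchmark input types concrete, below is a minimal sketch of an annotation record in the spirit of the Dial-Bias Frame and of how utterance-level versus context-level classifier inputs could be assembled. All field names, label values, and the [SEP] joining convention are illustrative assumptions, not the paper's exact schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical annotation record in the spirit of the Dial-Bias Frame.
# Field names and label values are illustrative assumptions, not the
# paper's exact schema.
@dataclass
class DialBiasAnnotation:
    context: List[str]              # preceding dialog turns
    response: str                   # the turn being annotated
    bias_label: str                 # e.g. "biased", "anti-bias", "neutral"
    target_group: Optional[str]     # e.g. "gender", "region"
    expression_type: Optional[str]  # e.g. "explicit" vs. "implicit"

def build_input(ann: DialBiasAnnotation, context_level: bool) -> str:
    """Utterance-level uses only the response; context-level prepends
    the dialog history."""
    if context_level:
        return " [SEP] ".join(ann.context + [ann.response])
    return ann.response

# The same annotated turn yields two different classifier inputs.
ann = DialBiasAnnotation(
    context=["What do you think of people from that region?"],
    response="They are all the same, honestly.",
    bias_label="biased",
    target_group="region",
    expression_type="implicit",
)
print(build_input(ann, context_level=False))  # response only
print(build_input(ann, context_level=True))   # history + response
```

Running the example prints the bare response in the utterance-level setting and the history-prefixed string in the context-level setting, which is exactly the distinction the benchmarks probe.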
Related papers
- 'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue [3.971267935825097]
This paper aims to improve the user experience in multi-round long dialogues by efficiently detecting out-of-distribution (OOD) dialogues and images.
We introduce a novel scoring framework, the Dialogue Image Aligning and Enhancing Framework (DIAEF), which integrates vision-language models with the proposed scores.
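As a rough illustration of how a score-based OOD rule works (a generic sketch, not DIAEF itself; the cosine alignment score and threshold are assumptions):

```python
import numpy as np

# Hedged sketch of score-based OOD detection for dialogue-image pairs.
# The cosine alignment score and threshold rule are generic assumptions
# for illustration; DIAEF's actual scores are more elaborate.
def alignment_score(dialogue_emb: np.ndarray, image_emb: np.ndarray) -> float:
    """Cosine similarity between a dialogue embedding and an image embedding."""
    denom = np.linalg.norm(dialogue_emb) * np.linalg.norm(image_emb)
    return float(dialogue_emb @ image_emb / denom) if denom else 0.0

def is_ood(dialogue_emb: np.ndarray, image_emb: np.ndarray,
           tau: float = 0.3) -> bool:
    """Flag a dialogue-image pair as out-of-distribution when alignment is low."""
    return alignment_score(dialogue_emb, image_emb) < tau
```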
arXiv Detail & Related papers (2024-10-31T12:45:54Z)
- Scalable Frame-based Construction of Sociocultural NormBases for Socially-Aware Dialogues [66.69453609603875]
Sociocultural norms serve as guiding principles for personal conduct in social interactions.
We propose a scalable approach for constructing a Sociocultural Norm (SCN) Base using Large Language Models (LLMs).
We construct a comprehensive and publicly accessible Chinese Sociocultural NormBase.
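For intuition, a hedged sketch of LLM-driven norm extraction follows; the prompt wording and the query_llm callable are hypothetical stand-ins, and the paper's actual frame-based pipeline is more elaborate.

```python
from typing import Callable

# Hypothetical sketch of LLM-driven norm extraction. The prompt and the
# query_llm callable are assumptions, not the paper's pipeline.
NORM_PROMPT = (
    "Read the following dialogue and state, in one sentence, the "
    "sociocultural norm that the speakers are following or violating.\n\n"
    "Dialogue:\n{dialogue}\n\nNorm:"
)

def extract_norm(dialogue: str, query_llm: Callable[[str], str]) -> str:
    """query_llm maps a prompt string to a model completion."""
    return query_llm(NORM_PROMPT.format(dialogue=dialogue)).strip()
```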
arXiv Detail & Related papers (2024-10-04T00:08:46Z)
- Context Does Matter: Implications for Crowdsourced Evaluation Labels in Task-Oriented Dialogue Systems [57.16442740983528]
Crowdsourced labels play a crucial role in evaluating task-oriented dialogue systems.
Previous studies suggest using only a portion of the dialogue context in the annotation process.
This study investigates the influence of dialogue context on annotation quality.
arXiv Detail & Related papers (2024-04-15T17:56:39Z)
- An Empirical Bayes Framework for Open-Domain Dialogue Generation [27.83533924583182]
We propose an empirical Bayes framework for constructing an open-domain dialogue agent by leveraging pretrained parameters.
Empirical results show that BODEB achieves better results in terms of both diversity and coherence compared to variational frameworks.
arXiv Detail & Related papers (2023-11-18T02:48:41Z)
- A Deeper (Autoregressive) Approach to Non-Convergent Discourse Parsing [0.6599344783327052]
We present a unified model for Non-Convergent Discourse Parsing that does not require any additional input other than the previous dialog utterances.
Our model achieves results comparable to SOTA without using label collocation and without training a unique architecture/model for each label.
arXiv Detail & Related papers (2023-05-21T17:04:21Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- Dialogue Inspectional Summarization with Factual Inconsistency Awareness [34.97845384948336]
We investigate the factual inconsistency problem for Dialogue Inspectional Summarization (DIS) under non-pretraining and pretraining settings.
An innovative end-to-end dialogue summary generation framework is proposed with two auxiliary tasks.
Comprehensive experiments demonstrate that the proposed model can generate a more readable summary with accurate coverage of factual aspects.
arXiv Detail & Related papers (2021-11-05T06:26:22Z)
- On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark [42.322782754346406]
We propose a taxonomy for dialogue safety specifically designed to capture unsafe behaviors that are unique to the human-bot dialogue setting.
We compile DiaSafety, a dataset of 6 unsafe categories with rich context-sensitive unsafe examples.
Experiments show that existing utterance-level safety guarding tools fail catastrophically on our dataset.
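A toy illustration (not drawn from DiaSafety) of why context sensitivity matters: the reply below looks innocuous on its own, so an utterance-level classifier is likely to pass it, while a context-level check sees the pair.

```python
from typing import Callable

# Illustrative example, not from DiaSafety: a reply that is innocuous in
# isolation but unsafe given its context. The classifier is any callable
# mapping text to an unsafe/safe verdict.
context = "Nothing I do works out. Should I just give up on everything?"
reply = "Yes, you definitely should."

def unsafe_utterance_level(text: str,
                           classifier: Callable[[str], bool]) -> bool:
    return classifier(text)                   # sees only the reply

def unsafe_context_level(ctx: str, text: str,
                         classifier: Callable[[str], bool]) -> bool:
    return classifier(f"{ctx} [SEP] {text}")  # sees context + reply
```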
arXiv Detail & Related papers (2021-10-16T04:17:12Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
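A minimal sketch of the structured, utterance-pair idea under stated assumptions: each earlier utterance is paired with the last one, an NLI model scores the pair, and the maximum contradiction probability is taken. The aggregation rule and the nli_contradiction_prob callable are assumptions for illustration.

```python
from typing import Callable, List

# Sketch of structured utterance-based contradiction detection: score each
# (past utterance, last utterance) pair with an NLI model and aggregate by
# taking the maximum contradiction probability.
def contradiction_score(
    history: List[str],
    last_utterance: str,
    nli_contradiction_prob: Callable[[str, str], float],
) -> float:
    """Highest contradiction probability between the last utterance and
    any earlier utterance in the dialogue."""
    if not history:
        return 0.0
    return max(nli_contradiction_prob(u, last_utterance) for u in history)
```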
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually, reasoning over dialogue turns with the help of back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)