NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling
Social Norm Adherence and Violation
- URL: http://arxiv.org/abs/2310.14563v2
- Date: Wed, 25 Oct 2023 02:00:19 GMT
- Title: NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling
Social Norm Adherence and Violation
- Authors: Oliver Li, Mallika Subramanian, Arkadiy Saakyan, Sky CH-Wang, Smaranda
Muresan
- Abstract summary: We present a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures.
Our dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline.
- Score: 18.605252945314724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social norms fundamentally shape interpersonal communication. We present
NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations
of social norm adherences and violations for Chinese and American cultures.
We introduce the task of social norm observance detection. Our dataset is
synthetically generated in both Chinese and English using a human-in-the-loop
pipeline that prompts large language models with a small collection of
expert-annotated social norms. We show that our generated dialogues are of high
quality through human evaluation and further evaluate the performance of
existing large language models on this task. Our findings point towards new
directions for understanding the nuances of social norms as they manifest in
conversational contexts that span across languages and cultures.
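The generation pipeline described in the abstract (prompting a large language model with expert-annotated norms, then keeping only human-approved dialogues) can be sketched as follows. This is a minimal illustration, not the authors' actual implementation: the prompt wording, the `Norm` structure, and the `llm`/`human_ok` callables are all hypothetical stand-ins for a real model client and a human annotation interface.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Norm:
    culture: str  # e.g. "Chinese" or "American"
    text: str     # expert-annotated norm statement

def build_prompt(norm: Norm, status: str) -> str:
    """Compose a generation prompt asking for a dyadic dialogue whose
    turns are labeled for norm adherence or violation."""
    return (
        f"Write a two-person dialogue set in a {norm.culture} context.\n"
        f"Social norm: {norm.text}\n"
        f"The dialogue should show this norm being {status}.\n"
        "Label every turn with [ADHERED], [VIOLATED], or [NEUTRAL]."
    )

def generate_dialogues(
    norms: List[Norm],
    llm: Callable[[str], str],        # any text-in/text-out model client
    human_ok: Callable[[str], bool],  # human-in-the-loop quality check
) -> List[str]:
    """Prompt once per (norm, status) pair; keep only dialogues that a
    human annotator approves."""
    kept = []
    for norm in norms:
        for status in ("adhered to", "violated"):
            dialogue = llm(build_prompt(norm, status))
            if human_ok(dialogue):  # discard low-quality generations
                kept.append(dialogue)
    return kept
```

In a real setup, `llm` would wrap a chat-completion API call and `human_ok` would route candidate dialogues to annotators rather than a predicate.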
Related papers
- ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions [47.85181608392683]
We employ ValueScope to dissect and analyze linguistic and stylistic expressions across 13 Reddit communities.
Our analysis provides a quantitative foundation showing that even closely related communities exhibit remarkably diverse norms.
arXiv Detail & Related papers (2024-07-02T17:51:27Z)
- Measuring Social Norms of Large Language Models [13.648679166997693]
We present a new challenge to examine whether large language models understand social norms.
Our dataset features the largest set of social norm skills, consisting of 402 skills and 12,383 questions.
We propose a multi-agent framework based on large language models to improve the models' ability to understand social norms.
arXiv Detail & Related papers (2024-04-03T05:58:57Z)
- RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations [46.634702800643566]
ReNoVi is a large-scale corpus of 9,258 multi-turn dialogues annotated with social norms.
ReNoVi consists of two parts: 512 human-authored dialogues (real data), and 8,746 synthetic conversations generated by ChatGPT through prompt learning.
arXiv Detail & Related papers (2024-02-17T03:13:42Z)
- Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships [7.415975372963896]
We introduce a new approach to identifying inappropriate communication by explicitly modeling the social relationship between the individuals.
We show that large language models can readily incorporate relationship information to accurately identify appropriateness in a given context.
We also demonstrate that contextual-appropriateness judgments are predictive of other social factors expressed in language such as condescension and politeness.
arXiv Detail & Related papers (2023-07-06T04:06:05Z)
- Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment [31.929550141633218]
We propose a novel approach to discover and compare social norms across Chinese and American cultures.
We build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures.
To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment.
arXiv Detail & Related papers (2023-05-23T19:43:47Z)
- NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly [61.77957329364812]
We introduce a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery.
NormSAGE elicits knowledge about norms through directed questions representing the norm discovery task and conversation context.
It further addresses the risk of language model hallucination with a self-verification mechanism ensuring that the norms discovered are correct.
arXiv Detail & Related papers (2022-10-16T18:30:05Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions? [62.74872383104381]
We investigate the effectiveness of natural language interventions for reading-comprehension systems.
We propose a new language understanding task, Linguistic Ethical Interventions (LEI), where the goal is to amend a question-answering (QA) model's unethical behavior.
arXiv Detail & Related papers (2021-06-02T20:57:58Z)
- On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems [17.749995931459136]
We propose that a metric based on linguistic features may be able to maintain good correlation with human judgment and be interpretable.
To support this proposition, we measure and analyze various linguistic features on dialogues produced by multiple dialogue models.
We find that the features' behaviour is consistent with the known properties of the models tested, and is similar across domains.
arXiv Detail & Related papers (2021-04-13T16:28:00Z)
- Can You be More Social? Injecting Politeness and Positivity into Task-Oriented Conversational Agents [60.27066549589362]
Social language used by human agents is associated with greater users' responsiveness and task completion.
The model uses a sequence-to-sequence deep learning architecture, extended with a social language understanding element.
Evaluation in terms of content preservation and social language level using both human judgment and automatic linguistic measures shows that the model can generate responses that enable agents to address users' issues in a more socially appropriate way.
arXiv Detail & Related papers (2020-12-29T08:22:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.