LLM-Human Pipeline for Cultural Context Grounding of Conversations
- URL: http://arxiv.org/abs/2410.13727v1
- Date: Thu, 17 Oct 2024 16:33:01 GMT
- Title: LLM-Human Pipeline for Cultural Context Grounding of Conversations
- Authors: Rajkumar Pujari, Dan Goldwasser
- Abstract summary: Adherence or violation of social norms often dictates the tenor of conversations.
We introduce a "Cultural Context" for conversations.
We generate 110k social norm and violation descriptions for 23k conversations from Chinese culture.
- Score: 22.016345507132808
- Abstract: Conversations often adhere to well-understood social norms that vary across cultures. For example, while "addressing parents by name" is commonplace in the West, it is rare in most Asian cultures. Adherence or violation of such norms often dictates the tenor of conversations. Humans are able to navigate social situations requiring cultural awareness quite adeptly. However, it is a hard task for NLP models. In this paper, we tackle this problem by introducing a "Cultural Context Schema" for conversations. It comprises (1) conversational information such as emotions, dialogue acts, etc., and (2) cultural information such as social norms, violations, etc. We generate ~110k social norm and violation descriptions for ~23k conversations from Chinese culture using LLMs. We refine them using automated verification strategies which are evaluated against culturally aware human judgements. We organize these descriptions into meaningful structures we call "Norm Concepts", using an interactive human-in-loop framework. We ground the norm concepts and the descriptions in conversations using symbolic annotation. Finally, we use the obtained dataset for downstream tasks such as emotion, sentiment, and dialogue act detection. We show that it significantly improves the empirical performance.
Related papers
- Scalable Frame-based Construction of Sociocultural NormBases for Socially-Aware Dialogues [66.69453609603875]
Sociocultural norms serve as guiding principles for personal conduct in social interactions.
We propose a scalable approach for constructing a Sociocultural Norm (SCN) Base using Large Language Models (LLMs).
We construct a comprehensive and publicly accessible Chinese Sociocultural NormBase.
arXiv Detail & Related papers (2024-10-04T00:08:46Z) - Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT [4.798444680860121]
We compare responses generated by ChatGPT with those provided by native Hausa speakers on 37 culturally relevant questions.
Our results show that ChatGPT has some level of similarity to human responses, but also exhibits some gaps and biases in its knowledge and awareness of the Hausa culture and emotions.
arXiv Detail & Related papers (2024-06-27T19:42:13Z) - NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling
Social Norm Adherence and Violation [18.605252945314724]
We present a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures.
Our dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline.
arXiv Detail & Related papers (2023-10-23T04:38:34Z) - Sociocultural Norm Similarities and Differences via Situational
Alignment and Explainable Textual Entailment [31.929550141633218]
We propose a novel approach to discover and compare social norms across Chinese and American cultures.
We build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures.
To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment.
arXiv Detail & Related papers (2023-05-23T19:43:47Z) - SODA: Million-scale Dialogue Distillation with Social Commonsense
Contextualization [129.1927527781751]
We present SODA, the first publicly available, million-scale high-quality social dialogue dataset.
By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions.
Human evaluation shows that conversations in SODA are more consistent, specific, and (surprisingly) natural than those in prior human-authored datasets.
arXiv Detail & Related papers (2022-12-20T17:38:47Z) - NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations
On-the-Fly [61.77957329364812]
We introduce a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery.
NormSAGE elicits knowledge about norms through directed questions representing the norm discovery task and conversation context.
It further addresses the risk of language model hallucination with a self-verification mechanism ensuring that the norms discovered are correct.
arXiv Detail & Related papers (2022-10-16T18:30:05Z) - In conversation with Artificial Intelligence: aligning language models
with human values [4.56877715768796]
Large-scale language technologies are increasingly used in various forms of communication with humans across different contexts.
One particular use case for these technologies is conversational agents, which output natural language text in response to prompts and queries.
This mode of engagement raises a number of social and ethical questions.
We propose a number of steps that help answer these questions.
arXiv Detail & Related papers (2022-09-01T21:16:47Z) - CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset
for Conversational AI [48.67259855309959]
Most existing datasets for conversational AI ignore human personalities and emotions.
We propose CPED, a large-scale Chinese personalized and emotional dialogue dataset.
CPED contains more than 12K dialogues of 392 speakers from 40 TV shows.
arXiv Detail & Related papers (2022-05-29T17:45:12Z) - Deception detection in text and its relation to the cultural dimension
of individualism/collectivism [6.17866386107486]
We investigate if differences in the usage of specific linguistic features of deception across cultures can be confirmed and attributed to norms in respect to the individualism/collectivism divide.
We create culture/language-aware classifiers by experimenting with a wide range of n-gram features based on phonology, morphology and syntax.
We conducted our experiments over 11 datasets in five languages (English, Dutch, Russian, Spanish, and Romanian) from six countries (US, Belgium, India, Russia, Mexico, and Romania).
arXiv Detail & Related papers (2021-05-26T13:09:47Z) - Social Chemistry 101: Learning to Reason about Social and Moral Norms [73.23298385380636]
We present Social Chemistry, a new conceptual formalism to study people's everyday social norms and moral judgments.
Social-Chem-101 is a large-scale corpus that catalogs 292k rules-of-thumb.
Our model framework, Neural Norm Transformer, learns and generalizes Social-Chem-101 to successfully reason about previously unseen situations.
arXiv Detail & Related papers (2020-11-01T20:16:45Z) - I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents [69.68400056148336]
We train a goal-oriented model with reinforcement learning against an imitation-learned "chit-chat" model.
We show that both models outperform an inverse model baseline and can converse naturally with their dialogue partner in order to achieve goals.
arXiv Detail & Related papers (2020-02-07T16:22:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.