D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
- URL: http://arxiv.org/abs/2404.10857v1
- Date: Tue, 16 Apr 2024 19:12:03 GMT
- Title: D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
- Authors: Aida Mostafazadeh Davani, Mark Díaz, Dylan Baker, Vinodkumar Prabhakaran
- Abstract summary: We introduce D3CODE: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences, annotated by a pool of over 4k annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
- Score: 5.9053106775634685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context, and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within demographic groups may hold diverse values, which can influence their perceptions beyond their group norms. To effectively incorporate these considerations into NLP pipelines, we need datasets with extensive parallel annotations from various social and cultural groups. In this paper we introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4k annotators, balanced across gender and age, from across 21 countries, representing eight geo-cultural regions. The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity. Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values, offering crucial insights for building pluralistic, culturally sensitive NLP models.
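To make the kind of analysis described in the abstract concrete, below is a minimal sketch, assuming a hypothetical tabular export of the parallel annotations; the column names (`region`, `offensive`, and the per-foundation scores) are illustrative placeholders, not the authors' actual release schema. It shows how one might separately look at regional variation in offensiveness judgments and at how individual moral-foundation scores relate to those judgments.

```python
# Minimal sketch, assuming a hypothetical tabular export of the parallel
# annotations; column names below are illustrative placeholders, not the
# authors' actual release schema.
import pandas as pd
from scipy.stats import pearsonr

# Toy stand-in: one row per (annotator, sentence) judgment, carrying the
# annotator's geo-cultural region and two of the six moral-foundation scores.
df = pd.DataFrame({
    "sentence_id": [1, 1, 2, 2, 3, 3],
    "region": ["Western Europe", "South Asia"] * 3,
    "offensive": [0, 1, 1, 1, 0, 1],             # binary offensiveness label
    "care":   [3.2, 4.5, 2.8, 4.1, 3.0, 4.7],    # moral-foundation scores
    "purity": [2.1, 4.8, 1.9, 4.5, 2.4, 4.9],
})

# Regional variation: mean offensiveness rate per geo-cultural region.
print(df.groupby("region")["offensive"].mean())

# Individual moral values: correlate each foundation score with the judgments.
for foundation in ("care", "purity"):
    r, p = pearsonr(df[foundation], df["offensive"])
    print(f"{foundation}: r = {r:.2f} (p = {p:.3f})")
```

The snippet only illustrates the shape of the data; a real analysis would also control for sentence-level effects (e.g., per-item mixed models) before attributing disagreement to region or to moral values.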
Related papers
- Commonsense Reasoning in Arab Culture [6.116784716369165]
We introduce a commonsense reasoning dataset in Modern Standard Arabic (MSA), covering the cultures of 13 countries across the Gulf, Levant, North Africa, and the Nile Valley.
The dataset was built from scratch by engaging native speakers to write and validate culturally relevant questions for their respective countries.
The dataset spans 12 daily-life domains with 54 fine-grained subtopics, reflecting various aspects of social norms, traditions, and everyday experiences.
arXiv Detail & Related papers (2025-02-18T11:49:54Z) - Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation [71.59208664920452]
Cultural biases in multilingual datasets pose significant challenges for their effectiveness as global benchmarks.
We show that progress on MMLU depends heavily on learning Western-centric concepts, with 28% of all questions requiring culturally sensitive knowledge.
We release Global MMLU, an improved MMLU with evaluation coverage across 42 languages.
arXiv Detail & Related papers (2024-12-04T13:27:09Z) - Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z) - CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z) - CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z) - No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models [38.932610459192105]
We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs).
Our work underscores the value of using diverse data to create more inclusive multimodal systems.
arXiv Detail & Related papers (2024-05-22T16:04:22Z) - The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models [67.38144169029617]
We map the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 Large Language Models (LLMs).
With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts.
We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.
arXiv Detail & Related papers (2024-04-24T17:51:36Z) - Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates [4.857640117519813]
We argue that cultural and psychological factors play a vital role in the cognitive processing of offensiveness.
We demonstrate substantial cross-cultural differences in perceptions of offensiveness.
Individual moral values play a crucial role in shaping these variations.
arXiv Detail & Related papers (2023-12-11T22:12:20Z) - Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z) - The Ghost in the Machine has an American accent: value conflict in GPT-3 [0.0]
We discuss how the co-creation of language and cultural value impacts large language models.
We stress-tested GPT-3 with a range of value-rich texts representing several languages and nations.
We observed cases in which values embedded in the input text were mutated in the generated outputs.
arXiv Detail & Related papers (2022-03-15T11:06:54Z)