D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
- URL: http://arxiv.org/abs/2404.10857v1
- Date: Tue, 16 Apr 2024 19:12:03 GMT
- Title: D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
- Authors: Aida Mostafazadeh Davani, Mark Díaz, Dylan Baker, Vinodkumar Prabhakaran
- Abstract summary: We introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences, annotated by a pool of over 4K annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
- Score: 5.9053106775634685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context, and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within demographic groups may hold diverse values, which can influence their perceptions beyond their group norms. To effectively incorporate these considerations into NLP pipelines, we need datasets with extensive parallel annotations from various social and cultural groups. In this paper we introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences, annotated by a pool of over 4K annotators balanced across gender and age, from 21 countries representing eight geo-cultural regions. The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity. Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values, offering crucial insights for building pluralistic, culturally sensitive NLP models.
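As a rough illustration of the kind of analysis the abstract describes, below is a minimal pandas sketch for surfacing cross-regional disagreement and moral-value correlates in a parallel-annotation dataset. The flat schema and column names (item_id, region, offensive, and the six foundation scores) are assumptions for illustration only; the actual D3CODE release may use a different format.

```python
# Minimal sketch (not the authors' code): quantify regional disagreement
# in parallel offensiveness annotations. Schema is hypothetical.
import pandas as pd

# Assumed layout: one row per (annotator, sentence) judgment, with the
# annotator's region and six Moral Foundations scores attached.
df = pd.read_csv("d3code_annotations.csv")  # hypothetical file name

# Per-sentence, per-region mean label: the fraction of annotators in a
# region who judged the sentence offensive.
region_means = (
    df.groupby(["item_id", "region"])["offensive"].mean().unstack("region")
)

# Cross-regional disagreement per sentence: variance of the regional means.
# High values flag sentences whose offensiveness is culturally contested.
disagreement = region_means.var(axis=1)
print(disagreement.sort_values(ascending=False).head())

# Correlate each moral foundation with individual judgments, e.g. to see
# whether annotators scoring high on "purity" label more items offensive.
foundations = ["care", "equality", "proportionality",
               "authority", "loyalty", "purity"]
print(df[foundations + ["offensive"]].corr()["offensive"][foundations])
```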
Related papers
- Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models [38.932610459192105]
We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs).
Our work underscores the value of using diverse data to create more inclusive multimodal systems.
arXiv Detail & Related papers (2024-05-22T16:04:22Z)
- The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models [67.38144169029617]
We introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries.
PRISM contributes (i) wide geographic and demographic participation in human feedback data; (ii) two census-representative samples for understanding collective welfare (UK and US); and (iii) individualised feedback where every rating is linked to a detailed participant profile.
arXiv Detail & Related papers (2024-04-24T17:51:36Z)
- Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z)
- Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates [4.857640117519813]
We argue that cultural and psychological factors play a vital role in the cognitive processing of offensiveness.
We demonstrate substantial cross-cultural differences in perceptions of offensiveness.
Individual moral values play a crucial role in shaping these variations.
arXiv Detail & Related papers (2023-12-11T22:12:20Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
When users ask questions in non-English languages, LLMs often return answers rooted in English-speaking culture rather than answers relevant to the expected culture.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
- Probing Pre-Trained Language Models for Cross-Cultural Differences in Values [42.45033681054207]
We introduce probes to study which values across cultures are embedded in Pre-Trained Language Models (PTLMs).
We find that PTLMs capture differences in values across cultures, but these align only weakly with established value surveys.
arXiv Detail & Related papers (2022-03-25T15:45:49Z)
- The Ghost in the Machine has an American accent: value conflict in GPT-3 [0.0]
We discuss how the co-creation of language and cultural value impacts large language models.
We stress-tested GPT-3 with a range of value-rich texts representing several languages and nations.
We observed cases where values embedded in the input text were mutated in the generated outputs.
arXiv Detail & Related papers (2022-03-15T11:06:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.