Probing Pre-Trained Language Models for Cross-Cultural Differences in Values
- URL: http://arxiv.org/abs/2203.13722v2
- Date: Thu, 6 Apr 2023 22:40:31 GMT
- Title: Probing Pre-Trained Language Models for Cross-Cultural Differences in Values
- Authors: Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein
- Abstract summary: We introduce probes to study which values across cultures are embedded in Pre-Trained Language Models (PTLMs).
We find that PTLMs capture differences in values across cultures, but those only weakly align with established value surveys.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language embeds information about social, cultural, and political values
people hold. Prior work has explored social and potentially harmful biases
encoded in Pre-Trained Language Models (PTLMs). However, there has been no
systematic study investigating how values embedded in these models vary across
cultures. In this paper, we introduce probes to study which values across
cultures are embedded in these models, and whether they align with existing
theories and cross-cultural value surveys. We find that PTLMs capture
differences in values across cultures, but those only weakly align with
established value surveys. We discuss implications of using mis-aligned models
in cross-cultural settings, as well as ways of aligning PTLMs with value
surveys.
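To make the probing setup concrete, the sketch below conditions a masked language model on a country name inside a value-laden template, scores the model's preference between two contrast words, and rank-correlates those scores with survey data. This is a minimal sketch, assuming a Hugging Face fill-mask pipeline; the template, countries, contrast words, and survey numbers are invented placeholders, not the paper's actual probes, which are built from established value surveys.

```python
# A minimal sketch of culture-conditioned masked-LM probing (illustrative).
# Assumes Hugging Face `transformers` and `scipy`; the template, countries,
# contrast words, and survey scores are hypothetical placeholders.
from transformers import pipeline
from scipy.stats import spearmanr

fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

TEMPLATE = "People in {country} think that questioning authority is [MASK]."
COUNTRIES = ["Germany", "Japan", "Brazil", "Sweden"]
SURVEY = {"Germany": 67, "Japan": 54, "Brazil": 38, "Sweden": 71}  # made up

def model_score(country: str) -> float:
    """P('good') - P('bad') for the masked slot, read as an agreement probe."""
    preds = fill(TEMPLATE.format(country=country), targets=["good", "bad"])
    probs = {p["token_str"]: p["score"] for p in preds}
    return probs.get("good", 0.0) - probs.get("bad", 0.0)

scores = [model_score(c) for c in COUNTRIES]
rho, _ = spearmanr(scores, [SURVEY[c] for c in COUNTRIES])
print(f"Rank correlation with survey: {rho:.2f}")  # weak |rho| = weak alignment
```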
Related papers
- Exploring Large Language Models on Cross-Cultural Values in Connection with Training Methodology
Large language models (LLMs) interact closely with humans and therefore need an intimate understanding of the cultural values of human society.
Our analysis shows that LLMs judge socio-cultural norms similarly to humans, but less so social systems and progress.
Increasing model size improves the understanding of social values, but smaller models can be enhanced with synthetic data.
arXiv Detail & Related papers (2024-12-12T00:52:11Z)
- LLMs as mirrors of societal moral standards: reflection of cultural divergence and agreement across ethical topics
Large language models (LLMs) have become increasingly pivotal in various domains due to the recent advancements in their performance capabilities.
This study investigates whether LLMs accurately reflect cross-cultural variations and similarities in moral perspectives.
arXiv Detail & Related papers (2024-12-01T20:39:42Z)
- Large Language Models as Mirrors of Societal Moral Standards
Language models can, to a limited extent, represent moral norms in a variety of cultural contexts.
This study evaluates the effectiveness of these models using data from two surveys, the World Values Survey (WVS) and PEW, which encompass moral perspectives from over 40 countries.
The results show that biases exist in both monolingual and multilingual models, and they typically fall short of accurately capturing the moral intricacies of diverse cultures.
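A sketch of the survey-comparison idea follows, assuming a hypothetical `ask_model` helper that returns a model's 1-5 agreement rating for a moral question posed in a given country context; the questions and human means are invented, not the WVS or PEW data used in the paper.

```python
# Hypothetical sketch: compare model answers on moral questions with
# per-country human survey means. `ask_model` is a stand-in for a real
# LLM call; the questions and human means are invented placeholders.
import numpy as np

QUESTIONS = ["divorce is justifiable", "euthanasia is justifiable"]
HUMAN_MEANS = {  # invented 1-5 agreement averages per country
    "Netherlands": [4.1, 3.8],
    "Egypt": [2.0, 1.6],
}

def ask_model(question: str, country: str) -> float:
    """Stub: replace with an LLM query returning a 1-5 agreement rating."""
    return 3.0  # a real implementation would parse the model's answer

def alignment_error(country: str) -> float:
    """Mean absolute gap between model ratings and human survey means."""
    model = np.array([ask_model(q, country) for q in QUESTIONS])
    return float(np.mean(np.abs(model - np.array(HUMAN_MEANS[country]))))

for c in HUMAN_MEANS:
    print(f"{c}: error={alignment_error(c):.2f}")  # lower = better aligned
```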
arXiv Detail & Related papers (2024-12-01T20:20:35Z)
- Extrinsic Evaluation of Cultural Competence in Large Language Models
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
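The perturbation test can be pictured as follows: hold the prompt fixed, swap in different nationalities, embed the resulting outputs, and check whether pairwise output similarity tracks differences in cultural values. The sketch below assumes sentence-transformers for embeddings; the outputs and value scores are invented placeholders, not the paper's data.

```python
# Sketch of the perturbation test: substitute different nationalities into
# one prompt, embed the outputs, and check whether output similarity tracks
# similarity in cultural values. Assumes `sentence-transformers`; the
# outputs and value scores below are invented placeholders.
from itertools import combinations
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

OUTPUTS = {  # stand-ins for generations under perturbed nationality cues
    "India": "The wedding lasted three days, with music and a large feast.",
    "France": "The wedding was held at a small town hall, then a dinner.",
    "Mexico": "The wedding filled the plaza with dancing late into the night.",
}
VALUES = {"India": 48, "France": 71, "Mexico": 30}  # invented value scores

enc = SentenceTransformer("all-MiniLM-L6-v2")
emb = {c: enc.encode(t) for c, t in OUTPUTS.items()}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

pairs = list(combinations(OUTPUTS, 2))
text_sim = [cosine(emb[a], emb[b]) for a, b in pairs]
value_gap = [abs(VALUES[a] - VALUES[b]) for a, b in pairs]
rho, _ = spearmanr(text_sim, value_gap)
print(f"rho={rho:.2f}")  # near zero would mirror the weak correlations found
```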
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
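As a rough picture of the multi-agent setup, the sketch below alternates two culturally conditioned agents on a shared topic and collects the transcript; the `chat` stub, personas, and topic are hypothetical illustrations, not CulturePark's actual framework or prompts.

```python
# Hypothetical sketch of an LLM-powered cross-cultural dialogue loop in the
# spirit of a multi-agent framework. `chat` is a stand-in for a real LLM
# API call; the personas and topic are invented.
from typing import List

def chat(persona: str, history: List[str]) -> str:
    """Stub: replace with an LLM chat call conditioned on a cultural persona."""
    return f"(reply shaped by persona: {persona})"

personas = {
    "A": "You are a participant who grew up in rural Turkey.",
    "B": "You are a participant who grew up in urban Denmark.",
}

history: List[str] = ["Topic: norms around hospitality toward strangers."]
for turn in range(4):  # alternate the two agents for a few turns
    speaker = "A" if turn % 2 == 0 else "B"
    history.append(f"{speaker}: {chat(personas[speaker], history)}")

print("\n".join(history))  # the transcript becomes cross-cultural dialogue data
```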
arXiv Detail & Related papers (2024-05-24T01:49:02Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
We uncover the culture perceptions of three state-of-the-art models on 110 countries and regions across 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures.
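One simple way to surface such markers is a frequency contrast between culture-conditioned and default generations; the sketch below applies a smoothed log-odds ratio to two invented sentences and is not the paper's method or data.

```python
# Toy marker extraction: words over-represented in culture-conditioned text
# relative to default-culture text, via smoothed log-odds. The two texts
# are invented examples.
from collections import Counter
import math

conditioned = "the bride wore a red sari and henna at the ceremony".split()
default = "the bride wore a white dress and a veil at the ceremony".split()

c1, c2 = Counter(conditioned), Counter(default)
n1, n2 = sum(c1.values()), sum(c2.values())

def log_odds(word: str, alpha: float = 0.5) -> float:
    """Smoothed log-odds of `word` in the conditioned vs. the default text."""
    return math.log(((c1[word] + alpha) / (n1 + alpha)) /
                    ((c2[word] + alpha) / (n2 + alpha)))

vocab = sorted(set(conditioned) | set(default))
markers = sorted(vocab, key=log_odds, reverse=True)[:3]
print(markers)  # e.g. ['henna', 'red', 'sari'] as candidate cultural markers
```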
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Investigating Cultural Alignment of Large Language Models
We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
- Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions
This research proposes a Cultural Alignment Test (Hofstede's CAT) to quantify cultural alignment using Hofstede's cultural dimension framework.
We quantitatively evaluate large language models (LLMs) against the cultural dimensions of regions like the United States, China, and Arab countries.
Our results quantify the cultural alignment of LLMs and reveal differences between LLMs along the explanatory cultural dimensions.
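The quantification step might look like the following comparison between model-derived and published scores on Hofstede's six dimensions; all numbers here are illustrative placeholders, not Hofstede's data or the paper's results.

```python
# Sketch: compare model-derived Hofstede dimension scores for a country
# with published country scores. All numbers are illustrative placeholders.
import numpy as np

DIMS = ["PDI", "IDV", "MAS", "UAI", "LTO", "IVR"]  # Hofstede's six dimensions
published = {"United States": [40, 91, 62, 46, 26, 68]}      # placeholder
model_derived = {"United States": [55, 70, 60, 50, 40, 55]}  # placeholder

for country in published:
    a = np.asarray(published[country], dtype=float)
    b = np.asarray(model_derived[country], dtype=float)
    r = np.corrcoef(a, b)[0, 1]
    print(f"{country}: Pearson r={r:.2f} across {len(DIMS)} dimensions")
```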
arXiv Detail & Related papers (2023-08-25T14:50:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.