Large language models can replicate cross-cultural differences in personality
- URL: http://arxiv.org/abs/2310.10679v3
- Date: Sun, 26 Jan 2025 13:03:46 GMT
- Title: Large language models can replicate cross-cultural differences in personality
- Authors: Paweł Niszczota, Mateusz Janczak, Michał Misiak
- Abstract summary: We use a large-scale experiment to determine whether GPT-4 can replicate cross-cultural differences in the Big Five. We used the US and South Korea as the cultural pair.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We use a large-scale experiment (N=8000) to determine whether GPT-4 can replicate cross-cultural differences in the Big Five, measured using the Ten-Item Personality Inventory. We used the US and South Korea as the cultural pair, given that prior research suggests substantial personality differences between people from these two countries. We manipulated the target of the simulation (US vs. Korean), the language of the inventory (English vs. Korean), and the language model (GPT-4 vs. GPT-3.5). Our results show that GPT-4 replicated the cross-cultural differences for each factor. However, mean ratings had an upward bias and exhibited lower variation than in the human samples, as well as lower structural validity. We provide preliminary evidence that LLMs can aid cross-cultural researchers and practitioners.
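The experiment amounts to prompting a model to role-play a respondent from a target country, administering the Ten-Item Personality Inventory (TIPI), and scoring the ratings into the Big Five. Below is a minimal sketch of how one cell of that design (US vs. Korean target, English-language inventory, GPT-4) might be run; the prompt wording, persona instruction, and response parsing are illustrative assumptions rather than the authors' exact protocol, while the TIPI item wording and reverse-scoring rules follow the standard published instrument.

```python
# Hypothetical sketch of one experimental cell: prompt GPT-4 to adopt a simulated
# persona (US vs. Korean) and rate the ten TIPI items on a 1-7 scale.
# Assumes the openai Python client and an OPENAI_API_KEY in the environment.
# The Korean-language inventory condition from the paper is omitted for brevity.
from openai import OpenAI

client = OpenAI()

TIPI_ITEMS_EN = [
    "Extraverted, enthusiastic.",
    "Critical, quarrelsome.",
    "Dependable, self-disciplined.",
    "Anxious, easily upset.",
    "Open to new experiences, complex.",
    "Reserved, quiet.",
    "Sympathetic, warm.",
    "Disorganized, careless.",
    "Calm, emotionally stable.",
    "Conventional, uncreative.",
]

def simulate_tipi(target: str, model: str = "gpt-4") -> list[int]:
    """Ask the model to role-play one respondent and rate all ten TIPI items
    from 1 (disagree strongly) to 7 (agree strongly)."""
    persona = (
        "You are a randomly selected adult from the United States."
        if target == "US"
        else "You are a randomly selected adult from South Korea."
    )
    items = "\n".join(f"{i + 1}. I see myself as: {s}" for i, s in enumerate(TIPI_ITEMS_EN))
    prompt = (
        f"{persona}\nRate how much you agree with each statement on a 1-7 scale "
        f"(1 = disagree strongly, 7 = agree strongly). "
        f"Answer with ten numbers separated by commas.\n{items}"
    )
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # sampling variation stands in for between-person variation
    )
    return [int(x) for x in reply.choices[0].message.content.split(",")]

def score_big_five(r: list[int]) -> dict[str, float]:
    """Standard TIPI scoring: each factor averages one item and one
    reverse-scored item (reverse = 8 - rating)."""
    rev = lambda x: 8 - x
    return {
        "Extraversion": (r[0] + rev(r[5])) / 2,
        "Agreeableness": (rev(r[1]) + r[6]) / 2,
        "Conscientiousness": (r[2] + rev(r[7])) / 2,
        "Emotional stability": (rev(r[3]) + r[8]) / 2,
        "Openness": (r[4] + rev(r[9])) / 2,
    }
```

Repeating such calls many times per condition and comparing the resulting factor means against human norms is, in outline, how the cross-cultural replication reported in the abstract would be assessed.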
Related papers
- CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization [50.90288681622152]
Large Language Models (LLMs) are becoming more deeply integrated into human life across various regions.
Existing approaches develop culturally aligned LLMs through fine-tuning with culture-specific corpora.
We introduce CAReDiO, a novel cultural data construction framework.
arXiv Detail & Related papers (2025-04-09T13:40:13Z) - When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts [15.78054683369659]
We introduce MixCuBe, a cross-cultural bias benchmark, and study elements from five countries and four ethnicities.
Our findings reveal that MLLMs achieve both higher accuracy and lower sensitivity to such perturbations for high-resource cultures.
GPT-4o, the best-performing model overall, shows a gap of up to 58% in accuracy between the original and perturbed cultural settings for low-resource cultures.
arXiv Detail & Related papers (2025-03-21T03:50:05Z) - Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs [2.5212698425008377]
Large Language Models (LLMs) are becoming increasingly capable across global languages.
However, the ability to communicate across languages does not necessarily translate to appropriate cultural representations.
We compare two families of models: Google's Gemma models and OpenAI's turbo-series.
We find no consistent relationships between language capabilities and cultural alignment.
arXiv Detail & Related papers (2025-02-23T11:02:41Z) - CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs [75.82306181299153]
We introduce CulturalBench: a set of 1,227 human-written and human-verified questions for assessing cultural knowledge.
We evaluate models on two setups: CulturalBench-Easy and CulturalBench-Hard, which share the same questions but ask them differently.
Compared to human performance (92.6% accuracy), CulturalBench-Hard is more challenging for frontier LLMs with the best performing model (GPT-4o) at only 61.5% and the worst (Llama3-8b) at 21.4%.
arXiv Detail & Related papers (2024-10-03T17:04:31Z) - See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding [78.88461026069862]
Vision-language models (VLMs) can respond to queries about images in many languages.
We present a novel investigation that demonstrates and localizes Western bias in image understanding.
arXiv Detail & Related papers (2024-06-17T15:49:51Z) - CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z) - CultureLLM: Incorporating Cultural Differences into Large Language Models [36.66184989869121]
CultureLLM is a cost-effective solution to incorporate cultural differences into large language models.
We fine-tune culture-specific LLMs and one unified model (CultureLLM-One) for 9 cultures, covering both high-resource and low-resource languages.
Our human study shows that the generated samples are semantically equivalent to the original samples.
arXiv Detail & Related papers (2024-02-09T04:02:43Z) - Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication [68.40865217231695]
This study examines the behavior of GPT-4V in replicating human-like use of emojis.
The findings reveal a discernible discrepancy between human and GPT-4V behaviors, likely due to the subjective nature of human interpretation.
arXiv Detail & Related papers (2024-01-16T08:56:52Z) - Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
When users ask questions in non-English languages, LLMs often give answers grounded in English-speaking culture rather than in the culture relevant to the query.
arXiv Detail & Related papers (2023-10-19T05:38:23Z) - Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions [10.415002561977655]
This research proposes a Cultural Alignment Test (Hofstede's CAT) to quantify cultural alignment using Hofstede's cultural dimension framework.
We quantitatively evaluate large language models (LLMs) against the cultural dimensions of regions like the United States, China, and Arab countries.
Our results quantify the cultural alignment of LLMs and reveal differences between LLMs along the explanatory cultural dimensions.
arXiv Detail & Related papers (2023-08-25T14:50:13Z) - Comparing Biases and the Impact of Multilingual Training across Multiple Languages [70.84047257764405]
We present a bias analysis across Italian, Chinese, English, Hebrew, and Spanish on the downstream sentiment analysis task.
We adapt existing sentiment bias templates in English to Italian, Chinese, Hebrew, and Spanish for four attributes: race, religion, nationality, and gender.
Our results reveal similarities in bias expression such as favoritism of groups that are dominant in each language's culture.
arXiv Detail & Related papers (2023-05-18T18:15:07Z)