The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting
- URL: http://arxiv.org/abs/2512.12488v1
- Date: Sat, 13 Dec 2025 23:11:41 GMT
- Title: The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting
- Authors: James Luther, Donald Brown
- Abstract summary: We use the VSM13 International Survey and Hofstede's cultural dimensions to identify the cultural alignment of popular Large Language Models (LLMs). We then use cultural prompting to test the adaptability of these models to other cultures, namely China, France, India, Iran, Japan, and the United States. We find that the majority of the eight LLMs tested favor the United States when the culture is not specified, with varying results when prompted for other cultures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Culture is the bedrock of human interaction; it dictates how we perceive and respond to everyday interactions. As the field of human-computer interaction grows via the rise of generative Large Language Models (LLMs), the cultural alignment of these models becomes an important field of study. This work, using the VSM13 International Survey and Hofstede's cultural dimensions, identifies the cultural alignment of popular LLMs (DeepSeek-V3, V3.1, GPT-5, GPT-4.1, GPT-4, Claude Opus 4, Llama 3.1, and Mistral Large). We then use cultural prompting, or using system prompts to shift the cultural alignment of a model to a desired country, to test the adaptability of these models to other cultures, namely China, France, India, Iran, Japan, and the United States. We find that the majority of the eight LLMs tested favor the United States when the culture is not specified, with varying results when prompted for other cultures. When using cultural prompting, seven of the eight models shifted closer to the expected culture. We find that models had trouble aligning with Japan and China, despite two of the models tested originating from the Chinese company DeepSeek.
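The two techniques the abstract names, cultural prompting and Hofstede-dimension scoring, can be sketched briefly. This is a minimal illustration, not the authors' code: the system-prompt wording is hypothetical, the Power Distance formula is taken from the published VSM 2013 manual (PDI = 35(m07 − m02) + 25(m20 − m23) + C(pd), where mNN is the mean 1-5 answer to question NN), and the calibration constant C(pd) is assumed to be 0 here.

```python
def cultural_system_prompt(country: str) -> str:
    """Build a cultural-prompting system message asking the model to answer
    as a typical respondent from `country` (wording is illustrative)."""
    return (f"You are a typical person born and raised in {country}. "
            f"Answer the following survey questions as that person would.")

def power_distance_index(means: dict, c_pd: float = 0.0) -> float:
    """VSM13 Power Distance Index from per-question mean scores (1-5 scale).

    PDI = 35*(m07 - m02) + 25*(m20 - m23) + C(pd)
    """
    return (35 * (means["m07"] - means["m02"])
            + 25 * (means["m20"] - means["m23"])
            + c_pd)

# Hypothetical mean answers aggregated over repeated model runs.
sample_means = {"m02": 2.2, "m07": 3.1, "m20": 2.8, "m23": 2.5}

print(cultural_system_prompt("Japan"))
print(f"PDI = {power_distance_index(sample_means):.1f}")  # PDI = 39.0
```

In the study's setup, the system prompt above would be prepended to each VSM13 question sent to a model, the numeric answers averaged per question, and the resulting indices compared against Hofstede's published country scores.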
Related papers
- Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models [78.19037585302475]
Large language models (LLMs) are increasingly deployed in culturally sensitive real-world tasks. Existing cultural alignment approaches fail to align LLMs' broad cultural values with the specific goals of downstream tasks. We propose CultureManager, a novel pipeline for task-specific cultural alignment.
arXiv Detail & Related papers (2026-02-25T23:27:18Z)
- DeepSeek's WEIRD Behavior: The cultural alignment of Large Language Models and the effects of prompt language and cultural prompting [0.0]
We use Hofstede's VSM13 international surveys to understand the cultural alignment of large language models (LLMs). We use a combination of prompt language and cultural prompting, a strategy that uses a system prompt to shift a model's alignment to reflect a specific country. Our results show that DeepSeek-V3, V3.1, and OpenAI's GPT-5 exhibit a close alignment with the survey responses of the United States. We also find that GPT-4 exhibits an alignment closer to China when prompted in English, but cultural prompting is effective in shifting this alignment closer to the United States.
arXiv Detail & Related papers (2025-12-10T15:54:18Z)
- Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World [68.19795061447044]
This paper investigates cross-cultural transfer of commonsense reasoning in the Arab world. Using a culturally grounded commonsense reasoning dataset covering 13 Arab countries, we evaluate lightweight alignment methods. Our results show that merely 12 culture-specific examples from one country can improve performance in others by 10% on average.
arXiv Detail & Related papers (2025-09-23T17:24:14Z)
- CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs [57.653830744706305]
CultureScope is the most comprehensive evaluation framework to date for assessing cultural understanding in large language models. Inspired by the cultural iceberg theory, we design a novel dimensional schema for cultural knowledge classification. Experimental results demonstrate that our method can effectively evaluate cultural understanding.
arXiv Detail & Related papers (2025-09-19T17:47:48Z)
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting [73.94059188347582]
We uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generation consists of linguistic "markers" that distinguish marginalized cultures from default cultures.
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Cultural Bias and Cultural Alignment of Large Language Models [0.9374652839580183]
We conduct a disaggregated evaluation of cultural bias for five widely used large language models.
All models exhibit cultural values resembling English-speaking and Protestant European countries.
We suggest using cultural prompting and ongoing evaluation to reduce cultural bias in the output of generative AI.
arXiv Detail & Related papers (2023-11-23T16:45:56Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
- Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions [10.415002561977655]
This research proposes a Cultural Alignment Test (Hofstede's CAT) to quantify cultural alignment using Hofstede's cultural dimension framework.
We quantitatively evaluate large language models (LLMs) against the cultural dimensions of regions like the United States, China, and Arab countries.
Our results quantify the cultural alignment of LLMs and reveal the difference between LLMs in explanatory cultural dimensions.
arXiv Detail & Related papers (2023-08-25T14:50:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.