The Cultural Gene of Large Language Models: A Study on the Impact of Cross-Corpus Training on Model Values and Biases
- URL: http://arxiv.org/abs/2508.12411v1
- Date: Sun, 17 Aug 2025 15:54:14 GMT
- Title: The Cultural Gene of Large Language Models: A Study on the Impact of Cross-Corpus Training on Model Values and Biases
- Authors: Emanuel Z. Fenech-Borg, Tilen P. Meznaric-Kos, Milica D. Lekovic-Bojovic, Arni J. Hentze-Djurhuus,
- Abstract summary: Large language models (LLMs) are deployed globally, yet their underlying cultural and ethical assumptions remain underexplored. We compare a Western-centric model (GPT-4) and an Eastern-centric model (ERNIE Bot). Human annotation shows significant and consistent divergence across both dimensions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are deployed globally, yet their underlying cultural and ethical assumptions remain underexplored. We propose the notion of a "cultural gene" -- a systematic value orientation that LLMs inherit from their training corpora -- and introduce a Cultural Probe Dataset (CPD) of 200 prompts targeting two classic cross-cultural dimensions: Individualism-Collectivism (IDV) and Power Distance (PDI). Using standardized zero-shot prompts, we compare a Western-centric model (GPT-4) and an Eastern-centric model (ERNIE Bot). Human annotation shows significant and consistent divergence across both dimensions. GPT-4 exhibits individualistic and low-power-distance tendencies (IDV score approx 1.21; PDI score approx -1.05), while ERNIE Bot shows collectivistic and higher-power-distance tendencies (IDV approx -0.89; PDI approx 0.76); differences are statistically significant (p < 0.001). We further compute a Cultural Alignment Index (CAI) against Hofstede's national scores and find GPT-4 aligns more closely with the USA (e.g., IDV CAI approx 0.91; PDI CAI approx 0.88) whereas ERNIE Bot aligns more closely with China (IDV CAI approx 0.85; PDI CAI approx 0.81). Qualitative analyses of dilemma resolution and authority-related judgments illustrate how these orientations surface in reasoning. Our results support the view that LLMs function as statistical mirrors of their cultural corpora and motivate culturally aware evaluation and deployment to avoid algorithmic cultural hegemony.
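The abstract reports a Cultural Alignment Index (CAI) comparing model scores against Hofstede's national scores but does not publish its formula. The sketch below is a hypothetical reconstruction under two assumptions: annotator scores lie on a [-2, 2] scale, and CAI is one minus the absolute difference of min-max-normalized scores. Both the function names and the formula are illustrative, not the authors' method.

```python
def normalize(value, lo, hi):
    """Map a raw score onto [0, 1] given its scale bounds."""
    return (value - lo) / (hi - lo)


def cai(model_score, model_bounds, hofstede_score, hofstede_bounds=(0, 100)):
    """Assumed CAI: 1 - |normalized model score - normalized national score|.

    This is a guess at the paper's metric; the exact definition may differ.
    """
    m = normalize(model_score, *model_bounds)
    h = normalize(hofstede_score, *hofstede_bounds)
    return 1.0 - abs(m - h)


# Illustrative numbers only: the reported GPT-4 IDV score (approx 1.21),
# assumed to sit on a [-2, 2] annotation scale, against Hofstede's IDV
# score of 91 for the USA.
print(round(cai(1.21, (-2, 2), 91), 2))  # prints 0.89
```

Under these assumptions the result lands near the paper's reported IDV CAI of approx 0.91 for GPT-4/USA, but the agreement should not be read as confirmation of the formula.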
Related papers
- Cultural Alien Sampler: Open-ended art generation balancing originality and coherence [77.30507101341111]
We introduce the Cultural Alien Sampler (CAS), a concept-selection method that separates compositional fit from cultural typicality. CAS targets combinations that are high in coherence and low in typicality, yielding ideas that maintain internal consistency while deviating from learned conventions and embedded cultural context.
arXiv Detail & Related papers (2025-10-21T09:32:46Z) - CCD-Bench: Probing Cultural Conflict in Large Language Model Decision-Making [0.9310318514564272]
Large language models can navigate explicit conflicts between legitimately different cultural value systems. CCD-Bench is a benchmark that assesses decision-making under cross-cultural value conflict. CCD-Bench shifts evaluation beyond isolated bias detection toward pluralistic decision making.
arXiv Detail & Related papers (2025-10-03T22:55:37Z) - ALIGN: Word Association Learning for Cross-Cultural Generalization in Large Language Models [0.8999666725996975]
It remains a challenge to model and align culture due to limited cultural knowledge. We introduce parameter-efficient fine-tuning on native speakers' free word-association norms. Our work shows that a few million culture-grounded associations can instill value alignment without costly retraining.
arXiv Detail & Related papers (2025-08-19T00:55:20Z) - Exploring Cultural Variations in Moral Judgments with Large Language Models [0.5356944479760104]
Using log-probability-based moral justifiability scores, we correlate each model's outputs with survey data covering a broad set of ethical topics. Our results show that many earlier or smaller models often produce near-zero or negative correlations with human judgments. Advanced instruction-tuned models (including GPT-4o and GPT-4o-mini) achieve substantially higher positive correlations, suggesting they better reflect real-world moral attitudes.
arXiv Detail & Related papers (2025-06-14T10:16:48Z) - CAIRe: Cultural Attribution of Images by Retrieval-Augmented Evaluation [61.130639734982395]
We introduce CAIRe, a novel evaluation metric that assesses the degree of cultural relevance of an image. Our framework grounds entities and concepts in the image to a knowledge base and uses factual information to give independent graded judgments for each culture label.
arXiv Detail & Related papers (2025-06-10T17:16:23Z) - Cultural Value Alignment in Large Language Models: A Prompt-based Analysis of Schwartz Values in Gemini, ChatGPT, and DeepSeek [0.0]
This study examines cultural value alignment in large language models (LLMs) by analyzing how Gemini, ChatGPT, and DeepSeek prioritize values from Schwartz's value framework. Results of a Bayesian ordinal regression model show that self-transcendence values (e.g., benevolence, universalism) were highly prioritized across all models. DeepSeek uniquely downplayed self-enhancement values compared to ChatGPT and Gemini, aligning with collectivist cultural tendencies.
arXiv Detail & Related papers (2025-05-21T14:03:19Z) - Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies [58.88053690412802]
Large vision-language models (LVLMs) are increasingly deployed in globally distributed applications, such as tourism assistants. CROSS is a benchmark designed to assess the cultural safety reasoning capabilities of LVLMs. We evaluate 21 leading LVLMs, including mixture-of-experts models and reasoning models.
arXiv Detail & Related papers (2025-05-20T23:20:38Z) - CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization [50.90288681622152]
Large Language Models (LLMs) are integrating more deeply into human life across various regions. Existing approaches develop culturally aligned LLMs through fine-tuning with culture-specific corpora. We introduce CAReDiO, a novel cultural data construction framework.
arXiv Detail & Related papers (2025-04-09T13:40:13Z) - CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z) - Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z) - Large language models can replicate cross-cultural differences in personality [0.0]
We use a large-scale experiment to determine whether GPT-4 can replicate cross-cultural differences in the Big Five. We use the US and South Korea as the cultural pair.
arXiv Detail & Related papers (2023-10-12T11:17:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.