Do Large Language Models Understand Morality Across Cultures?
- URL: http://arxiv.org/abs/2507.21319v1
- Date: Mon, 28 Jul 2025 20:25:36 GMT
- Title: Do Large Language Models Understand Morality Across Cultures?
- Authors: Hadi Mohammadi, Yasmeen F. S. S. Meijer, Efthymia Papadopoulou, Ayoub Bagheri,
- Abstract summary: This study investigates the extent to which large language models capture cross-cultural differences and similarities in moral perspectives. Our results reveal that current LLMs often fail to reproduce the full spectrum of cross-cultural moral variation. These findings highlight a pressing need for more robust approaches to mitigate biases and improve cultural representativeness in LLMs.
- Score: 0.5356944479760104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in large language models (LLMs) have established them as powerful tools across numerous domains. However, persistent concerns about embedded biases, such as gender, racial, and cultural biases arising from their training data, raise significant questions about the ethical use and societal consequences of these technologies. This study investigates the extent to which LLMs capture cross-cultural differences and similarities in moral perspectives. Specifically, we examine whether LLM outputs align with patterns observed in international survey data on moral attitudes. To this end, we employ three complementary methods: (1) comparing variances in moral scores produced by models versus those reported in surveys, (2) conducting cluster alignment analyses to assess correspondence between country groupings derived from LLM outputs and survey data, and (3) directly probing models with comparative prompts using systematically chosen token pairs. Our results reveal that current LLMs often fail to reproduce the full spectrum of cross-cultural moral variation, tending to compress differences and exhibit low alignment with empirical survey patterns. These findings highlight a pressing need for more robust approaches to mitigate biases and improve cultural representativeness in LLMs. We conclude by discussing the implications for the responsible development and global deployment of LLMs, emphasizing fairness and ethical alignment.
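To make the abstract's three-method setup concrete, below is a minimal sketch (not the authors' code) of how the first two analyses could be run. It assumes hypothetical, aligned country-by-topic matrices `survey_scores` and `llm_scores` of moral ratings; the random data and the cluster count `k` are illustrative placeholders only.

```python
# Illustrative sketch: compare the spread of country-level moral scores from an
# LLM against survey scores, and check whether the two sources induce similar
# country groupings. Inputs are assumed matrices of shape (n_countries, n_topics).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
n_countries, n_topics = 40, 19  # placeholder sizes, e.g. WVS-style moral topics
survey_scores = rng.uniform(-1, 1, (n_countries, n_topics))
# Toy LLM scores that compress the survey's variation (scaled toward zero plus noise).
llm_scores = 0.3 * survey_scores + rng.normal(0, 0.05, (n_countries, n_topics))

# (1) Variance comparison: much smaller variance on the LLM side suggests the
# model compresses cross-cultural differences.
var_survey = survey_scores.var(axis=0).mean()
var_llm = llm_scores.var(axis=0).mean()
print(f"mean per-topic variance  survey={var_survey:.3f}  llm={var_llm:.3f}")

# (2) Cluster alignment: group countries by their moral-score profiles in each
# source and measure agreement with the Adjusted Rand Index
# (1 = identical groupings, ~0 = no better than chance).
k = 5
labels_survey = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(survey_scores)
labels_llm = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(llm_scores)
print(f"adjusted Rand index = {adjusted_rand_score(labels_survey, labels_llm):.3f}")
```

The third method, comparative prompting over systematically chosen token pairs, could be sketched analogously by comparing the model's probabilities for paired completions on each moral topic.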
Related papers
- Exploring Cultural Variations in Moral Judgments with Large Language Models [0.5356944479760104]
Using log-probability-based moral justifiability scores, we correlate each model's outputs with survey data covering a broad set of ethical topics. Our results show that many earlier or smaller models often produce near-zero or negative correlations with human judgments. Advanced instruction-tuned models (including GPT-4o and GPT-4o-mini) achieve substantially higher positive correlations, suggesting they better reflect real-world moral attitudes.
arXiv Detail & Related papers (2025-06-14T10:16:48Z) - MLLMs are Deeply Affected by Modality Bias [158.64371871084478]
Recent advances in Multimodal Large Language Models (MLLMs) have shown promising results in integrating diverse modalities such as text and images. MLLMs are heavily influenced by modality bias, often relying on language while under-utilizing other modalities like visual inputs. This paper argues that MLLMs are deeply affected by modality bias, highlighting its manifestations across various tasks.
arXiv Detail & Related papers (2025-05-24T11:49:31Z) - From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs [57.43233760384488]
Adapting cultural values in Large Language Models (LLMs) presents significant challenges. Prior work primarily aligns LLMs with different cultural values using World Values Survey (WVS) data. In this paper, we investigate WVS-based training for cultural value adaptation and find that relying solely on survey data can homogenize cultural norms and interfere with factual knowledge.
arXiv Detail & Related papers (2025-05-22T09:00:01Z) - Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations [8.769839351949997]
Large Language Models (LLMs) are capable of generating opinions and propagating bias unknowingly. Our work proposes a novel method that quantitatively analyzes the opinions generated by LLMs. We evaluate modern, open LLMs such as Llama and Mistral on surveys conducted in various Global South countries.
arXiv Detail & Related papers (2025-03-10T16:32:03Z) - Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English [66.97110551643722]
We investigate dialectal disparities in Large Language Model (LLM) reasoning tasks. We find that LLMs produce less accurate responses and simpler reasoning chains and explanations for African American English (AAE) inputs. These findings highlight systematic differences in how LLMs process and reason about different language varieties.
arXiv Detail & Related papers (2025-03-06T05:15:34Z) - Fairness in LLM-Generated Surveys [0.5720786928479238]
Large Language Models (LLMs) excel in text generation and understanding, especially in simulating socio-political and economic patterns. This study examines how LLMs perform across diverse populations by analyzing public surveys from Chile and the United States. In the United States, political identity and race significantly influence prediction accuracy, while in Chile, gender, education, and religious affiliation play more pronounced roles.
arXiv Detail & Related papers (2025-01-25T23:42:20Z) - LLMs as mirrors of societal moral standards: reflection of cultural divergence and agreement across ethical topics [0.5852077003870417]
Large language models (LLMs) have become increasingly pivotal in various domains due to the recent advancements in their performance capabilities. This study investigates whether LLMs accurately reflect cross-cultural variations and similarities in moral perspectives.
arXiv Detail & Related papers (2024-12-01T20:39:42Z) - Large Language Models as Mirrors of Societal Moral Standards [0.5852077003870417]
Language models can, to a limited extent, represent moral norms in a variety of cultural contexts. This study evaluates the effectiveness of these models using information from two surveys, the WVS and the PEW, that encompass moral perspectives from over 40 countries. The results show that biases exist in both monolingual and multilingual models, and they typically fall short of accurately capturing the moral intricacies of diverse cultures.
arXiv Detail & Related papers (2024-12-01T20:20:35Z) - Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z) - Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization [66.4659448305396]
This study analyzes various LMs with three probing-based experiments to shed light on the reasons behind the In- vs. Cross-Topic generalization gap.
We demonstrate, for the first time, that generalization gaps and the robustness of the embedding space vary significantly across LMs.
arXiv Detail & Related papers (2024-02-02T12:59:27Z) - Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)