CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
- URL: http://arxiv.org/abs/2404.10199v5
- Date: Tue, 20 Aug 2024 06:53:45 GMT
- Title: CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
- Authors: Huihan Li, Liwei Jiang, Jena D. Hwang, Hyunwoo Kim, Sebastin Santy, Taylor Sorensen, Bill Yuchen Lin, Nouha Dziri, Xiang Ren, Yejin Choi,
- Abstract summary: We uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generation consist of linguistic "markers" that distinguish marginalized cultures apart from default cultures.
- Score: 73.94059188347582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the utilization of large language models (LLMs) has proliferated world-wide, it is crucial for them to have adequate knowledge and fair representation for diverse global cultures. In this work, we uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations, and extract symbols from these generations that are associated to each culture by the LLM. We discover that culture-conditioned generation consist of linguistic "markers" that distinguish marginalized cultures apart from default cultures. We also discover that LLMs have an uneven degree of diversity in the culture symbols, and that cultures from different geographic regions have different presence in LLMs' culture-agnostic generation. Our findings promote further research in studying the knowledge and fairness of global culture perception in LLMs. Code and Data can be found here: https://github.com/huihanlhh/Culture-Gen/
Related papers
- Survey of Cultural Awareness in Language Models: Text and Beyond [39.77033652289063]
Large-scale deployment of large language models (LLMs) in various applications requires LLMs to be culturally sensitive to the user to ensure inclusivity.
Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive.
arXiv Detail & Related papers (2024-10-30T16:37:50Z) - Self-Pluralising Culture Alignment for Large Language Models [36.689491885394034]
We propose CultureSPA, a framework that allows large language models to align to pluralistic cultures.
By comparing culture-aware/unaware outputs, we are able to detect and collect culture-related instances.
Extensive experiments demonstrate that CultureSPA significantly improves the alignment of LLMs to diverse cultures without compromising general abilities.
arXiv Detail & Related papers (2024-10-16T19:06:08Z) - Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models [4.771099208181585]
LLMs are increasingly deployed in global applications, ensuring users from diverse backgrounds feel respected and understood.
Cultural harm can arise when these models fail to align with specific cultural norms, resulting in misrepresentations or violations of cultural values.
We present two key contributions: A cultural harm test dataset, created to assess model outputs across different cultural contexts through scenarios that expose potential cultural insensitivities, and a culturally aligned preference dataset, aimed at restoring cultural sensitivity through fine-tuning based on feedback from diverse annotators.
arXiv Detail & Related papers (2024-10-15T18:13:10Z) - CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z) - Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense [98.09670425244462]
Large language models (LLMs) have demonstrated substantial commonsense understanding.
This paper examines the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks.
arXiv Detail & Related papers (2024-05-07T20:28:34Z) - Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge [47.57055368312541]
We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and variations in food practices.
We analyze LLMs across various architectures and configurations, evaluating their performance in both monolingual and multilingual settings.
arXiv Detail & Related papers (2024-04-10T08:49:27Z) - Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z) - Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in
Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs)
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z) - Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions [10.415002561977655]
This research proposes a Cultural Alignment Test (Hoftede's CAT) to quantify cultural alignment using Hofstede's cultural dimension framework.
We quantitatively evaluate large language models (LLMs) against the cultural dimensions of regions like the United States, China, and Arab countries.
Our results quantify the cultural alignment of LLMs and reveal the difference between LLMs in explanatory cultural dimensions.
arXiv Detail & Related papers (2023-08-25T14:50:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.