Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense
- URL: http://arxiv.org/abs/2405.04655v1
- Date: Tue, 7 May 2024 20:28:34 GMT
- Title: Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense
- Authors: Siqi Shen, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Soujanya Poria, Rada Mihalcea
- Abstract summary: Large language models (LLMs) have demonstrated substantial commonsense understanding.
This paper examines the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks.
- Score: 98.09670425244462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated substantial commonsense understanding through numerous benchmark evaluations. However, their understanding of cultural commonsense remains largely unexamined. In this paper, we conduct a comprehensive examination of the capabilities and limitations of several state-of-the-art LLMs in the context of cultural commonsense tasks. Using several general and cultural commonsense benchmarks, we find that (1) LLMs have a significant discrepancy in performance when tested on culture-specific commonsense knowledge for different cultures; (2) LLMs' general commonsense capability is affected by cultural context; and (3) The language used to query the LLMs can impact their performance on cultural-related tasks. Our study points to the inherent bias in the cultural understanding of LLMs and provides insights that can help develop culturally aware language models.
Related papers
- All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages [73.93600813999306]
ALM-bench is the largest and most comprehensive effort to date for evaluating LMMs across 100 languages.
It challenges existing models by testing their ability to understand and reason about culturally diverse images paired with text in various languages.
The benchmark offers a robust and nuanced evaluation framework featuring various question formats, including true/false, multiple choice, and open-ended questions.
arXiv Detail & Related papers (2024-11-25T15:44:42Z)
- Survey of Cultural Awareness in Language Models: Text and Beyond [39.77033652289063]
Large-scale deployment of large language models (LLMs) in various applications requires LLMs to be culturally sensitive to the user to ensure inclusivity.
Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive.
arXiv Detail & Related papers (2024-10-30T16:37:50Z)
- Evaluating Cultural Awareness of LLMs for Yoruba, Malayalam, and English [1.3359598694842185]
We explore the ability of various LLMs to comprehend the cultural aspects of two regional languages: Malayalam (state of Kerala, India) and Yoruba (West Africa).
We demonstrate that although LLMs show a high cultural similarity for English, they fail to capture the cultural nuances across these 6 metrics for Malayalam and Yoruba.
This will have huge implications for enhancing the user experience of chat-based LLMs and also improving the validity of large-scale LLM agent-based market research.
arXiv Detail & Related papers (2024-09-14T02:21:17Z)
- Methodology of Adapting Large English Language Models for Specific Cultural Contexts [10.151487049108626]
We propose a rapid adaptation method for large models in specific cultural contexts.
The adapted LLM significantly enhances its capabilities in domain-specific knowledge and adaptability to safety values.
arXiv Detail & Related papers (2024-06-26T09:16:08Z)
- Translating Across Cultures: LLMs for Intralingual Cultural Adaptation [12.5954253354303]
We define the task of cultural adaptation and create an evaluation framework to evaluate the performance of modern LLMs.
We analyze possible issues with automatic adaptation.
We hope that this paper will offer more insight into the cultural understanding of LLMs and their creativity in cross-cultural scenarios.
arXiv Detail & Related papers (2024-06-20T17:06:58Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting [73.94059188347582]
We uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures.
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge [47.57055368312541]
We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and variations in food practices.
We analyze LLMs across various architectures and configurations, evaluating their performance in both monolingual and multilingual settings.
arXiv Detail & Related papers (2024-04-10T08:49:27Z)
- CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge [69.82940934994333]
We introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build a challenging evaluation dataset.
Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions.
CULTURALBENCH-V0.1 is a compact yet high-quality evaluation dataset with users' red-teaming attempts.
arXiv Detail & Related papers (2024-04-10T00:25:09Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.