Related papers: ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning

ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning

URL: http://arxiv.org/abs/2501.01031v3
Date: Thu, 08 May 2025 01:07:15 GMT
Title: ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning
Authors: Wonduk Seo, Zonghao Yuan, Yi Bu,
Abstract summary: ValuesRAG is a novel framework that integrates cultural and demographic knowledge dynamically during text generation.<n>We evaluate ValuesRAG using 6 diverse regional datasets and show that it consistently outperforms baselines.<n>Our findings underscore the potential of dynamic retrieval-based methods to bridge the gap between global LLM capabilities and localized cultural values.
Score: 1.1343849658875087
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Ensuring cultural values alignment in Large Language Models (LLMs) remains a critical challenge, as these models often embed Western-centric biases from their training data, leading to misrepresentations and fairness concerns in cross-cultural applications. Existing approaches such as role assignment and few-shot learning struggle to address these limitations effectively due to their reliance on pre-trained knowledge, limited scalability, and inability to capture nuanced cultural values. To address these issues, we propose ValuesRAG, a novel and effective framework that applies Retrieval-Augmented Generation (RAG) with In-Context Learning (ICL) to integrate cultural and demographic knowledge dynamically during text generation. Leveraging the World Values Survey (WVS) dataset, ValuesRAG first generates summaries of values for each individual. We subsequently curate several representative regional datasets to serve as test datasets and retrieve relevant summaries of values based on demographic features, followed by a reranking step to select the top-k relevant summaries. We evaluate ValuesRAG using 6 diverse regional datasets and show that it consistently outperforms baselines: including zero-shot, role-assignment, few-shot, and hybrid methods, both in main experiments and ablation settings. Notably, ValuesRAG achieves the best overall performance over prior methods, demonstrating its effectiveness in fostering culturally aligned and inclusive AI systems. Our findings underscore the potential of dynamic retrieval-based methods to bridge the gap between global LLM capabilities and localized cultural values.

Related papers

A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus in LLMs [10.655783463895325]
Large language models (LLMs) exhibit a pronounced WEIRD (Western, Educated, Industrialized, Rich, Democratic) cultural bias.<n>This monocultural perspective may reinforce dominant values and marginalize diverse cultural viewpoints.<n>We introduce a systematic framework designed to boost fair and robust cross-cultural consensus.
arXiv Detail & Related papers (2025-06-16T08:42:39Z)
CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis [41.261808170896686]
CulFiT is a novel training paradigm that leverages multilingual data and fine-grained reward modeling to enhance cultural sensitivity and inclusivity.<n>Our approach synthesizes diverse cultural-related questions, constructs critique data in culturally relevant languages, and employs fine-grained rewards to decompose cultural texts into verifiable knowledge units.
arXiv Detail & Related papers (2025-05-26T04:08:26Z)
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs [57.43233760384488]
Adapting cultural values in Large Language Models (LLMs) presents significant challenges.<n>Prior work primarily aligns LLMs with different cultural values using World Values Survey (WVS) data.<n>In this paper, we investigate WVS-based training for cultural value adaptation and find that relying solely on survey data cane cultural norms and interfere with factual knowledge.
arXiv Detail & Related papers (2025-05-22T09:00:01Z)
RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding [79.44246283490665]
We introduce RAVENEA, a new benchmark designed to advance visual culture understanding through retrieval.<n>RAVENEA focuses on two tasks: culture-focused visual question answering (cVQA) and culture-informed image captioning (cIC)<n>We train and evaluate seven multimodal retrievers for each image query, and measure the downstream impact of retrieval-augmented inputs across fourteen state-of-the-art vision-language models.
arXiv Detail & Related papers (2025-05-20T14:57:16Z)
CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization [50.90288681622152]
Large Language Models (LLMs) more deeply integrate into human life across various regions. Existing approaches develop culturally aligned LLMs through fine-tuning with culture-specific corpora. We introduce CAReDiO, a novel cultural data construction framework.
arXiv Detail & Related papers (2025-04-09T13:40:13Z)
Cultural Learning-Based Culture Adaptation of Language Models [70.1063219524999]
Adapting large language models (LLMs) to diverse cultural values is a challenging task. We present CLCA, a novel framework for enhancing LLM alignment with cultural values based on cultural learning.
arXiv Detail & Related papers (2025-04-03T18:16:26Z)
Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs [7.802103248428407]
We identify and test three assumptions behind current survey-based evaluation methods. We find a high level of instability across presentation formats, incoherence between evaluated versus held-out cultural dimensions, and erratic behavior under prompt steering.
arXiv Detail & Related papers (2025-03-11T17:59:53Z)
An Investigation into Value Misalignment in LLM-Generated Texts for Cultural Heritage [5.893281327912503]
Large Language Models (LLMs) are increasingly prevalent in tasks related to cultural heritage. They are used to generate descriptions of historical monuments, translating ancient texts, preserving oral traditions, and creating educational content. However, cultural value misalignments may exist in generated texts, such as the misrepresentation of historical facts, the erosion of cultural identity, and the oversimplification of complex cultural narratives.
arXiv Detail & Related papers (2025-01-03T14:35:32Z)
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation [50.38159901496538]
Cultural biases in multilingual datasets pose significant challenges for their effectiveness as global benchmarks.<n>We show that progress on MMLU depends heavily on learning Western-centric concepts, with 28% of all questions requiring culturally sensitive knowledge.<n>We release Global-MMLU, an improved MMLU with evaluation coverage across 42 languages.
arXiv Detail & Related papers (2024-12-04T13:27:09Z)
Bottom-Up and Top-Down Analysis of Values, Agendas, and Observations in Corpora and LLMs [1.3119775978504942]
Large language models (LLMs) generate diverse, situated, persuasive texts from a plurality of potential perspectives. We seek to characterize socio-cultural values that they express, for reasons of safety, accuracy, inclusion, and cultural fidelity.
arXiv Detail & Related papers (2024-11-06T18:51:04Z)
Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture [4.467334566487944]
This study introduces a benchmark designed to evaluate the performance of large language models (LLMs) in understanding and processing cultural knowledge. The study develops a multi-dimensional framework that systematically assesses LLMs across six cognitive domains: Remembering, Understanding, Applying, Analyzing, evaluating, and Creating. The results highlight the effectiveness of RAG in improving accuracy across all cognitive domains, particularly in tasks requiring precise retrieval and application of cultural knowledge.
arXiv Detail & Related papers (2024-09-03T02:50:04Z)
Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks. We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts. We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models [38.932610459192105]
We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs) Our work underscores the value of using diverse data to create more inclusive multimodal systems.
arXiv Detail & Related papers (2024-05-22T16:04:22Z)
CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge [69.82940934994333]
We introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build challenging evaluation dataset. Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions. CULTURALBENCH-V0.1 is a compact yet high-quality evaluation dataset with users' red-teaming attempts.
arXiv Detail & Related papers (2024-04-10T00:25:09Z)
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition. Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages. Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z)
Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs) We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing. We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.