ACE-Align: Attribute Causal Effect Alignment for Cultural Values under Varying Persona Granularities
- URL: http://arxiv.org/abs/2601.12962v1
- Date: Mon, 19 Jan 2026 11:18:25 GMT
- Title: ACE-Align: Attribute Causal Effect Alignment for Cultural Values under Varying Persona Granularities
- Authors: Jiatang Luo, Bingbing Xu, Rongxin Chen, Xiaoyan Zhao, Yang Zhang, Liang Pang, Zhiyong Huang, Tat-Seng Chua, Huawei Shen,
- Abstract summary: We propose ACE-Align, a causal-effect framework that aligns how demographic attributes shift different cultural values. Across all persona granularities, ACE-Align consistently outperforms baselines. It improves geographic equity by reducing the average alignment gap between high-resource and low-resource regions from 9.81 to 4.92 points.
- Score: 76.52901967874622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring that large language models (LLMs) respect diverse cultural values is crucial for social equity. However, existing approaches often treat cultural groups as homogeneous and overlook within-group heterogeneity induced by intersecting demographic attributes, leading to unstable behavior under varying persona granularity. We propose ACE-Align (Attribute Causal Effect Alignment), a causal-effect framework that aligns how specific demographic attributes shift different cultural values, rather than treating each culture as a homogeneous group. We evaluate ACE-Align across 14 countries spanning five continents, with personas specified by subsets of four attributes (gender, education, residence, and marital status) and granularity instantiated by the number of specified attributes. Across all persona granularities, ACE-Align consistently outperforms baselines. Moreover, it improves geographic equity by reducing the average alignment gap between high-resource and low-resource regions from 9.81 to 4.92 points, while Africa shows the largest average gain (+8.48 points). Code is available at https://github.com/Wells-Luo/ACE-Align.
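To make the notion of persona granularity concrete, below is a minimal illustrative sketch (not the authors' released implementation; the attribute value vocabularies are assumed placeholders) that enumerates persona specifications from subsets of the four attributes. Granularity is simply the number of attributes a persona specifies.

```python
from itertools import combinations

# The paper names the four attributes (gender, education, residence,
# marital status); these value sets are hypothetical and for illustration only.
ATTRIBUTES = {
    "gender": ["female", "male"],
    "education": ["primary", "secondary", "tertiary"],
    "residence": ["urban", "rural"],
    "marital_status": ["single", "married"],
}


def _expand(attrs):
    """Recursively assign a value to each attribute in `attrs`."""
    if not attrs:
        yield {}
        return
    head, *rest = attrs
    for value in ATTRIBUTES[head]:
        for partial in _expand(tuple(rest)):
            yield {head: value, **partial}


def persona_templates(granularity: int):
    """Yield every persona specified by exactly `granularity` attributes.

    Granularity 0 leaves all attributes unspecified (country-only persona);
    granularity 4 fixes all four attributes, the finest-grained case.
    """
    for attrs in combinations(ATTRIBUTES, granularity):
        yield from _expand(attrs)


if __name__ == "__main__":
    for g in range(5):
        n = sum(1 for _ in persona_templates(g))
        print(f"granularity {g}: {n} persona specifications")
```

A persona prompt for a given country would then pair the country with one such attribute specification, so the same country is evaluated under coarser and finer personas.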
Related papers
- I Am Aligned, But With Whom? MENA Values Benchmark for Evaluating Cultural Alignment and Multilingual Bias in LLMs [5.060243371992739]
We introduce MENAValues, a novel benchmark designed to evaluate the cultural alignment and multilingual biases of large language models (LLMs). Drawing from large-scale, authoritative human surveys, we curate a structured dataset that captures the sociocultural landscape of MENA with population-level response distributions from 16 countries. Our analysis reveals three critical phenomena: "Cross-Lingual Value Shifts", where identical questions yield drastically different responses based on language; "Reasoning-Induced Degradation", where prompting models to explain their reasoning worsens cultural alignment; and "Logit Leakage", where models refuse sensitive questions while internal probabilities reveal strong hidden preferences.
arXiv Detail & Related papers (2025-10-15T05:10:57Z) - Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World [68.19795061447044]
This paper investigates cross-cultural transfer of commonsense reasoning in the Arab world. Using a culturally grounded commonsense reasoning dataset covering 13 Arab countries, we evaluate lightweight alignment methods. Our results show that merely 12 culture-specific examples from one country can improve performance in others by 10% on average.
arXiv Detail & Related papers (2025-09-23T17:24:14Z) - Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset [18.197532754060244]
We show that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. We argue that this motivates the need for negatively-correlated sampling when generating candidate sets. We collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date.
arXiv Detail & Related papers (2025-07-13T14:34:22Z) - CAIRe: Cultural Attribution of Images by Retrieval-Augmented Evaluation [61.130639734982395]
We introduce CAIRe, a novel evaluation metric that assesses the degree of cultural relevance of an image. Our framework grounds entities and concepts in the image to a knowledge base and uses factual information to give independent graded judgments for each culture label.
arXiv Detail & Related papers (2025-06-10T17:16:23Z) - CARE: Multilingual Human Preference Learning for Cultural Awareness [48.760262639641496]
We introduce CARE, a multilingual resource containing 3,490 culturally specific questions and 31.7k responses with human judgments. We demonstrate how a modest amount of high-quality native preferences improves cultural awareness across various LMs. Our analyses reveal that models with stronger initial cultural performance benefit more from alignment.
arXiv Detail & Related papers (2025-04-07T14:57:06Z) - VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare [9.087074203425061]
Existing alignment paradigms fail to account for the diversity of perspectives across cultures, demographics, and communities. This is particularly critical in health-related scenarios, where plurality is essential due to the influence of culture, religion, personal values, and conflicting opinions. This work highlights the limitations of current approaches and lays the groundwork for developing health-specific alignment solutions.
arXiv Detail & Related papers (2025-02-19T14:38:57Z) - Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback [8.04095222893591]
We find significant gaps in fairness preferences depending on the race, age, political stance, educational level, and LGBTQ+ identity of annotators.
We also demonstrate that demographics mentioned in text have a strong influence on how users perceive individual fairness in moderation.
arXiv Detail & Related papers (2024-06-09T19:42:25Z) - D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation [5.9053106775634685]
We introduce D3CODE: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4k annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
arXiv Detail & Related papers (2024-04-16T19:12:03Z) - GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods [62.076647211744564]
We propose GIVL, a Geographically Inclusive Vision-and-Language Pre-trained model.
There are two attributes of geo-diverse visual concepts which can help to learn geo-diverse knowledge: 1) concepts under similar categories have unique knowledge and visual characteristics, 2) concepts with similar visual features may fall in completely different categories.
Compared with similar-size models pre-trained with similar scale of data, GIVL achieves state-of-the-art (SOTA) and more balanced performance on geo-diverse V&L tasks.
arXiv Detail & Related papers (2023-01-05T03:43:45Z) - Mitigating Face Recognition Bias via Group Adaptive Classifier [53.15616844833305]
This work aims to learn a fair face representation, where faces of every group could be more equally represented.
Our work is able to mitigate face recognition bias across demographic groups while maintaining competitive accuracy.
arXiv Detail & Related papers (2020-06-13T06:43:37Z)