Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset
- URL: http://arxiv.org/abs/2507.09650v1
- Date: Sun, 13 Jul 2025 14:34:22 GMT
- Title: Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset
- Authors: Lily Hong Zhang, Smitha Milli, Karen Jusko, Jonathan Smith, Brandon Amos, Wassim Bouaziz, Manon Revel, Jack Kussman, Lisa Titus, Bhaktipriya Radharapu, Jane Yu, Vidya Sarma, Kris Rose, Maximilian Nickel
- Abstract summary: We show that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. We argue that this motivates the need for negatively-correlated sampling when generating candidate sets. We collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date.
- Score: 15.639249716288953
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To advance this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.
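The abstract names prompt-based negatively-correlated sampling but does not spell out the prompts. The sketch below shows one plausible instantiation, in which each candidate is conditioned on a contrasting instruction so that the set spans different value framings instead of being near-duplicate i.i.d. draws from one model. The `openai` client, model name, and instruction axes are illustrative assumptions, not the paper's prompts.

```python
# Minimal sketch of prompt-based negatively-correlated sampling:
# rather than drawing i.i.d. samples from one model, condition each
# candidate on a contrasting instruction so the candidate set is
# deliberately diverse. The client, model name, and contrast axes
# below are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

CONTRASTING_INSTRUCTIONS = [
    "Answer with a pragmatic, individual-responsibility framing.",
    "Answer with a communitarian, collective-welfare framing.",
    "Answer tersely and directly, without hedging.",
    "Answer cautiously, surfacing trade-offs and uncertainty.",
]

def sample_candidates(question: str, model: str = "gpt-4o-mini") -> list[str]:
    """Return one candidate response per contrasting instruction."""
    candidates = []
    for instruction in CONTRASTING_INSTRUCTIONS:
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": instruction},
                {"role": "user", "content": question},
            ],
            temperature=1.0,
        )
        candidates.append(resp.choices[0].message.content)
    return candidates
```

Because the candidates now differ along salient dimensions, an annotator's choice between them carries signal about heterogeneous preferences rather than re-ranking stylistic near-duplicates.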
Related papers
- Fair-PP: A Synthetic Dataset for Aligning LLM with Personalized Preferences of Social Equity [33.36483739554757]
We introduce Fair-PP, a synthetic dataset of personalized preferences targeting social equity. We also contribute (i) an automated framework for generating preference data, along with a more fine-grained dataset of personalized preferences; (ii) an analysis of how existing mainstream language models are positioned across five major global regions within the personalized preference space; and (iii) a sample reweighting method for personalized preference alignment (a sketch follows this entry).
arXiv Detail & Related papers (2025-05-17T06:02:00Z)
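The summary does not specify Fair-PP's reweighting rule; the sketch below is one common instantiation, offered purely as an assumption: upweight preference pairs whose annotator profile is close to a target persona, then scale each pair's loss by its weight.

```python
# Hedged sketch of persona-based sample reweighting for preference
# alignment. The cosine-similarity weighting and softmax temperature
# are assumptions; Fair-PP's actual rule may differ.
import numpy as np

def reweight(annotator_profiles: np.ndarray,
             target_persona: np.ndarray,
             temperature: float = 0.1) -> np.ndarray:
    """Softmax weights over preference pairs, favoring annotators
    whose profile vector is close to the target persona."""
    profiles = annotator_profiles / np.linalg.norm(
        annotator_profiles, axis=1, keepdims=True)
    target = target_persona / np.linalg.norm(target_persona)
    sims = profiles @ target            # cosine similarity per pair
    w = np.exp(sims / temperature)
    return w / w.sum()                  # weights sum to 1

def weighted_bt_loss(margins: np.ndarray, weights: np.ndarray) -> float:
    """Weighted Bradley-Terry negative log-likelihood;
    margins[i] = r(chosen_i) - r(rejected_i) under the reward model."""
    return float(-(weights * np.log(1.0 / (1.0 + np.exp(-margins)))).sum())
```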
- Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning [59.56171041796373]
We harvest multi-modal instructional data in a robust and efficient manner. We take interaction style as a diversity indicator and use a multi-modal rich styler to identify data instruction patterns. Across 10+ experimental settings, validated by 14 multi-modal benchmarks, we demonstrate consistent improvements over random sampling, baseline strategies, and state-of-the-art selection methods.
arXiv Detail & Related papers (2025-03-17T17:11:22Z)
- PluralLLM: Pluralistic Alignment in LLMs via Federated Learning [7.752864126266439]
We introduce PluralLLM, a federated learning-based approach that enables multiple user groups to collaboratively train a transformer-based preference predictor without sharing sensitive data. Our method leverages Federated Averaging (FedAvg) to aggregate preference updates efficiently, achieving 46% faster convergence, a 4% improvement in alignment scores, and nearly the same group-fairness measure as centralized training (a sketch of the aggregation step follows this entry).
arXiv Detail & Related papers (2025-03-13T00:45:27Z)
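Federated Averaging itself is standard; below is a minimal sketch of the aggregation step referenced above. Plain-dict parameters are a framework-agnostic assumption (PluralLLM's predictor is transformer-based).

```python
# Minimal FedAvg sketch: each user group trains a local copy of the
# preference predictor, and the server averages parameters weighted
# by local dataset size.
import numpy as np

def fedavg(local_params: list[dict[str, np.ndarray]],
           num_examples: list[int]) -> dict[str, np.ndarray]:
    """Return the example-count-weighted average of local parameters."""
    total = sum(num_examples)
    global_params = {}
    for name in local_params[0]:
        global_params[name] = sum(
            (n / total) * params[name]
            for params, n in zip(local_params, num_examples)
        )
    return global_params
```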
- ComPO: Community Preferences for Language Model Personalization [122.54846260663922]
ComPO is a method to personalize preference optimization in language models.
We collect and release ComPRed, a question answering dataset with community-level preferences from Reddit.
arXiv Detail & Related papers (2024-10-21T14:02:40Z)
- The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models [67.38144169029617]
We map the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 Large Language Models (LLMs). With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts. We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.
arXiv Detail & Related papers (2024-04-24T17:51:36Z)
- MaxMin-RLHF: Alignment with Diverse Human Preferences [101.57443597426374]
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data. We instead learn a mixture of preference distributions via an expectation-maximization algorithm to better represent diverse human preferences (see the sketch after this entry). Our algorithm achieves an average improvement of more than 16% in win rates over conventional RLHF algorithms.
arXiv Detail & Related papers (2024-02-14T03:56:27Z)
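A hedged sketch of the expectation-maximization idea behind MaxMin-RLHF: fit K Bradley-Terry reward components so that distinct preference "types" receive their own reward function. Linear rewards on feature differences and a single gradient M-step per iteration are simplifying assumptions, not the paper's exact procedure.

```python
# EM for a mixture of K Bradley-Terry reward models over pairwise
# comparisons. X_diff[i] = features(chosen_i) - features(rejected_i).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def em_mixture_bt(X_diff: np.ndarray, K: int = 2, iters: int = 50,
                  lr: float = 0.1, seed: int = 0):
    rng = np.random.default_rng(seed)
    n, d = X_diff.shape
    W = rng.normal(size=(K, d))      # one linear reward per component
    pi = np.full(K, 1.0 / K)         # mixing proportions
    for _ in range(iters):
        # E-step: responsibility of component k for comparison i
        lik = sigmoid(X_diff @ W.T)  # shape (n, K)
        resp = pi * lik
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted logistic gradient ascent
        for k in range(K):
            grad = (resp[:, k] * (1.0 - lik[:, k])) @ X_diff
            W[k] += lr * grad
        pi = resp.mean(axis=0)
    return W, pi
```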
- On Diversified Preferences of Large Language Model Alignment [51.26149027399505]
This paper presents the first quantitative analysis of the experimental scaling law for reward models with varying sizes.
Our analysis reveals that the impact of diversified human preferences depends on both model size and data size.
Larger models with sufficient capacity mitigate the negative effects of diverse preferences, while smaller models struggle to accommodate them.
arXiv Detail & Related papers (2023-12-12T16:17:15Z)
- Sample Efficient Preference Alignment in LLMs via Active Exploration [63.84454768573154]
We take advantage of the fact that one can often choose the contexts at which to obtain human feedback in order to identify a good policy most efficiently. We propose an active exploration algorithm to select data efficiently and prove a worst-case regret bound for it (a sketch of one acquisition rule follows this entry). Our method outperforms the baselines with limited samples of human preferences on several language models and four real-world datasets.
arXiv Detail & Related papers (2023-12-01T00:54:02Z)
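The paper's acquisition rule and regret analysis are more specific; as a stand-in assumption, the sketch below picks the contexts where an ensemble of reward models disagrees most, which is one standard way to realize active exploration for preference queries.

```python
# Hedged sketch: query human feedback at the contexts with the
# highest ensemble disagreement. The variance-based acquisition rule
# is an assumption, not the paper's algorithm.
import numpy as np

def select_contexts(candidate_scores: np.ndarray, budget: int) -> np.ndarray:
    """candidate_scores[m, i] = reward ensemble member m's score for
    context i's leading response pair. Returns indices of the `budget`
    most-contested contexts."""
    disagreement = candidate_scores.var(axis=0)   # variance across members
    return np.argsort(-disagreement)[:budget]     # most uncertain first
```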
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting [19.79214899011072]
This paper formalizes diversity of representation in generative large language models.
We present evaluation datasets and propose metrics to measure diversity in generated responses along people and culture axes.
We find that LLMs understand the notion of diversity and can reason about and critique their own responses toward that goal (an illustrative loop follows this entry).
arXiv Detail & Related papers (2023-10-25T10:17:17Z)
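An illustrative loop in the spirit of collective critiques with self-voting, assuming an OpenAI-compatible chat endpoint; the prompts and model name are placeholders, not the paper's method.

```python
# Sketch: generate a candidate set, have the model critique the set
# for diversity of representation, then self-vote on a revised answer.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def critique_and_revise(question: str, n: int = 4) -> str:
    candidates = [ask(question) for _ in range(n)]
    joined = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    critique = ask(
        "Critique the following answers as a set: which people or "
        f"cultures are underrepresented?\n\n{joined}"
    )
    # Self-voting: the model picks (or rewrites) the answer that best
    # addresses its own critique.
    return ask(
        f"Question: {question}\n\nCandidates:\n{joined}\n\n"
        f"Critique: {critique}\n\nReturn the single best answer, "
        "revised to address the critique."
    )
```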