The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- URL: http://arxiv.org/abs/2404.16019v2
- Date: Tue, 03 Dec 2024 16:18:10 GMT
- Title: The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- Authors: Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale
- Abstract summary: We map the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 Large Language Models (LLMs).
With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts.
We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.
- Score: 67.38144169029617
- License:
- Abstract: Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people (who) and objectives (to what end) of feedback processes. To navigate these questions, we introduce PRISM, a dataset that maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs. With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts. We target subjective and multicultural perspectives on value-laden and controversial issues, where we expect interpersonal and cross-cultural disagreement. We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.
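The abstract highlights individualised ratings that link back to detailed participant profiles. Below is a minimal sketch of what that join might look like in practice; the file names and column names (survey.csv, ratings.csv, user_id, score, model_name, self_reported_region) are illustrative assumptions, not PRISM's documented schema.

```python
# Minimal sketch: joining per-response ratings to participant profiles.
# All file and column names below are hypothetical placeholders.
import pandas as pd

# One row per participant: sociodemographics and stated preferences.
profiles = pd.read_csv("survey.csv")

# One row per rated model response: which participant scored which model, and how.
ratings = pd.read_csv("ratings.csv")

# Link each fine-grained rating back to the rater's profile, enabling
# per-group analyses or personalised reward modelling.
linked = ratings.merge(profiles, on="user_id", how="left")

# Example: mean rating per model, broken out by the rater's self-reported region.
by_region = (
    linked.groupby(["model_name", "self_reported_region"])["score"]
    .mean()
    .unstack()
)
print(by_region.head())
```

A join of this kind is what allows sample artefacts (e.g. over-representation of one region) to be attributed to specific rater groups rather than treated as properties of the models.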
Related papers
- Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas [4.0937229334408185]
We employ GPT-3.5 to reproduce reactions to persuasive news articles from 7,286 participants from 15 countries.
Our analysis shows that specifying a person's country of residence improves GPT-3.5's alignment with their responses.
In contrast, using native language prompting introduces shifts that significantly reduce overall alignment.
arXiv Detail & Related papers (2024-08-13T14:32:43Z)
- Vision-Language Models under Cultural and Inclusive Considerations [53.614528867159706]
Large vision-language models (VLMs) can assist visually impaired people by describing images from their daily lives.
Current evaluation datasets may not reflect diverse cultural user backgrounds or the situational context of this use case.
We create a survey to determine caption preferences and propose a culture-centric evaluation benchmark by filtering VizWiz, an existing dataset with images taken by people who are blind.
We then evaluate several VLMs, investigating their reliability as visual assistants in a culturally diverse setting.
arXiv Detail & Related papers (2024-07-08T17:50:00Z)
- Language Model Alignment in Multilingual Trolley Problems [138.5684081822807]
Building on the Moral Machine experiment, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP.
Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions.
We discover significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems.
arXiv Detail & Related papers (2024-07-02T14:02:53Z)
- Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback [8.04095222893591]
We find significant gaps in fairness preferences depending on the race, age, political stance, educational level, and LGBTQ+ identity of annotators.
We also demonstrate that demographics mentioned in text have a strong influence on how users perceive individual fairness in moderation.
arXiv Detail & Related papers (2024-06-09T19:42:25Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation [5.9053106775634685]
We introduce D3CODE: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences, annotated by a pool of over 4K annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
arXiv Detail & Related papers (2024-04-16T19:12:03Z)
- Investigating Cultural Alignment of Large Language Models [10.738300803676655]
We investigate whether Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Towards Measuring the Representation of Subjective Global Opinions in Language Models [26.999751306332165]
Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues.
We develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to (a minimal sketch of this kind of distributional comparison appears after this list).
We release our dataset for others to use and build on.
arXiv Detail & Related papers (2023-06-28T17:31:53Z)
- The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation [69.13075715686622]
MuSe 2023 is a set of shared tasks addressing three different contemporary multimodal affect and sentiment analysis problems.
MuSe 2023 seeks to bring together a broad audience from different research communities.
arXiv Detail & Related papers (2023-05-05T08:53:57Z)
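Several of the related papers above (e.g. "Towards Measuring the Representation of Subjective Global Opinions in Language Models" and "Investigating Cultural Alignment of Large Language Models") compare a model's answer distribution on survey questions with the answer distributions of human groups. The sketch below shows one common way to operationalise such a comparison, scoring similarity as 1 minus the Jensen-Shannon distance; the exact metrics used in those papers may differ, and all numbers here are hypothetical.

```python
# Minimal sketch: how similar is a model's answer distribution on a
# multiple-choice survey question to a given group's answer distribution?
import numpy as np
from scipy.spatial.distance import jensenshannon

def opinion_similarity(model_probs: np.ndarray, group_probs: np.ndarray) -> float:
    """Similarity in [0, 1] between two answer distributions for one question."""
    # jensenshannon returns the JS distance (square root of JS divergence).
    return 1.0 - jensenshannon(model_probs, group_probs, base=2)

# Hypothetical four-option question.
model = np.array([0.10, 0.60, 0.20, 0.10])      # model's option probabilities
group_a = np.array([0.15, 0.55, 0.20, 0.10])    # survey distribution, group A
group_b = np.array([0.70, 0.10, 0.10, 0.10])    # survey distribution, group B

print(opinion_similarity(model, group_a))  # closer to 1: more similar
print(opinion_similarity(model, group_b))  # closer to 0: less similar
```

Aggregating such per-question scores across many questions and groups is what lets these papers ask whose opinions a model's responses most closely track.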
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.