The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- URL: http://arxiv.org/abs/2404.16019v1
- Date: Wed, 24 Apr 2024 17:51:36 GMT
- Title: The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- Authors: Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale
- Abstract summary: We introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs.
 PRISM contributes (i) wide geographic and demographic participation in human feedback data; (ii) two census-representative samples for understanding collective welfare (UK and US); and (iii) individualised feedback where every rating is linked to a detailed participant profile.
- Score: 67.38144169029617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Human feedback plays a central role in the alignment of Large Language Models (LLMs). However, open questions remain about the methods (how), domains (where), people (who) and objectives (to what end) of human feedback collection. To navigate these questions, we introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs. PRISM contributes (i) wide geographic and demographic participation in human feedback data; (ii) two census-representative samples for understanding collective welfare (UK and US); and (iii) individualised feedback where every rating is linked to a detailed participant profile, thus permitting exploration of personalisation and attribution of sample artefacts. We focus on collecting conversations that centre subjective and multicultural perspectives on value-laden and controversial topics, where we expect the most interpersonal and cross-cultural disagreement. We demonstrate the usefulness of PRISM via three case studies of dialogue diversity, preference diversity, and welfare outcomes, showing that it matters which humans set alignment norms. As well as offering a rich community resource, we advocate for broader participation in AI development and a more inclusive approach to technology design. 
 
      
Related papers
- Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset [15.639249716288953]
 We show that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs.
We argue that this motivates the need for negatively-correlated sampling when generating candidate sets.
We collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date.
 arXiv  Detail & Related papers  (2025-07-13T14:34:22Z)
- Human Preferences for Constructive Interactions in Language Model Alignment [0.0]
 We examined how linguistic attributes linked to constructive interactions are reflected in human preference data used for training AI.
We found that users consistently preferred well-reasoned and nuanced responses while rejecting those high in personal storytelling.
 arXiv  Detail & Related papers  (2025-03-05T15:08:41Z)
- Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas [4.0937229334408185]
 We employ GPT-3.5 to reproduce reactions to persuasive news articles from 7,286 participants from 15 countries.
Our analysis shows that specifying a person's country of residence improves GPT-3.5's alignment with their responses.
In contrast, using native language prompting introduces shifts that significantly reduce overall alignment.
 arXiv  Detail & Related papers  (2024-08-13T14:32:43Z)
- Vision-Language Models under Cultural and Inclusive Considerations [53.614528867159706]
 Large vision-language models (VLMs) can assist visually impaired people by describing images from their daily lives.
Current evaluation datasets may not reflect diverse cultural user backgrounds or the situational context of this use case.
We create a survey to determine caption preferences and propose a culture-centric evaluation benchmark by filtering VizWiz, an existing dataset with images taken by people who are blind.
We then evaluate several VLMs, investigating their reliability as visual assistants in a culturally diverse setting.
 arXiv  Detail & Related papers  (2024-07-08T17:50:00Z)
- Language Model Alignment in Multilingual Trolley Problems [138.5684081822807]
 Building on the Moral Machine experiment, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP.
Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions.
We discover significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems.
 arXiv  Detail & Related papers  (2024-07-02T14:02:53Z)
- Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback [8.04095222893591]
 We find significant gaps in fairness preferences depending on the race, age, political stance, educational level, and LGBTQ+ identity of annotators.
We also demonstrate that demographics mentioned in text have a strong influence on how users perceive individual fairness in moderation.
 arXiv  Detail & Related papers  (2024-06-09T19:42:25Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
 The "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs).
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
 arXiv  Detail & Related papers  (2024-05-22T20:19:10Z)
- D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation [5.9053106775634685]
 We introduce D3CODE, a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4K annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
 arXiv  Detail & Related papers  (2024-04-16T19:12:03Z)
- Investigating Cultural Alignment of Large Language Models [10.738300803676655]
 We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
 arXiv  Detail & Related papers  (2024-02-20T18:47:28Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
 Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
 arXiv  Detail & Related papers  (2023-11-08T19:01:13Z)
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
 We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
 arXiv  Detail & Related papers  (2023-10-02T17:40:01Z)
- Towards Measuring the Representation of Subjective Global Opinions in Language Models [26.999751306332165]
 Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues.
We develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to.
We release our dataset for others to use and build on.
 arXiv  Detail & Related papers  (2023-06-28T17:31:53Z)
- The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation [69.13075715686622]
 MuSe 2023 is a set of shared tasks addressing three different contemporary multimodal affect and sentiment analysis problems.
MuSe 2023 seeks to bring together a broad audience from different research communities.
 arXiv  Detail & Related papers  (2023-05-05T08:53:57Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
 We present Vyaktitv, a novel peer-to-peer Hindi conversation dataset.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features for all participants, such as income and cultural orientation, among several others.
 arXiv  Detail & Related papers  (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     