Aligning Language Models to User Opinions
- URL: http://arxiv.org/abs/2305.14929v1
- Date: Wed, 24 May 2023 09:11:11 GMT
- Title: Aligning Language Models to User Opinions
- Authors: EunJeong Hwang, Bodhisattwa Prasad Majumder, Niket Tandon
- Abstract summary: We find that the opinions of a user and their demographics and ideologies are not mutual predictors.
We use this insight to align LLMs by modeling user opinions as well as user demographics and ideology.
In addition to the typical approach of prompting LLMs with demographics and ideology, we discover that utilizing the most relevant past opinions from individual users enables the model to predict user opinions more accurately.
- Score: 10.953326025836475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An important aspect of developing LLMs that interact with humans is to align models' behavior to their users. It is possible to prompt an LLM into behaving as a certain persona, especially a user-group or ideological persona the model captured during its pretraining stage. But how best to align an LLM with a specific user, rather than with a demographic or ideological group, remains an open question. Mining public opinion surveys (by Pew Research), we find that the opinions of a user and their demographics and ideologies are not mutual predictors. We use this insight to align LLMs by modeling user opinions as well as user demographics and ideology, achieving accuracy gains of up to 7 points in predicting public opinions from survey questions across a broad set of topics. In addition to the typical approach of prompting LLMs with demographics and ideology, we discover that utilizing the most relevant past opinions from individual users enables the model to predict user opinions more accurately.
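The retrieval-plus-persona prompting idea described in the abstract can be prototyped in a few lines. The sketch below is an illustrative assumption, not the authors' method or code: it scores a user's past opinions against a new survey question with a simple token-overlap (Jaccard) similarity, keeps the top k, and folds them into a prompt together with demographics and ideology. The similarity measure and prompt template are placeholders; the paper itself draws past opinions from Pew survey responses.
```python
# Minimal sketch (not the paper's released code): retrieve a user's most
# relevant past opinions for a new survey question and combine them with
# demographics and ideology in the prompt. Jaccard similarity and the prompt
# wording are illustrative assumptions.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two short texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta or tb) else 0.0

def top_k_opinions(question: str, past_opinions: list[str], k: int = 3) -> list[str]:
    """Return the k past opinions most similar to the new survey question."""
    return sorted(past_opinions, key=lambda op: jaccard(question, op), reverse=True)[:k]

def build_prompt(demographics: str, ideology: str,
                 past_opinions: list[str], question: str) -> str:
    """Compose a persona-style prompt from demographics, ideology, and retrieved opinions."""
    opinions_block = "\n".join(f"- {op}" for op in top_k_opinions(question, past_opinions))
    return (
        "You are answering as a survey respondent.\n"
        f"Demographics: {demographics}\n"
        f"Ideology: {ideology}\n"
        "Opinions this person has expressed on related questions:\n"
        f"{opinions_block}\n\n"
        f"Question: {question}\n"
        "Answer as this person would:"
    )

if __name__ == "__main__":
    past = [
        "Gun laws should be stricter in most states.",
        "Public transit deserves more federal funding.",
        "Social media companies have too much influence on the news people see.",
    ]
    print(build_prompt(
        demographics="45-year-old, suburban, college-educated",
        ideology="moderate",
        past_opinions=past,
        question="Do social media platforms have too much influence on political opinions?",
    ))
```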
Related papers
- Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Stereotype or Personalization? User Identity Biases Chatbot Recommendations [54.38329151781466]
We show that large language models (LLMs) produce recommendations that reflect both what the user wants and who the user is.
We find that models generate racially stereotypical recommendations regardless of whether the user revealed their identity intentionally.
Our experiments show that even though a user's revealed identity significantly influences model recommendations, model responses obfuscate this fact in response to user queries.
arXiv Detail & Related papers (2024-10-08T01:51:55Z)
- Evaluating Large Language Model Biases in Persona-Steered Generation [26.92498998306013]
We show that large language models (LLMs) are 9.7% less steerable towards incongruous personas than congruous ones.
Models that are fine-tuned with Reinforcement Learning from Human Feedback (RLHF) are more steerable, especially towards stances associated with political liberals and women.
arXiv Detail & Related papers (2024-05-30T17:06:03Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied but still consequential dimensions, such as age and beauty.
We ask whether LLMs hold wide-reaching biases of positive or negative sentiment toward specific social groups, similar to the "what is beautiful is good" bias documented in experimental psychology.
arXiv Detail & Related papers (2023-09-16T07:07:04Z)
- Whose Opinions Do Language Models Reflect? [88.35520051971538]
We investigate the opinions reflected by language models (LMs) by leveraging high-quality public opinion polls and their associated human responses.
We find substantial misalignment between the views reflected by current LMs and those of US demographic groups.
Our analysis confirms prior observations about the left-leaning tendencies of some human feedback-tuned LMs.
arXiv Detail & Related papers (2023-03-30T17:17:08Z)
- Fine-tuning language models to find agreement among humans with diverse preferences [7.702628192754256]
Recent work on large language models (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user.
Here, we consider how a machine might help people with diverse views find agreement.
We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions.
We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent.
arXiv Detail & Related papers (2022-11-28T02:24:14Z)