Aligning Language Models to User Opinions
- URL: http://arxiv.org/abs/2305.14929v1
- Date: Wed, 24 May 2023 09:11:11 GMT
- Title: Aligning Language Models to User Opinions
- Authors: EunJeong Hwang, Bodhisattwa Prasad Majumder, Niket Tandon
- Abstract summary: We find that the opinions of a user and their demographics and ideologies are not mutual predictors.
We use this insight to align LLMs by modeling both user opinions as well as user demographics and ideology.
In addition to the typical approach of prompting LLMs with demographics and ideology, we discover that utilizing the most relevant past opinions from individual users enables the model to predict user opinions more accurately.
- Score: 10.953326025836475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An important aspect of developing LLMs that interact with humans is to align
models' behavior to their users. It is possible to prompt an LLM into behaving
as a certain persona, especially a user group or ideological persona the model
captured during its pretraining stage. But, how to best align an LLM with a
specific user and not a demographic or ideological group remains an open
question. Mining public opinion surveys (by Pew Research), we find that the
opinions of a user and their demographics and ideologies are not mutual
predictors. We use this insight to align LLMs by modeling both user opinions as
well as user demographics and ideology, achieving up to 7 points accuracy gains
in predicting public opinions from survey questions across a broad set of
topics. In addition to the typical approach of prompting LLMs with demographics
and ideology, we discover that utilizing the most relevant past opinions from
individual users enables the model to predict user opinions more accurately.
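The idea of combining demographics, ideology, and a user's most relevant past opinions in one prompt can be sketched as below. This is a minimal illustration, not the authors' implementation: the helper names (`top_k_opinions`, `build_prompt`), the prompt wording, and the example survey items are hypothetical, and relevance is approximated here with simple word-overlap (Jaccard) similarity as a stand-in for whatever retrieval method the paper actually uses.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in a string."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two strings (assumed relevance proxy)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def top_k_opinions(question, past_opinions, k=2):
    """Pick the k past (question, answer) pairs most similar to the new question."""
    ranked = sorted(
        past_opinions,
        key=lambda qa: jaccard(question, qa[0]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, choices, demographics, ideology, past_opinions, k=2):
    """Assemble a prompt from demographics, ideology, and relevant past opinions."""
    lines = ["A person can be described as follows:"]
    for key, value in demographics.items():
        lines.append(f"{key}: {value}")
    lines.append(f"Ideology: {ideology}")
    lines.append("")
    lines.append("The person has previously answered:")
    for past_q, past_a in top_k_opinions(question, past_opinions, k):
        lines.append(f"Q: {past_q} A: {past_a}")
    lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Options: " + " / ".join(choices))
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical survey data, made up for illustration only.
past = [
    ("How much do you worry about gun violence in your community?", "A great deal"),
    ("Do you favor stricter gun laws?", "Favor"),
    ("How often do you use social media?", "Daily"),
]
prompt = build_prompt(
    question="Do you favor or oppose stricter gun laws in your state?",
    choices=["Favor", "Oppose"],
    demographics={"Age": "30-49", "Region": "Midwest"},
    ideology="Moderate",
    past_opinions=past,
    k=2,
)
print(prompt)
```

The resulting string would then be sent to an LLM of choice. Only the two gun-related past answers are included in the prompt, while the unrelated social media item is left out.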
Related papers
- Evaluating Large Language Model Biases in Persona-Steered Generation [26.92498998306013]
We show that large language models (LLMs) are 9.7% less steerable towards incongruous personas than congruous ones.
Models that are fine-tuned with Reinforcement Learning from Human Feedback (RLHF) are more steerable, especially towards stances associated with political liberals and women.
arXiv Detail & Related papers (2024-05-30T17:06:03Z)
- Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied, but still consequential, dimensions such as age and beauty.
We ask whether LLMs hold wide-reaching biases of positive or negative sentiment for specific social groups similar to the "what is beautiful is good" bias found in people in experimental psychology.
arXiv Detail & Related papers (2023-09-16T07:07:04Z)
- Whose Opinions Do Language Models Reflect? [88.35520051971538]
We investigate the opinions reflected by language models (LMs) by leveraging high-quality public opinion polls and their associated human responses.
We find substantial misalignment between the views reflected by current LMs and those of US demographic groups.
Our analysis confirms prior observations about the left-leaning tendencies of some human feedback-tuned LMs.
arXiv Detail & Related papers (2023-03-30T17:17:08Z)
- Can ChatGPT Assess Human Personalities? A General Evaluation Framework [70.90142717649785]
Large Language Models (LLMs) have produced impressive results in various areas, but their potential human-like psychology is still largely unexplored.
This paper presents a generic evaluation framework for LLMs to assess human personalities based on Myers Briggs Type Indicator (MBTI) tests.
arXiv Detail & Related papers (2023-03-01T06:16:14Z)
- Fine-tuning language models to find agreement among humans with diverse preferences [7.702628192754256]
Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user.
Here, we consider how a machine might help people with diverse views find agreement.
We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions.
We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent.
arXiv Detail & Related papers (2022-11-28T02:24:14Z)