GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy
- URL: http://arxiv.org/abs/2407.18008v1
- Date: Thu, 25 Jul 2024 13:04:25 GMT
- Title: GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy
- Authors: Jan Batzner, Volker Stocker, Stefan Schmid, Gjergji Kasneci,
- Abstract summary: We evaluate and compare the alignment of six LLMs by OpenAI, Anthropic, and Cohere with German party positions.
We conduct our prompt experiment for which we use the benchmark and sociodemographic data of leading German parliamentarians.
- Score: 20.06753067241866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs are changing the way humans create and interact with content, potentially affecting citizens' political opinions and voting decisions. As LLMs increasingly shape our digital information ecosystems, auditing to evaluate biases, sycophancy, or steerability has emerged as an active field of research. In this paper, we evaluate and compare the alignment of six LLMs by OpenAI, Anthropic, and Cohere with German party positions and evaluate sycophancy based on a prompt experiment. We contribute to evaluating political bias and sycophancy in multi-party systems across major commercial LLMs. First, we develop the benchmark dataset GermanPartiesQA based on the Voting Advice Application Wahl-o-Mat covering 10 state and 1 national elections between 2021 and 2023. In our study, we find a left-green tendency across all examined LLMs. We then conduct our prompt experiment for which we use the benchmark and sociodemographic data of leading German parliamentarians to evaluate changes in LLMs responses. To differentiate between sycophancy and steerabilty, we use 'I am [politician X], ...' and 'You are [politician X], ...' prompts. Against our expectations, we do not observe notable differences between prompting 'I am' and 'You are'. While our findings underscore that LLM responses can be ideologically steered with political personas, they suggest that observed changes in LLM outputs could be better described as personalization to the given context rather than sycophancy.
Related papers
- Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z) - United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections [45.84205238554709]
Large language models (LLMs) are perceived by some as having the potential to revolutionize social science research.
In this study, we examine to what extent LLM-based predictions of public opinion exhibit context-dependent biases.
We predict voting behavior in the 2024 European Parliament elections using a state-of-the-art LLM.
arXiv Detail & Related papers (2024-08-29T16:01:06Z) - Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion [45.84205238554709]
We generate a synthetic sample of personas matching the individual characteristics of the 2017 German Longitudinal Election Study respondents.
We ask the LLM GPT-3.5 to predict each respondent's vote choice and compare these predictions to the survey-based estimates.
We find that GPT-3.5 does not predict citizens' vote choice accurately, exhibiting a bias towards the Green and Left parties.
arXiv Detail & Related papers (2024-07-11T14:52:18Z) - Assessing Political Bias in Large Language Models [0.624709220163167]
We evaluate the political bias of open-source Large Language Models (LLMs) concerning political issues within the European Union (EU) from a German voter's perspective.
We show that larger models, such as Llama3-70B, tend to align more closely with left-leaning political parties, while smaller models often remain neutral.
arXiv Detail & Related papers (2024-05-17T15:30:18Z) - Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics.
Our investigation delves into the political alignment of LLMs across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z) - Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z) - The Political Preferences of LLMs [0.0]
I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs.
Most conversational LLMs generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints.
I demonstrate that LLMs can be steered towards specific locations in the political spectrum through Supervised Fine-Tuning.
arXiv Detail & Related papers (2024-02-02T02:43:10Z) - LLM Voting: Human Choices and AI Collective Decision Making [0.0]
This paper investigates the voting behaviors of Large Language Models (LLMs), specifically GPT-4 and LLaMA-2.
We observed that the choice of voting methods and the presentation order influenced LLM voting outcomes.
We found that varying the persona can reduce some of these biases and enhance alignment with human choices.
arXiv Detail & Related papers (2024-01-31T14:52:02Z) - Whose Opinions Do Language Models Reflect? [88.35520051971538]
We investigate the opinions reflected by language models (LMs) by leveraging high-quality public opinion polls and their associated human responses.
We find substantial misalignment between the views reflected by current LMs and those of US demographic groups.
Our analysis confirms prior observations about the left-leaning tendencies of some human feedback-tuned LMs.
arXiv Detail & Related papers (2023-03-30T17:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.