Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs
- URL: http://arxiv.org/abs/2402.17649v3
- Date: Thu, 8 Aug 2024 20:06:34 GMT
- Title: Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs
- Authors: Tanise Ceron, Neele Falk, Ana Barić, Dmitry Nikolaev, Sebastian Padó
- Abstract summary: We propose a series of tests to assess the reliability and consistency of large language models' stances on political statements.
We study models ranging in size from 7B to 70B parameters and find that their reliability increases with parameter count.
Larger models show overall stronger alignment with left-leaning parties but differ among policy programs.
- Score: 13.036825846417006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific "worldview" and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanings (Feng et al., 2023; Motoki et al., 2024). However, it is as yet unclear whether these leanings are reliable (robust to prompt variations) and whether they are consistent across policy domains. We propose a series of tests which assess the reliability and consistency of LLMs' stances on political statements based on a dataset of voting-advice questionnaires collected from seven EU countries and annotated for policy issues. We study LLMs ranging in size from 7B to 70B parameters and find that their reliability increases with parameter count. Larger models show overall stronger alignment with left-leaning parties but differ among policy programs: They show a (left-wing) positive stance towards environmental protection, the social welfare state, and a liberal society, but also a (right-wing) positive stance towards law and order, with no consistent preferences in the areas of foreign policy and migration.
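The abstract's notion of reliability (a stance that does not flip under prompt variations) can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's actual test suite: stances are pre-encoded as +1 (agree) / -1 (disagree), whereas the real tests would query an LLM with paraphrased prompts.

```python
from itertools import combinations

def reliability(stances_per_variant):
    """Fraction of prompt-variant pairs yielding the same stance.

    stances_per_variant: list of +1/-1 stances, one per prompt paraphrase
    of the same political statement. 1.0 means fully stable; values near
    0.5 indicate the stance flips under rewording.
    """
    pairs = list(combinations(stances_per_variant, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

# Mock stances for one statement under four prompt paraphrases:
# three "agree", one "disagree" -> 3 of 6 pairs match.
print(reliability([+1, +1, +1, -1]))  # 0.5
```

Consistency, by contrast, would compare aggregated stances across policy domains (e.g. environment vs. law and order) rather than across paraphrases of a single statement.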
Related papers
- Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- When Neutral Summaries are not that Neutral: Quantifying Political Neutrality in LLM-Generated News Summaries [0.0]
This study presents a fresh perspective on quantifying the political neutrality of LLMs.
We consider five pressing issues in current US politics: abortion, gun control/rights, healthcare, immigration, and LGBTQ+ rights.
Our study reveals a consistent trend towards pro-Democratic biases in several well-known LLMs.
arXiv Detail & Related papers (2024-10-13T19:44:39Z)
- Assessing Political Bias in Large Language Models [0.624709220163167]
We evaluate the political bias of open-source Large Language Models (LLMs) concerning political issues within the European Union (EU) from a German voter's perspective.
We show that larger models, such as Llama3-70B, tend to align more closely with left-leaning political parties, while smaller models often remain neutral.
arXiv Detail & Related papers (2024-05-17T15:30:18Z)
- Measuring Political Bias in Large Language Models: What Is Said and How It Is Said [46.1845409187583]
We propose to measure political bias in LLMs by analyzing both the content and the style of their generated text on political issues.
Our proposed measure examines political issues such as reproductive rights and climate change, capturing bias in both the content (the substance of the generation) and the style (the lexical polarity).
arXiv Detail & Related papers (2024-03-27T18:22:48Z)
- Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ+ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- The Political Preferences of LLMs [0.0]
I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs.
Most conversational LLMs generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints.
I demonstrate that LLMs can be steered towards specific locations in the political spectrum through Supervised Fine-Tuning.
arXiv Detail & Related papers (2024-02-02T02:43:10Z)
- Whose Opinions Do Language Models Reflect? [88.35520051971538]
We investigate the opinions reflected by language models (LMs) by leveraging high-quality public opinion polls and their associated human responses.
We find substantial misalignment between the views reflected by current LMs and those of US demographic groups.
Our analysis confirms prior observations about the left-leaning tendencies of some human feedback-tuned LMs.
arXiv Detail & Related papers (2023-03-30T17:17:08Z)
- Millions of Co-purchases and Reviews Reveal the Spread of Polarization and Lifestyle Politics across Online Markets [68.8204255655161]
We study the pervasiveness of polarization and lifestyle politics over different product segments in a diverse market.
We sample 234.6 million relations among 21.8 million market entities to find product categories that are politically relevant, aligned, and polarized.
Cultural products are 4 times more polarized than any other segment.
arXiv Detail & Related papers (2022-01-17T18:16:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.