Measuring Political Stance and Consistency in Large Language Models
- URL: http://arxiv.org/abs/2601.17016v1
- Date: Thu, 15 Jan 2026 06:12:40 GMT
- Title: Measuring Political Stance and Consistency in Large Language Models
- Authors: Salah Feras Alali, Mohammad Nashat Maasfeh, Mucahid Kutlu, Saban Kardas
- Abstract summary: We assess the stances of nine Large Language Models on 24 politically sensitive issues using five prompting techniques. We find that models often adopt opposing stances on several issues; some positions are malleable under prompting, while others remain stable.
- Score: 1.1296803881058548
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the incredible advancements in Large Language Models (LLMs), many people have started using them to satisfy their information needs. However, utilizing LLMs might be problematic for political issues where disagreement is common and model outputs may reflect training-data biases or deliberate alignment choices. To better characterize such behavior, we assess the stances of nine LLMs on 24 politically sensitive issues using five prompting techniques. We find that models often adopt opposing stances on several issues; some positions are malleable under prompting, while others remain stable. Among the models examined, Grok-3-mini is the most persistent, whereas Mistral-7B is the least. For issues involving countries with different languages, models tend to support the side whose language is used in the prompt. Notably, no prompting technique alters model stances on the Qatar blockade or the oppression of Palestinians. We hope these findings raise user awareness when seeking political guidance from LLMs and encourage developers to address these concerns.
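To make the measurement concrete, here is a minimal sketch (not the authors' released code) of how stance persistence across prompting techniques could be scored: a model is persistent on an issue if all five techniques elicit the same stance label. The model names, issue keys, and stance labels below are illustrative placeholders.

```python
# Hypothetical stance labels per model, issue, and prompting technique;
# persistence = fraction of issues with one stance across all techniques.

# stances[model][issue] -> one stance label per prompting technique (5 total)
stances = {
    "grok-3-mini": {
        "issue-01": ["support", "support", "support", "support", "support"],
        "issue-02": ["oppose", "oppose", "oppose", "oppose", "oppose"],
    },
    "mistral-7b": {
        "issue-01": ["support", "neutral", "oppose", "support", "neutral"],
        "issue-02": ["oppose", "oppose", "support", "oppose", "neutral"],
    },
}

def persistence(per_issue):
    """Fraction of issues on which every prompting technique yields the same stance."""
    stable = sum(1 for labels in per_issue.values() if len(set(labels)) == 1)
    return stable / len(per_issue)

for model, per_issue in stances.items():
    print(f"{model}: persistence = {persistence(per_issue):.2f}")
```

Under this toy scoring, grok-3-mini scores 1.00 and mistral-7b scores 0.00, mirroring the persistent/malleable contrast drawn in the abstract.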
Related papers
- Are LLMs Good Safety Agents or a Propaganda Engine? [74.88607730071483]
PSP is a dataset built specifically to probe refusal behaviors in Large Language Models in an explicitly political context. PSP is built by reformatting existing censored content from two openly available data sources: sensitive prompts from China generalized to multiple countries, and tweets that have been censored in various countries. We study: 1) the impact of political sensitivity on seven LLMs through data-driven (making PSP implicit) and representation-level (erasing the concept of politics) approaches; and 2) the vulnerability of models on PSP to prompt injection attacks (PIAs).
arXiv Detail & Related papers (2025-11-28T13:36:00Z)
- Multilingual Political Views of Large Language Models: Identification and Steering [11.071018930042909]
Large language models (LLMs) are increasingly used in everyday tools and applications, raising concerns about their potential influence on political views. We evaluate seven models across 14 languages using the Political Compass Test with 11 semantically equivalent paraphrases per statement to ensure robust measurement. Our results reveal that larger models consistently shift toward libertarian-left positions, with significant variations across languages and model families.
arXiv Detail & Related papers (2025-07-30T12:42:35Z)
- IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance [49.605835017477034]
IssueBench is a set of 2.49m realistic English-language prompts to measure issue bias in large language models. Using IssueBench, we show that issue biases are common and persistent in 10 state-of-the-art LLMs.
arXiv Detail & Related papers (2025-02-12T13:37:03Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Are Large Language Models Consistent over Value-laden Questions? [45.37331974356809]
Large language models (LLMs) appear to bias their survey answers toward certain values.
We define value consistency as the similarity of answers across paraphrases, use-cases, translations, and within a topic.
Unlike prior work, we find that models are relatively consistent across paraphrases, use-cases, translations, and within a topic.
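As a minimal sketch of one way such a consistency score could be computed (assuming answers are discretized into labels; the paper's actual similarity measure may differ), mean pairwise exact-match agreement over paraphrase answers works as follows. The answers below are illustrative placeholders.

```python
# Illustrative value-consistency score: mean exact-match agreement over all
# pairs of answers a model gives to paraphrases of the same question.
from itertools import combinations

def pairwise_consistency(answers):
    """Mean exact-match agreement across all unordered answer pairs."""
    pairs = list(combinations(answers, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

paraphrase_answers = ["agree", "agree", "neutral", "agree"]
print(f"consistency = {pairwise_consistency(paraphrase_answers):.2f}")  # 0.50
```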
arXiv Detail & Related papers (2024-07-03T10:53:54Z)
- Large Language Models' Detection of Political Orientation in Newspapers [0.0]
Various methods have been developed to better understand newspapers' positioning.
The advent of Large Language Models (LLMs) holds disruptive potential to assist researchers and citizens alike.
We compare how four widely employed LLMs rate the positioning of newspapers and examine whether their answers align with one another.
Over a worldwide dataset, individual LLMs position newspaper articles strikingly differently, hinting at inconsistent training or excessive randomness in the algorithms.
arXiv Detail & Related papers (2024-05-23T06:18:03Z)
- Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced to pick from constrained response options.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations [62.91799637259657]
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond?
We study this research question by probing contextualized embeddings and exploring whether such biases are encoded in the models' latent representations.
We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors.
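A logistic Bradley-Terry model puts P(a preferred over b) = sigmoid(w · (h_a - h_b)), which reduces to logistic regression on hidden-vector differences. Below is a minimal sketch of such a probe; the random vectors stand in for real LLM hidden states, and the synthetic labels are placeholders, not the paper's data.

```python
# Minimal logistic Bradley-Terry probe: P(a > b) = sigmoid(w . (h_a - h_b)),
# i.e., intercept-free logistic regression on difference vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim, n_pairs = 64, 500

h_a = rng.normal(size=(n_pairs, dim))  # stand-ins for word a's hidden vectors
h_b = rng.normal(size=(n_pairs, dim))  # stand-ins for word b's hidden vectors
w_true = rng.normal(size=dim)          # synthetic "preference direction"
labels = ((h_a - h_b) @ w_true > 0).astype(int)  # 1 iff a is preferred

probe = LogisticRegression(fit_intercept=False, max_iter=1000)
probe.fit(h_a - h_b, labels)
print(f"train accuracy = {probe.score(h_a - h_b, labels):.2f}")
```

High probe accuracy on held-out pairs would indicate that preference information is linearly decodable from the hidden vectors even when the model declines to state a preference.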
arXiv Detail & Related papers (2023-11-30T18:53:13Z)
- This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models [40.61046400448044]
We show that large language models (LLMs) recall certain geographical knowledge inconsistently when queried in different languages.
As a targeted case study, we consider territorial disputes, an inherently controversial and multilingual task.
We propose a suite of evaluation metrics to precisely quantify bias and consistency in responses across different languages.
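For intuition, here is a minimal sketch (not the paper's actual metric suite) of one such consistency score: check whether the model names the same claimant for a disputed territory regardless of the query language. Territories, languages, and answers below are illustrative placeholders.

```python
# Illustrative cross-language consistency for territorial-dispute questions.

# answers[territory][language] -> claimant the model names in that language
answers = {
    "territory-A": {"en": "country-1", "fr": "country-1", "es": "country-1"},
    "territory-B": {"en": "country-1", "de": "country-2", "ja": "country-2"},
}

def consistent(per_language):
    """True if the model names one claimant across every query language."""
    return len(set(per_language.values())) == 1

score = sum(consistent(v) for v in answers.values()) / len(answers)
print(f"cross-language consistency = {score:.2f}")  # 0.50
```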
arXiv Detail & Related papers (2023-05-24T01:16:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.