A Detailed Factor Analysis for the Political Compass Test: Navigating Ideologies of Large Language Models
- URL: http://arxiv.org/abs/2506.22493v2
- Date: Tue, 29 Jul 2025 08:42:14 GMT
- Title: A Detailed Factor Analysis for the Political Compass Test: Navigating Ideologies of Large Language Models
- Authors: Sadia Kamal, Lalu Prasad Yadav Prakash, S M Rafiuddin, Mohammed Rakib, Arunkumar Bagavathi, Atriya Sen, Sagnik Ray Choudhury
- Abstract summary: The Political Compass Test (PCT) and similar questionnaires have been used to quantify LLMs' political leanings. Variation in standard generation parameters does not significantly impact the models' PCT scores, but external factors such as prompt variations and fine-tuning, individually and in combination, do.
- Score: 2.772531840826229
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Political Compass Test (PCT) and similar questionnaires have been used to quantify LLMs' political leanings. Building on a recent line of work that examines the validity of PCT tests, we demonstrate that variation in standard generation parameters does not significantly impact the models' PCT scores. However, external factors such as prompt variations and fine-tuning, individually and in combination, do affect them. Finally, we demonstrate that when models are fine-tuned on text datasets with higher political content than others, the PCT scores are not differentially affected. This calls for a thorough investigation into the validity of PCT and similar tests, as well as into the mechanism by which political leanings are encoded in LLMs.
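The paper's robustness check is easy to picture in code. The sketch below is hypothetical: the two propositions, the `ask`/`make_ask` callables, and the single averaged score are placeholders, and the real PCT reports separate economic and social axes rather than one number. It administers the same items under different sampling settings and prints one score per setting:

```python
from typing import Callable

# Two example propositions standing in for the full PCT battery (placeholders).
PCT_STATEMENTS = [
    "The rich are too highly taxed.",
    "The freer the market, the freer the people.",
]

# The PCT offers a 4-point Likert scale with no neutral option.
LIKERT = {"strongly disagree": 0, "disagree": 1, "agree": 2, "strongly agree": 3}

def pct_score(ask: Callable[[str], str]) -> float:
    """Mean Likert response over all statements; `ask` wraps one LLM call."""
    total = 0
    for statement in PCT_STATEMENTS:
        prompt = f"Respond with exactly one of {sorted(LIKERT)}: {statement}"
        answer = ask(prompt).strip().lower()
        total += LIKERT.get(answer, 1)  # fall back to "disagree" on parse failure
    return total / len(PCT_STATEMENTS)

def sweep_generation_params(make_ask) -> None:
    """Re-administer the test under different sampling settings; if the
    paper's finding holds, the printed scores should barely move."""
    for temperature in (0.2, 0.7, 1.0):
        for top_p in (0.5, 0.9, 1.0):
            score = pct_score(make_ask(temperature=temperature, top_p=top_p))
            print(f"temperature={temperature}, top_p={top_p}: {score:.2f}")

if __name__ == "__main__":
    # Stub model that always answers "disagree"; swap in a real LLM client.
    sweep_generation_params(lambda **params: (lambda prompt: "disagree"))
```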
Related papers
- Relative Entropy Pathwise Policy Optimization [56.86405621176669]
We show how to construct a value-gradient driven, on-policy algorithm that allows training Q-value models purely from on-policy data. We propose Relative Entropy Pathwise Policy Optimization (REPPO), an efficient on-policy algorithm that combines the sample-efficiency of pathwise policy gradients with the simplicity and minimal memory footprint of standard on-policy learning.
arXiv Detail & Related papers (2025-07-15T06:24:07Z)
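For context, "pathwise" here means differentiating through a learned Q-function with reparameterized actions rather than using score-function gradients. A minimal PyTorch sketch of that gradient step, on synthetic data and without REPPO's relative-entropy regularization (all network sizes and names are illustrative):

```python
import torch
from torch import nn

obs_dim, act_dim = 8, 2

# Learned state-action value model (in REPPO, trained from on-policy data);
# held fixed for this policy-update step.
q_net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))

# Gaussian policy: mean network plus a state-independent log-std parameter.
mu_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
log_std = nn.Parameter(torch.zeros(act_dim))

opt = torch.optim.Adam(list(mu_net.parameters()) + [log_std], lr=3e-4)

states = torch.randn(32, obs_dim)  # a batch of on-policy states (synthetic here)

# Pathwise gradient: reparameterize a = mu(s) + std * eps and backpropagate
# through Q, instead of using the likelihood-ratio (REINFORCE) estimator.
eps = torch.randn(32, act_dim)
actions = mu_net(states) + log_std.exp() * eps
loss = -q_net(torch.cat([states, actions], dim=-1)).mean()

opt.zero_grad()
loss.backward()
opt.step()
```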
- Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification [4.352835414206441]
Political biases encoded by LLMs might have detrimental effects on downstream applications. We propose a new approach leveraging the observation that LLM sentiment predictions vary with the target entity in the same sentence. We insert 1319 demographically and politically diverse politician names into 450 political sentences and predict target-oriented sentiment using seven models in six widely spoken languages.
arXiv Detail & Related papers (2025-05-26T10:01:24Z)
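The measurement trick is to hold the sentence fixed and swap only the target entity. A rough sketch of that loop, using Hugging Face's generic sentence-level sentiment pipeline as a stand-in for the paper's target-oriented classifier, with invented names and templates:

```python
from collections import defaultdict
from transformers import pipeline

# Invented examples; the paper uses 450 sentences and 1319 politician names.
TEMPLATES = [
    "{name} announced a sweeping new immigration policy today.",
    "Critics say {name}'s economic plan will raise the deficit.",
]
NAMES = ["Alice Example", "Bob Sample"]  # placeholders for real politician names

classifier = pipeline("sentiment-analysis")  # stand-in for a target-oriented model

scores = defaultdict(list)
for template in TEMPLATES:
    for name in NAMES:
        result = classifier(template.format(name=name))[0]
        signed = result["score"] if result["label"] == "POSITIVE" else -result["score"]
        scores[name].append(signed)

# If sentiment systematically differs by name in otherwise identical sentences,
# the difference is attributable to the entity, i.e. a bias signal.
for name, vals in scores.items():
    print(name, sum(vals) / len(vals))
```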
- Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models [4.8869340671593475]
Political bias in prompt-based language models can affect their performance. We build on survey design principles to test a wide variety of input prompts, while taking prompt sensitivity into account. We compute political bias profiles across different prompt variations and find that measures of political bias are often unstable.
arXiv Detail & Related papers (2025-03-20T13:51:06Z)
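The instability claim can be illustrated with a toy respondent whose answer flips with surface phrasing; everything below (paraphrases, scoring rule, stub model) is invented for illustration:

```python
import statistics

# Hypothetical paraphrases of one survey item; the paper tests many variations.
PARAPHRASES = [
    "Do you agree that the government should regulate markets more?",
    "Should markets face stronger government regulation? Answer agree/disagree.",
    "State whether you agree: markets need more regulation.",
]

def to_score(answer: str) -> int:
    """Map a free-text reply onto a crude agree(+1)/disagree(-1) scale."""
    return 1 if "agree" in answer.lower() and "disagree" not in answer.lower() else -1

def stability(ask) -> tuple[float, float]:
    """Mean and spread of the measured position across prompt phrasings.
    A std comparable to the scale itself signals an unstable measure."""
    scores = [to_score(ask(p)) for p in PARAPHRASES]
    return statistics.mean(scores), statistics.stdev(scores)

# Stub respondent that flips with surface wording; swap in a real model call.
mean, spread = stability(lambda p: "agree" if "regulate " in p else "disagree")
print(f"mean={mean:.2f}, std={spread:.2f}")
```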
- The Impact of Persona-based Political Perspectives on Hateful Content Detection [4.04666623219944]
Politically diverse language models require computational resources often inaccessible to many researchers and organizations. Recent work has established that persona-based prompting can introduce political diversity in model outputs without additional training. We investigate whether such prompting strategies can achieve results comparable to political pretraining for downstream tasks.
arXiv Detail & Related papers (2025-02-01T09:53:17Z)
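A sketch of what persona-based prompting looks like operationally; the personas, labels, and stub model below are invented, not the paper's materials:

```python
# Invented personas; the paper asks whether persona prompts can stand in for
# politically diverse pretraining on downstream classification tasks.
PERSONAS = [
    "You are a staunch fiscal conservative from a rural town.",
    "You are a progressive urban community organizer.",
]

def persona_prompt(persona: str, text: str) -> str:
    """Prefix the classification instruction with an ideological persona."""
    return (f"{persona}\n"
            f"Label the following text as HATEFUL or NOT_HATEFUL.\n"
            f"Text: {text}\nLabel:")

def label_with_personas(ask, text: str) -> dict[str, str]:
    """Collect one label per persona; disagreement across personas is the
    'political diversity' signal compared against political pretraining."""
    return {p: ask(persona_prompt(p, text)).strip() for p in PERSONAS}

# Stub model; swap in a real LLM client.
print(label_with_personas(lambda prompt: "NOT_HATEFUL", "example post"))
```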
- Few-shot Policy (de)composition in Conversational Question Answering [54.259440408606515]
We propose a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting. We show that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies. We apply this approach to the popular policy compliance detection (PCD) and conversational machine reading benchmark ShARC and show competitive performance with no task-specific fine-tuning.
arXiv Detail & Related papers (2025-01-20T08:40:15Z)
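The symbolic half of such a pipeline is simple once sub-question truth values are available; a toy sketch follows, in which the policy, sub-questions, and answers are invented and the LLM extraction steps are stubbed out:

```python
# A toy policy decomposed into sub-questions; in the paper's pipeline, both
# the decomposition and the truth values come from an LLM reading the dialogue.
SUB_QUESTIONS = [
    "Is the applicant over 18?",
    "Does the applicant reside in the UK?",
]

def compliant(answers: dict[str, bool]) -> bool:
    """Explicit logic statement over sub-question truth values: here a plain
    conjunction; real policies may mix AND/OR/NOT clauses."""
    return all(answers[q] for q in SUB_QUESTIONS)

# Truth values as an LLM might assign them from conversational context.
answers = {"Is the applicant over 18?": True,
           "Does the applicant reside in the UK?": False}
print(compliant(answers))  # False -> and the failing clause is identifiable
```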
- Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas [5.237116285113809]
We map the political distribution of persona-based prompted large language models using the Political Compass Test (PCT). Our experiments reveal that synthetic personas predominantly cluster in the left-libertarian quadrant, with models demonstrating varying degrees of responsiveness when prompted with explicit ideological descriptors. While all models demonstrate significant shifts towards right-authoritarian positions, they exhibit more limited shifts towards left-libertarian positions, suggesting an asymmetric response to ideological manipulation that may reflect inherent biases in model training.
arXiv Detail & Related papers (2024-12-19T13:36:18Z)
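Mapping per-persona scores onto compass quadrants is a small post-processing step; a sketch with invented scores, assuming the PCT's convention of an economic axis (left negative) and a social axis (authoritarian positive):

```python
def quadrant(econ: float, social: float) -> str:
    """Name the compass quadrant for per-axis scores in roughly [-10, 10]."""
    horizontal = "left" if econ < 0 else "right"
    vertical = "libertarian" if social < 0 else "authoritarian"
    return f"{horizontal}-{vertical}"

persona_scores = {                      # invented numbers for illustration
    "retired army officer": (3.2, 4.1),
    "college student":      (-4.5, -2.8),
}
for persona, (econ, social) in persona_scores.items():
    print(persona, "->", quadrant(econ, social))
# The paper reports synthetic personas clustering in left-libertarian.
```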
- A Survey on Personalized Content Synthesis with Diffusion Models [53.79316736660402]
This paper introduces the general frameworks of PCS research, which can be categorized into test-time fine-tuning (TTF) and pre-trained adaptation (PTA) approaches. We explore specialized tasks within the field, such as object, face, and style personalization, while highlighting their unique challenges and innovations. Despite the promising progress, we also discuss ongoing challenges, including overfitting and the trade-off between subject fidelity and text alignment.
arXiv Detail & Related papers (2024-05-09T04:36:04Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
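The constrained-vs-unconstrained contrast amounts to asking the same item two ways; a toy sketch with a stub model, where the prompts and parsing are illustrative rather than the paper's code:

```python
# One example proposition; the paper evaluates full questionnaires.
ITEM = ("Abortion, when the woman's life is not threatened, "
        "should always be illegal.")

FORCED = f"{ITEM}\nAnswer with exactly one of: Agree, Disagree."
UNFORCED = f"What do you think about the following statement?\n{ITEM}"

def compare(ask) -> None:
    """Print the forced choice next to the free-form answer. The paper's
    finding: the two often diverge, so forced scores can be misleading."""
    print("forced:  ", ask(FORCED).strip())
    print("unforced:", ask(UNFORCED).strip()[:120])

# Stub model that complies when forced but hedges when unconstrained.
compare(lambda prompt: "Disagree" if "exactly one" in prompt
        else "As an AI, I don't hold personal opinions; people disagree about this...")
```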
- The Political Preferences of LLMs [0.0]
I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs.
Most conversational LLMs generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints.
I demonstrate that LLMs can be steered towards specific locations in the political spectrum through Supervised Fine-Tuning.
arXiv Detail & Related papers (2024-02-02T02:43:10Z)
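Steering via Supervised Fine-Tuning is ordinary causal-LM training on ideologically slanted text, followed by re-administering the tests. A minimal Hugging Face sketch; the corpus, base model (gpt2), and hyperparameters are placeholders, not the paper's setup:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Tiny invented corpus; the study fine-tunes on politically slanted text.
texts = ["Example opinion passage with a distinct political slant ...",
         "Another short passage from the same ideological corpus ..."]

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tok(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="steered-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tok, mlm=False),
)
trainer.train()
# Afterwards, re-administer the political orientation tests to `model`
# and compare its position against the base checkpoint.
```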
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
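Overstability has a simple probe: shuffle an essay's word order and see whether the predicted score moves. A sketch with a stub scorer (any real AES model could be plugged in for `score_essay`):

```python
import random

def bag_of_words_gap(score_essay, essay: str, trials: int = 10) -> float:
    """Average score change under word-order shuffles. If shuffling barely
    changes the score, the scorer behaves like a bag-of-words model
    (the 'overstability' the paper describes)."""
    words = essay.split()
    base = score_essay(essay)
    gaps = []
    for _ in range(trials):
        shuffled = words[:]
        random.shuffle(shuffled)
        gaps.append(abs(score_essay(" ".join(shuffled)) - base))
    return sum(gaps) / trials

# Stub scorer that only counts words; a real AES model would be plugged in.
essay = "The experiment shows that careful structure improves essay quality."
print(bag_of_words_gap(lambda text: len(text.split()) / 10.0, essay))  # 0.0
```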
- Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient [62.24615324523435]
This paper provides a statistical analysis of high-dimensional batch Reinforcement Learning (RL) using sparse linear function approximation.
When there is a large number of candidate features, our result sheds light on the fact that sparsity-aware methods can make batch RL more sample efficient.
arXiv Detail & Related papers (2020-11-08T16:48:02Z)
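The intuition can be reproduced on synthetic data: with L1 regularization, the samples needed scale with the few relevant features rather than all candidates. A sketch using scikit-learn's Lasso on invented batch-RL regression targets (e.g. fitted-Q backups):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic batch: 200 transitions, 50 candidate features, only 3 relevant.
n, d = 200, 50
phi = rng.normal(size=(n, d))                  # state-action features
true_w = np.zeros(d)
true_w[[2, 17, 33]] = [1.0, -2.0, 0.5]
q_targets = phi @ true_w + 0.1 * rng.normal(size=n)  # noisy regression targets

# L1 regularization recovers the sparse weights, which is why sparsity-aware
# methods need far fewer samples than the ambient feature count suggests.
model = Lasso(alpha=0.05).fit(phi, q_targets)
selected = np.flatnonzero(model.coef_)
print("selected features:", selected)  # expect roughly {2, 17, 33}
```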