Gender and Political Bias in Large Language Models: A Demonstration Platform
- URL: http://arxiv.org/abs/2509.16264v2
- Date: Tue, 23 Sep 2025 03:43:30 GMT
- Title: Gender and Political Bias in Large Language Models: A Demonstration Platform
- Authors: Wenjie Lin, Hange Liu, Xutao Mao, Yingying Zhuang, Jingwei Shi, Xudong Han, Tianyu Shi, Jinrui Yang,
- Abstract summary: ParlAI Vote is an interactive system for exploring European Parliament debates and votes. It includes rich demographic data such as gender, age, country, and political group. Users can browse debates, inspect linked speeches, and compare real voting outcomes with predictions from frontier LLMs.
- Score: 12.223144746389371
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present ParlAI Vote, an interactive system for exploring European Parliament debates and votes, and for testing LLMs on vote prediction and bias analysis. This platform connects debate topics, speeches, and roll-call outcomes, and includes rich demographic data such as gender, age, country, and political group. Users can browse debates, inspect linked speeches, compare real voting outcomes with predictions from frontier LLMs, and view error breakdowns by demographic group. Visualizing the EuroParlVote benchmark and its core tasks of gender classification and vote prediction, ParlAI Vote highlights systematic performance bias in state-of-the-art LLMs. The system unifies data, models, and visual analytics in a single interface, lowering the barrier for reproducing findings, auditing behavior, and running counterfactual scenarios. It supports research, education, and public engagement with legislative decision-making, while making clear both the strengths and the limitations of current LLMs in political analysis.
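The abstract mentions comparing real voting outcomes with LLM predictions and viewing error breakdowns by demographic group. A minimal sketch of that kind of breakdown is shown below. It is not the ParlAI Vote implementation: the record fields (`political_group`, `actual`, `predicted`), the function name `accuracy_by_group`, and the toy records are all hypothetical and were made up purely for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group value: fraction of records where predicted == actual}.

    `records` is a list of dicts; the field names are assumptions for this
    sketch, not the schema of the EuroParlVote benchmark.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["predicted"] == rec["actual"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Invented toy records with placeholder group labels; these are not real
# votes or real model predictions.
records = [
    {"political_group": "G1", "actual": "for", "predicted": "for"},
    {"political_group": "G1", "actual": "against", "predicted": "for"},
    {"political_group": "G2", "actual": "for", "predicted": "for"},
    {"political_group": "G2", "actual": "against", "predicted": "against"},
]

breakdown = accuracy_by_group(records, "political_group")
```

Grouping by a different field (for example a hypothetical `gender` or `country` key) only requires passing another `group_key`; a gap in accuracy between groups is the kind of signal such a breakdown is meant to surface.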
Related papers
- Uncovering Political Bias in Large Language Models using Parliamentary Voting Records [2.272052150526026]
This paper introduces a general methodology for constructing political bias benchmarks. We instantiate this methodology in three national case studies. We assess ideological tendencies and political entity bias in LLM behavior.
arXiv Detail & Related papers (2026-01-13T18:18:25Z)
- Latent Topic Synthesis: Leveraging LLMs for Electoral Ad Analysis [51.95395936342771]
We introduce an end-to-end framework for automatically generating an interpretable topic taxonomy from an unlabeled corpus. We apply this framework to a large corpus of Meta political ads from the month ahead of the 2024 U.S. Presidential election. Our approach uncovers latent discourse structures, synthesizes semantically rich topic labels, and annotates topics with moral framing dimensions.
arXiv Detail & Related papers (2025-10-16T20:30:20Z)
- Benchmarking Gender and Political Bias in Large Language Models [37.192287982246526]
We introduce EuroParlVote, a novel benchmark for evaluating large language models (LLMs) in politically sensitive contexts. It links European Parliament debate speeches to roll-call vote outcomes and includes rich demographic metadata for each Member of the European Parliament (MEP). Using EuroParlVote, we evaluate state-of-the-art LLMs on two tasks -- gender classification and vote prediction -- revealing consistent patterns of bias.
arXiv Detail & Related papers (2025-09-07T18:23:30Z)
- KOKKAI DOC: An LLM-driven framework for scaling parliamentary representatives [0.0]
This paper introduces an LLM-driven framework designed to accurately scale the political issue stances of parliamentary representatives. By leveraging advanced natural language processing techniques and large language models, the proposed methodology refines and enhances previous approaches. The framework incorporates three major innovations: (1) de-noising parliamentary speeches via summarization to produce cleaner, more consistent opinion embeddings; (2) automatic extraction of axes of political controversy from legislators' speech summaries; and (3) a diachronic analysis that tracks the evolution of party positions over time.
arXiv Detail & Related papers (2025-05-11T21:03:53Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks. Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- ElectionSim: Massive Population Election Simulation Powered by Large Language Model Driven Agents [70.17229548653852]
We introduce ElectionSim, an innovative election simulation framework based on large language models.
We present a million-level voter pool sampled from social media platforms to support accurate individual simulation.
We also introduce PPE, a poll-based presidential election benchmark to assess the performance of our framework under the U.S. presidential election scenario.
arXiv Detail & Related papers (2024-10-28T05:25:50Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Towards More Accurate US Presidential Election via Multi-step Reasoning with Large Language Models [12.582222782098587]
Election prediction poses unique challenges, such as limited voter-level data, rapidly changing political landscapes, and the need to model complex human behavior. We introduce a multi-step reasoning framework designed for political analysis. Our approach is validated on real-world data from the American National Election Studies (ANES) 2016 and 2020.
arXiv Detail & Related papers (2024-10-21T06:18:53Z)
- LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models [33.251235538905895]
This paper introduces a novel approach to evaluating presidential debate performances using large language models.
We propose a framework that analyzes candidates' "Policies, Persona, and Perspective" (3P) and how they resonate with the "Interests, Ideologies, and Identity" (3I) of four key audience groups.
Our method employs large language models to generate the LLM-POTUS Score, a quantitative measure of debate performance.
arXiv Detail & Related papers (2024-09-12T15:40:45Z)
- United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections [42.72938925647165]
"Synthetic samples" based on large language models (LLMs) have been argued to serve as efficient alternatives to surveys of humans. "Synthetic samples" might exhibit bias due to training data and fine-tuning processes being unrepresentative of diverse contexts. This study investigates if and under which conditions LLM-generated synthetic samples can be used for public opinion prediction.
arXiv Detail & Related papers (2024-08-29T16:01:06Z)
- Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
- Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.