EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding
- URL: http://arxiv.org/abs/2505.19558v1
- Date: Mon, 26 May 2025 06:21:16 GMT
- Title: EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding
- Authors: Zhaowei Zhang, Minghua Yi, Mengmeng Wang, Fengshuo Bai, Zilong Zheng, Yipeng Kang, Yaodong Yang
- Abstract summary: We introduce EuroCon, a novel benchmark constructed from 2,225 high-quality deliberation records of the European Parliament over 13 years. Specifically, EuroCon incorporates four factors to build each simulated parliament setting: specific political issues, political goals, participating parties, and power structures. We show that even state-of-the-art models remain undersatisfied with complex tasks like passing resolutions by a two-thirds majority.
- Score: 30.353539900597674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving political consensus is crucial yet challenging for the effective functioning of social governance. However, although frontier AI systems represented by large language models (LLMs) have developed rapidly in recent years, their capabilities on this scope are still understudied. In this paper, we introduce EuroCon, a novel benchmark constructed from 2,225 high-quality deliberation records of the European Parliament over 13 years, ranging from 2009 to 2022, to evaluate the ability of LLMs to reach political consensus among divergent party positions across diverse parliament settings. Specifically, EuroCon incorporates four factors to build each simulated parliament setting: specific political issues, political goals, participating parties, and power structures based on seat distribution. We also develop an evaluation framework for EuroCon to simulate real voting outcomes in different parliament settings, assessing whether LLM-generated resolutions meet predefined political goals. Our experimental results demonstrate that even state-of-the-art models remain undersatisfied with complex tasks like passing resolutions by a two-thirds majority and addressing security issues, while revealing some common strategies LLMs use to find consensus under different power structures, such as prioritizing the stance of the dominant party, highlighting EuroCon's promise as an effective platform for studying LLMs' ability to find political consensus.
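The abstract describes power structures based on seat distribution and political goals such as passing a resolution by a two-thirds majority. The following is a minimal sketch of how such a seat-weighted threshold check could look. It is not the paper's evaluation framework: the function name `resolution_passes`, the party labels, the seat counts, the assumption that each party votes as a bloc, and the choice of a strict simple majority versus an inclusive two-thirds threshold are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def resolution_passes(seats, in_favor, threshold, inclusive):
    """Return True if the seat-weighted share of 'yes' votes meets the threshold.

    seats:     dict mapping party name -> number of seats (hypothetical data)
    in_favor:  set of party names voting yes (assumes each party votes as a bloc)
    threshold: required share of all seats, as a Fraction
    inclusive: if True the share must be >= threshold, otherwise > threshold
    """
    total = sum(seats.values())
    if total <= 0:
        raise ValueError("seat distribution must contain at least one seat")
    yes = sum(n for party, n in seats.items() if party in in_favor)
    share = Fraction(yes, total)  # exact arithmetic avoids float rounding at 2/3
    return share >= threshold if inclusive else share > threshold


# Hypothetical three-party parliament with 100 seats.
seats = {"A": 40, "B": 35, "C": 25}

SIMPLE_MAJORITY = Fraction(1, 2)  # assumed strict: more than half of the seats
TWO_THIRDS = Fraction(2, 3)       # assumed inclusive: at least two thirds

# A and C together hold 65 seats: enough for a simple majority,
# but short of two thirds.
print(resolution_passes(seats, {"A", "C"}, SIMPLE_MAJORITY, inclusive=False))
print(resolution_passes(seats, {"A", "C"}, TWO_THIRDS, inclusive=True))

# A and B together hold 75 seats, which clears both thresholds.
print(resolution_passes(seats, {"A", "B"}, TWO_THIRDS, inclusive=True))
```

Under these assumptions, the same resolution text can satisfy one political goal and fail another depending only on which parties support it, which is why the higher threshold is the harder task.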
Related papers
- HatePRISM: Policies, Platforms, and Research Integration. Advancing NLP for Hate Speech Proactive Mitigation [67.69631485036665]
We conduct a comprehensive examination of hate speech regulations and strategies from three perspectives. Our findings reveal significant inconsistencies in hate speech definitions and moderation practices across jurisdictions. We suggest ideas and research directions for further exploration of a unified framework for automated hate speech moderation.
arXiv Detail & Related papers (2025-07-06T11:25:23Z)
- KOKKAI DOC: An LLM-driven framework for scaling parliamentary representatives [0.0]
This paper introduces an LLM-driven framework designed to accurately scale the political issue stances of parliamentary representatives. By leveraging advanced natural language processing techniques and large language models, the proposed methodology refines and enhances previous approaches. The framework incorporates three major innovations: (1) de-noising parliamentary speeches via summarization to produce cleaner, more consistent opinion embeddings; (2) automatic extraction of axes of political controversy from legislators' speech summaries; and (3) a diachronic analysis that tracks the evolution of party positions over time.
arXiv Detail & Related papers (2025-05-11T21:03:53Z)
- Benchmarking LLMs for Political Science: A United Nations Perspective [34.000742556609126]
Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process. We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches.
arXiv Detail & Related papers (2025-02-19T21:51:01Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks. Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- Aligning AI with Public Values: Deliberation and Decision-Making for Governing Multimodal LLMs in Political Video Analysis [48.14390493099495]
How AI models should deal with political topics has been discussed, but it remains challenging and requires better governance. This paper examines the governance of large language models through individual and collective deliberation, focusing on politically sensitive videos.
arXiv Detail & Related papers (2024-09-15T03:17:38Z)
- LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models [33.251235538905895]
This paper introduces a novel approach to evaluating presidential debate performances using large language models.
We propose a framework that analyzes candidates' "Policies, Persona, and Perspective" (3P) and how they resonate with the "Interests, Ideologies, and Identity" (3I) of four key audience groups.
Our method employs large language models to generate the LLM-POTUS Score, a quantitative measure of debate performance.
arXiv Detail & Related papers (2024-09-12T15:40:45Z)
- Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
- Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs [18.836470390824633]
We audit Llama Chat in the context of EU politics to analyze the model's political knowledge and its ability to reason in context.
We adapt, i.e., further fine-tune, Llama Chat on speeches of individual euro-parties from debates in the European Parliament to reevaluate its political leaning.
arXiv Detail & Related papers (2024-03-20T13:42:57Z)
- Modelling Political Coalition Negotiations Using LLM-based Agents [53.934372246390495]
We introduce coalition negotiations as a novel NLP task, and model it as a negotiation between large language model-based agents.
We introduce a multilingual dataset, POLCA, comprising manifestos of European political parties and coalition agreements over a number of elections in these countries.
We propose a hierarchical Markov decision process designed to simulate the process of coalition negotiation between political parties and predict the outcomes.
arXiv Detail & Related papers (2024-02-18T21:28:06Z)
- Generalizing Political Leaning Inference to Multi-Party Systems: Insights from the UK Political Landscape [10.798766768721741]
An ability to infer the political leaning of social media users can help in gathering opinion polls.
We release a dataset comprising users labelled by their political leaning as well as interactions with one another.
We show that interactions in the form of retweets between users can be a very powerful feature to enable political leaning inference.
arXiv Detail & Related papers (2023-12-04T09:02:17Z)
- P^3SUM: Preserving Author's Perspective in News Summarization with Diffusion Language Models [57.571395694391654]
We find that existing approaches alter the political opinions and stances of news articles in more than 50% of summaries.
We propose P^3SUM, a diffusion model-based summarization approach controlled by political perspective classifiers.
Experiments on three news summarization datasets demonstrate that P^3SUM outperforms state-of-the-art summarization systems.
arXiv Detail & Related papers (2023-11-16T10:14:28Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [98.18244218156492]
Large Language Models (LLMs) have significantly advanced natural language processing. As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework. This work introduces a novel competition-based benchmark framework to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- The ParlaSent Multilingual Training Dataset for Sentiment Identification in Parliamentary Proceedings [0.0]
The paper presents a new training dataset of sentences in 7 languages, manually annotated for sentiment.
The paper additionally introduces the first domain-specific multilingual transformer language model for political science applications.
arXiv Detail & Related papers (2023-09-18T14:01:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.