Benchmarking LLMs for Political Science: A United Nations Perspective
- URL: http://arxiv.org/abs/2502.14122v1
- Date: Wed, 19 Feb 2025 21:51:01 GMT
- Title: Benchmarking LLMs for Political Science: A United Nations Perspective
- Authors: Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu
- Abstract summary: Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored.
This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process.
We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches.
- Abstract: Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses that gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and political decisions can have far-reaching consequences. We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches. Using this dataset, we propose the United Nations Benchmark (UNBench), the first comprehensive benchmark designed to evaluate LLMs across four interconnected political science tasks: co-penholder judgment, representative voting simulation, draft adoption prediction, and representative statement generation. These tasks span the three stages of the UN decision-making process (drafting, voting, and discussing) and aim to assess LLMs' ability to understand and simulate political dynamics. Our experimental analysis demonstrates the potential and challenges of applying LLMs in this domain, providing insights into their strengths and limitations in political science. This work contributes to the growing intersection of AI and political science, opening new avenues for research and practical applications in global governance. The UNBench repository can be accessed at: https://github.com/yueqingliang1/UNBench.
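To make the task format concrete, here is a minimal Python sketch of how the representative voting simulation task could be posed to an LLM. The record fields, prompt wording, and function names are illustrative assumptions rather than the authors' implementation; the actual data format and evaluation code are in the UNBench repository linked above.

```python
# Hypothetical sketch of UNBench's representative voting simulation task.
# Field names and prompt wording are assumptions for illustration only;
# see https://github.com/yueqingliang1/UNBench for the real benchmark.
from dataclasses import dataclass


@dataclass
class DraftResolution:
    """Minimal stand-in for a UNSC draft resolution record."""
    title: str
    text: str
    year: int


def build_voting_prompt(draft: DraftResolution, country: str) -> str:
    """Frame the vote as a constrained role-playing prompt."""
    return (
        f"You are the representative of {country} on the UN Security "
        f"Council in {draft.year}. Given the draft resolution below, "
        "reply with exactly one of: FAVOUR, AGAINST, ABSTAIN.\n\n"
        f"Title: {draft.title}\n\nText:\n{draft.text}\n\nVote:"
    )


def parse_vote(completion: str) -> str:
    """Map a free-form model completion onto the three UNSC vote options."""
    normalized = completion.strip().upper()
    for option in ("FAVOUR", "AGAINST", "ABSTAIN"):
        if option in normalized:
            return option
    return "ABSTAIN"  # conservative fallback for unparseable output


if __name__ == "__main__":
    draft = DraftResolution(
        title="Example draft on a peacekeeping mandate renewal",
        text="The Security Council ... decides to extend the mandate ...",
        year=2015,
    )
    print(build_voting_prompt(draft, "France"))
```

Under this framing, agreement between the parsed votes and the recorded UNSC votes would be a natural task metric, though the paper's own evaluation protocol should be taken from the repository.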
Related papers
- Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach [14.32199539218175]
This paper proposes an adaptable Large Language Model (LLM)-driven online testing framework to explore critical and diverse testing scenarios.
Specifically, we design a "generate-test-feedback" pipeline with templated prompt engineering to harness the world knowledge and reasoning abilities of LLMs.
arXiv Detail & Related papers (2024-12-09T17:27:04Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks.
Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- Large Language Models in Politics and Democracy: A Comprehensive Survey [0.0]
Large language models (LLMs) offer potential across various domains, including policymaking, political communication, analysis, and governance.
LLMs offer opportunities to enhance efficiency, inclusivity, and decision-making in political processes.
They also present challenges related to bias, transparency, and accountability.
arXiv Detail & Related papers (2024-12-01T15:23:34Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models [33.251235538905895]
This paper introduces a novel approach to evaluating presidential debate performances using large language models.
We propose a framework that analyzes candidates' "Policies, Persona, and Perspective" (3P) and how they resonate with the "Interests, Ideologies, and Identity" (3I) of four key audience groups.
Our method employs large language models to generate the LLM-POTUS Score, a quantitative measure of debate performance.
arXiv Detail & Related papers (2024-09-12T15:40:45Z)
- Large language models can consistently generate high-quality content for election disinformation operations [2.98293101034582]
Large language models have raised concerns about their potential use in generating compelling election disinformation at scale.
This study presents a two-part investigation into the capabilities of LLMs to automate stages of an election disinformation operation.
arXiv Detail & Related papers (2024-08-13T08:45:34Z)
- A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics of LLM applications in these fields, pointing out existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z)
- Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions? [59.0123596591807]
We benchmark the ability of Large Language Models (LLMs) in persona-driven decision-making.
We investigate whether LLMs can predict characters' decisions given the preceding story in high-quality novels.
The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains.
arXiv Detail & Related papers (2024-04-18T12:40:59Z)
- ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models [78.08792285698853]
We present a large-scale empirical study on general language ability evaluation of pretrained language models (ElitePLM).
Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs in downstream tasks is usually sensitive to the data size and distribution; and (3) PLMs have excellent transferability between similar tasks.
arXiv Detail & Related papers (2022-05-03T14:18:10Z)