How Large Language Models play humans in online conversations: a simulated study of the 2016 US politics on Reddit
- URL: http://arxiv.org/abs/2506.21620v1
- Date: Mon, 23 Jun 2025 08:54:32 GMT
- Title: How Large Language Models play humans in online conversations: a simulated study of the 2016 US politics on Reddit
- Authors: Daniele Cirulli, Giulio Cimini, Giovanni Palermo
- Abstract summary: Large Language Models (LLMs) have recently emerged as powerful tools for natural language generation. We evaluate the performance of LLMs in replicating user-generated content within a real-world, divisive scenario: Reddit conversations during the 2016 US Presidential election. We find that GPT-4 is able to produce realistic comments, either in favor of or against the candidate supported by the community, yet it tends to create consensus more easily than dissent.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have recently emerged as powerful tools for natural language generation, with applications spanning from content creation to social simulations. Their ability to mimic human interactions raises both opportunities and concerns, particularly in the context of politically relevant online discussions. In this study, we evaluate the performance of LLMs in replicating user-generated content within a real-world, divisive scenario: Reddit conversations during the 2016 US Presidential election. In particular, we conduct three different experiments, asking GPT-4 to generate comments by impersonating either real or artificial partisan users. We analyze the generated comments in terms of political alignment, sentiment, and linguistic features, comparing them against real user contributions and benchmarking against a null model. We find that GPT-4 is able to produce realistic comments, either in favor of or against the candidate supported by the community, yet it tends to create consensus more easily than dissent. In addition, we show that real and artificial comments are well separated in a semantically embedded space, although they are indistinguishable by manual inspection. Our findings provide insight into the potential use of LLMs to infiltrate online discussions, influence political debate, and shape political narratives, with broader implications for AI-driven discourse manipulation.
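To make the experimental setup concrete, the sketch below illustrates two steps of the kind described in the abstract: prompting GPT-4 to reply in a Reddit thread while impersonating a partisan user, and testing whether real and generated comments separate in a semantic embedding space. This is a hypothetical reconstruction, not the authors' code: the prompt wording, the embedding model (`all-MiniLM-L6-v2`), and the logistic-regression separability probe are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the authors' pipeline):
# (1) generate a partisan comment with GPT-4;
# (2) check how well real vs. generated comments separate in embedding space.
import numpy as np
from openai import OpenAI
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate_partisan_comment(thread_context: str, stance: str) -> str:
    """Ask GPT-4 for a reply while impersonating a partisan Reddit user."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    f"You are a Reddit user who strongly {stance} the candidate "
                    "supported by this community. Reply to the thread below in "
                    "a casual Reddit tone."
                ),
            },
            {"role": "user", "content": thread_context},
        ],
        temperature=1.0,
    )
    return response.choices[0].message.content


def embedding_separability(real: list[str], generated: list[str]) -> float:
    """Embed both comment sets and report the held-out accuracy of a linear
    classifier separating them (0.5 means indistinguishable)."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    X = encoder.encode(real + generated)
    y = np.array([0] * len(real) + [1] * len(generated))
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))
```

In the paper's setting, a held-out accuracy well above 0.5 would mirror the reported finding: real and artificial comments occupy separable regions of the embedding space even though humans cannot reliably tell them apart by manual inspection.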
Related papers
- LLM-Based Bot Broadens the Range of Arguments in Online Discussions, Even When Transparently Disclosed as AI [5.393664305233901]
This study examines whether an LLM-based bot can widen the scope of perspectives expressed by participants in online discussions. We evaluate the impact of a bot that actively monitors discussions, identifies missing arguments, and introduces them into the conversation. The results indicate that our bot significantly expands the range of arguments, as measured by both objective and subjective metrics.
arXiv Detail & Related papers (2025-06-20T15:24:31Z)
- Passing the Turing Test in Political Discourse: Fine-Tuning LLMs to Mimic Polarized Social Media Comments [0.0]
This study explores the extent to which fine-tuned large language models (LLMs) can replicate and amplify polarizing discourse. Using a curated dataset of politically charged discussions extracted from Reddit, we fine-tune an open-source LLM to produce context-aware and ideologically aligned responses. The results indicate that, when trained on partisan data, LLMs are capable of producing highly plausible and provocative comments, often indistinguishable from those written by humans.
arXiv Detail & Related papers (2025-06-17T15:41:26Z)
- Incivility and Rigidity: The Risks of Fine-Tuning LLMs for Political Argumentation [11.255011967393838]
Incivility prevalent on platforms like Twitter (now X) and Reddit poses a challenge for developing AI systems for political argumentation. In this study, we report experiments with GPT-3.5 Turbo, fine-tuned on two contrasting datasets of political discussions. We show that models fine-tuned on Reddit data produce safer but rhetorically rigid arguments, while cross-platform fine-tuning amplifies toxicity.
arXiv Detail & Related papers (2024-11-25T15:28:11Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models [33.251235538905895]
This paper introduces a novel approach to evaluating presidential debate performances using large language models.
We propose a framework that analyzes candidates' "Policies, Persona, and Perspective" (3P) and how they resonate with the "Interests, Ideologies, and Identity" (3I) of four key audience groups.
Our method employs large language models to generate the LLM-POTUS Score, a quantitative measure of debate performance.
arXiv Detail & Related papers (2024-09-12T15:40:45Z)
- Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
- How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO [55.25989137825992]
We introduce ECHO, an evaluative framework inspired by the Turing test.
This framework engages the acquaintances of the target individuals to distinguish between human and machine-generated responses.
We evaluate three role-playing LLMs using ECHO, with GPT-3.5 and GPT-4 serving as foundational models.
arXiv Detail & Related papers (2024-04-22T08:00:51Z)
- Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Demonstrations of the Potential of AI-based Political Issue Polling [0.0]
We develop a prompt engineering methodology for eliciting human-like survey responses from ChatGPT.
We execute large-scale experiments, querying for thousands of simulated responses at a cost far lower than human surveys.
We find that ChatGPT is effective at anticipating both the mean level and distribution of public opinion on a variety of policy issues, but less successful at anticipating demographic-level differences (a minimal sketch of this style of simulated polling appears after this list).
arXiv Detail & Related papers (2023-07-10T12:17:15Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations [52.992875653864076]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed. Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential for developing more powerful conversational recommender systems (CRSs).
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose iEvaLM, an interactive evaluation approach based on LLMs that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z)
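The AI-based political issue polling entry above lends itself to a similar illustration. Below is a minimal, hypothetical sketch of eliciting simulated survey responses from a chat model across synthetic demographic personas; the persona fields, question wording, 1-5 scale, and response parsing are assumptions for illustration, not the paper's protocol.

```python
# Hypothetical sketch of LLM-based issue polling: ask a chat model to answer a
# survey question in the voice of a synthetic respondent, many times over.
# Personas, question wording, and the 1-5 scale are illustrative assumptions.
import itertools
import re

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

AGES = ["25", "45", "65"]
PARTIES = ["Democrat", "Republican", "independent"]
QUESTION = (
    "On a scale of 1 (strongly oppose) to 5 (strongly support), how do you "
    "feel about increasing the federal minimum wage? Answer with one number."
)


def poll_once(age: str, party: str) -> int | None:
    """Elicit one simulated survey response for a synthetic respondent."""
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": f"Answer survey questions as a {age}-year-old {party} voter.",
            },
            {"role": "user", "content": QUESTION},
        ],
        temperature=1.0,  # sample, so repeated calls trace out a distribution
    ).choices[0].message.content
    match = re.search(r"[1-5]", reply)  # tolerate chatty answers
    return int(match.group()) if match else None


# Crossing personas and repeating calls yields per-demographic response
# distributions that can be compared against real polling data.
samples = {
    (age, party): [poll_once(age, party) for _ in range(30)]
    for age, party in itertools.product(AGES, PARTIES)
}
```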
This list is automatically generated from the titles and abstracts of the papers on this site.