Operational Validation of Large-Language-Model Agent Social Simulation: Evidence from Voat v/technology
- URL: http://arxiv.org/abs/2508.21740v1
- Date: Fri, 29 Aug 2025 16:06:27 GMT
- Title: Operational Validation of Large-Language-Model Agent Social Simulation: Evidence from Voat v/technology
- Authors: Aleksandar Tomašević, Darja Cvetković, Sara Major, Slobodan Maletić, Miroslav Anđelković, Ana Vranić, Boris Stupovski, Dušan Vudragović, Aleksandar Bogojević, Marija Mitrović Dankulov
- Abstract summary: We build a technology community simulation modeled on Voat, a Reddit-like alt-right news aggregator and discussion platform active from 2014 to 2020. Using the YSocial framework, we seed the simulation with a fixed catalog of technology links sampled from Voat's shared URLs. Agents generate posts, replies, and reactions under platform rules for link and text submissions, threaded replies and daily activity cycles. Results indicate familiar online regularities: similar activity rhythms, heavy-tailed participation, sparse low-clustering interaction networks, core-periphery structure, topical alignment with Voat, and elevated toxicity.
- Score: 59.63189507373199
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) enable generative social simulations that can capture culturally informed, norm-guided interaction on online social platforms. We build a technology community simulation modeled on Voat, a Reddit-like alt-right news aggregator and discussion platform active from 2014 to 2020. Using the YSocial framework, we seed the simulation with a fixed catalog of technology links sampled from Voat's shared URLs (covering 30+ domains) and calibrate parameters to Voat's v/technology using samples from the MADOC dataset. Agents use a base, uncensored model (Dolphin 3.0, based on Llama 3.1 8B) and concise personas (demographics, political leaning, interests, education, toxicity propensity) to generate posts, replies, and reactions under platform rules for link and text submissions, threaded replies and daily activity cycles. We run a 30-day simulation and evaluate operational validity by comparing distributions and structures with matched Voat data: activity patterns, interaction networks, toxicity, and topic coverage. Results indicate familiar online regularities: similar activity rhythms, heavy-tailed participation, sparse low-clustering interaction networks, core-periphery structure, topical alignment with Voat, and elevated toxicity. Limitations of the current study include the stateless agent design and evaluation based on a single 30-day run, which constrains external validity and variance estimates. The simulation generates realistic discussions, often featuring toxic language, primarily centered on technology topics such as Big Tech and AI. This approach offers a valuable method for examining toxicity dynamics and testing moderation strategies within a controlled environment.
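The abstract reports "sparse low-clustering interaction networks" as one of the structural checks. A minimal sketch of how such a check could be computed is shown below. It is not the authors' pipeline: the function names, the undirected treatment of replies, and the toy reply list are all assumptions made for illustration, and the data is not taken from the paper or from Voat.

```python
from collections import defaultdict
from itertools import combinations


def build_reply_graph(replies):
    """Build an undirected adjacency map from (replier, parent_author) pairs."""
    adj = defaultdict(set)
    for src, dst in replies:
        if src != dst:  # ignore self-replies
            adj[src].add(dst)
            adj[dst].add(src)
    return adj


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy data (assumption): each pair is (replier, author replied to).
replies = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
graph = build_reply_graph(replies)
print(density(graph), average_clustering(graph))
```

Computing the same two statistics on the simulated and the observed reply graphs gives one simple basis for a structural comparison; a real analysis would likely also compare degree distributions and core-periphery structure.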
Related papers
- MASim: Multilingual Agent-Based Simulation for Social Science [68.04129327237963]
Multi-agent role-playing has recently shown promise for studying social behavior with language agents.
Existing simulations are mostly monolingual and fail to model cross-lingual interaction.
We introduce MASim, the first multilingual agent-based simulation framework.
arXiv Detail & Related papers (2025-12-08T06:12:48Z)
- FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI [24.545163508739943]
FreeAskWorld is an interactive simulation framework that integrates large language models for high-level behavior planning and semantically grounded interaction.
Our framework supports scalable, realistic human-agent simulations and includes a modular data generation pipeline tailored for diverse embodied tasks.
We present and publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six diverse task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data.
arXiv Detail & Related papers (2025-11-17T15:58:46Z)
- Simulating and Experimenting with Social Media Mobilization Using LLM Agents [7.262048441360133]
Building on the landmark 61-million-person Facebook experiment (Bond et al., 2012), we develop an agent-based simulation framework.
We integrate real U.S. Census demographic distributions, authentic Twitter network topology, and heterogeneous large language model (LLM) agents to examine the effect of mobilization messages on voter turnout.
arXiv Detail & Related papers (2025-10-30T13:43:28Z)
- See, Think, Act: Online Shopper Behavior Simulation with VLM Agents [58.92444959954643]
This paper investigates the integration of visual information, specifically webpage screenshots, into behavior simulation via VLMs.
We employ SFT for joint action prediction and rationale generation, conditioning on the full interaction context.
To further enhance reasoning capabilities, we integrate RL with a hierarchical reward structure, scaled by a difficulty-aware factor.
arXiv Detail & Related papers (2025-10-22T05:07:14Z)
- Don't Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism [1.734165485480267]
We focus on replicating the behavior of social network users with the use of Large Language Models.
We empirically test different approaches to imitate user behavior on X in English and German.
Our findings suggest that social simulations should be validated by their empirical realism measured in the setting in which the simulation components were fitted.
arXiv Detail & Related papers (2025-06-27T07:32:16Z)
- SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users [70.02370111025617]
We introduce SocioVerse, an agent-driven world model for social simulation.
Our framework features four powerful alignment components and a user pool of 10 million real individuals.
Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness.
arXiv Detail & Related papers (2025-04-14T12:12:52Z)
- MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations [16.780177153401628]
We present a novel, open-source social network simulation framework, MOSAIC, where generative language agents predict user behaviors such as liking, sharing, and flagging content.
This simulation combines LLM agents with a directed social graph to analyze emergent deception behaviors and gain a better understanding of how users determine the veracity of online social content.
arXiv Detail & Related papers (2025-04-10T15:06:54Z)
- Towards Online Multi-Modal Social Interaction Understanding [36.37278022436327]
We propose an online MMSI setting, where the model must resolve MMSI tasks using only historical information, such as recorded dialogues and video streams.
We develop a novel framework, named Online-MMSI-VLM, that leverages two complementary strategies: multi-party conversation forecasting and social-aware visual prompting.
Our method achieves state-of-the-art performance and significantly outperforms baseline models, indicating its effectiveness on Online-MMSI.
arXiv Detail & Related papers (2025-03-25T17:17:19Z)
- Political Bias in LLMs: Unaligned Moral Values in Agent-centric Simulations [0.0]
We investigate how personalized language models align with human responses on the Moral Foundation Theory Questionnaire.
We adapt open-source generative language models to different political personas and repeatedly survey these models to generate synthetic data sets.
Our analysis reveals that models produce inconsistent results across multiple repetitions, yielding high response variance.
arXiv Detail & Related papers (2024-08-21T08:20:41Z)
- Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation [43.46328146533669]
Social media has emerged as a cornerstone of social movements, wielding significant influence in driving societal change.
We introduce a hybrid framework HiSim for social media user simulation, wherein users are categorized into two types.
We construct a Twitter-like environment to replicate their response dynamics following trigger events.
arXiv Detail & Related papers (2024-02-26T06:28:54Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop a large scale interaction-centric benchmark TrajNet++, a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.