Related papers: Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky

URL: http://arxiv.org/abs/2511.17241v1
Date: Fri, 21 Nov 2025 13:40:14 GMT
Title: Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky
Authors: Benjamin White, Anastasia Shimorina
Abstract summary: This paper presents a hybrid methodology for social media user behavior prediction. It addresses both frequent and infrequent actions across a diverse action vocabulary. Our approach achieved first place in the SocialSim: Social-Media Based Personas challenge.
Score: 0.7305019142196582
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Understanding and predicting user behavior on social media platforms is crucial for content recommendation and platform design. While existing approaches focus primarily on common actions like retweeting and liking, the prediction of rare but significant behaviors remains largely unexplored. This paper presents a hybrid methodology for social media user behavior prediction that addresses both frequent and infrequent actions across a diverse action vocabulary. We evaluate our approach on a large-scale Bluesky dataset containing 6.4 million conversation threads spanning 12 distinct user actions across 25 persona clusters. Our methodology combines four complementary approaches: (i) a lookup database system based on historical response patterns; (ii) persona-specific LightGBM models with engineered temporal and semantic features for common actions; (iii) a specialized hybrid neural architecture fusing textual and temporal representations for rare action classification; and (iv) generation of text replies. Our persona-specific models achieve an average macro F1-score of 0.64 for common action prediction, while our rare action classifier achieves 0.56 macro F1-score across 10 rare actions. These results demonstrate that effective social media behavior prediction requires tailored modeling strategies recognizing fundamental differences between action types. Our approach achieved first place in the SocialSim: Social-Media Based Personas challenge organized at the Social Simulation with LLMs workshop at COLM 2025.
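The abstract reports macro F1-scores for both common and rare actions. As a rough illustration of why macro averaging is sensitive to rare classes, the sketch below computes macro F1 in plain Python. It is not the authors' evaluation code; the function name `macro_f1` and the action labels "like" and "block" are hypothetical and chosen only for this example.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true positives contributes an F1 of 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 8 common actions and 2 rare ones.
y_true = ["like"] * 8 + ["block"] * 2
# A classifier that always predicts the common action.
y_pred = ["like"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} macro_f1={score:.2f}")
```

In this toy case the always-common classifier is right on 8 of 10 examples, yet its macro F1 is much lower because the rare class scores zero. That gap is one reason a macro-averaged metric rewards handling rare actions well rather than ignoring them.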

Related papers

HumanLLM: Towards Personalized Understanding and Simulation of Human Nature [72.55730315685837]
HumanLLM is a foundation model designed for personalized understanding and simulation of individuals. We first construct the Cognitive Genome, a large-scale corpus curated from real-world user data on platforms like Reddit, Twitter, Blogger, and Amazon. We then formulate diverse learning tasks and perform supervised fine-tuning to empower the model to predict a wide range of individualized human behaviors, thoughts, and experiences.
arXiv Detail & Related papers (2026-01-22T09:27:27Z)
$\texttt{BluePrint}$: A Social Media User Dataset for LLM Persona Evaluation and Training [8.563967699751684]
Large language models (LLMs) offer promising capabilities for social media dynamics at scale. We introduce S, a framework for constructing behaviorally-grounded social media suitable for training agent models. We release BluePrint, a large-scale dataset built from public Bluesky data focused on political discourse.
arXiv Detail & Related papers (2025-09-27T06:02:38Z)
Population-Aligned Persona Generation for LLM-based Social Simulation [58.84363795421489]
We propose a systematic framework for synthesizing high-quality, population-aligned persona sets for social simulation. Our approach begins by leveraging large language models to generate narrative personas from long-term social media data. To address the needs of specific simulation contexts, we introduce a task-specific module that adapts the globally aligned persona set to targeted subpopulations.
arXiv Detail & Related papers (2025-09-12T10:43:47Z)
MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework [53.82097200295448]
Mean-Field LLM (MF-LLM) is the first to incorporate mean field theory into social simulation. MF-LLM models bidirectional interactions between individuals and the population through an iterative process. IB-Tune is a novel fine-tuning method inspired by the Information Bottleneck principle.
arXiv Detail & Related papers (2025-04-30T12:41:51Z)
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users [70.02370111025617]
We introduce SocioVerse, an agent-driven world model for social simulation. Our framework features four powerful alignment components and a user pool of 10 million real individuals. Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness.
arXiv Detail & Related papers (2025-04-14T12:12:52Z)
Social Processes: Probabilistic Meta-learning for Adaptive Multiparty Interaction Forecasting [3.9134031118910264]
We introduce Social Process (SP) models, which predict a distribution over future multimodal cues jointly for all group members. We also analyze the generalization capabilities of SP models in both their outputs and latent spaces through the use of realistic synthetic datasets.
arXiv Detail & Related papers (2025-01-03T17:34:53Z)
Social Media Use is Predictable from App Sequences: Using LSTM and Transformer Neural Networks to Model Habitual Behavior [0.11086185608421924]
The present paper introduces a novel approach to studying social media habits through predictive modeling of sequential smartphone user behaviors. We show that (i) social media use is predictable at the within and between-person level and that (ii) there are robust individual differences in the predictability of social media use.
arXiv Detail & Related papers (2024-04-20T16:36:28Z)
Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation [43.46328146533669]
Social media has emerged as a cornerstone of social movements, wielding significant influence in driving societal change. We introduce a hybrid framework HiSim for social media user simulation, wherein users are categorized into two types. We construct a Twitter-like environment to replicate their response dynamics following trigger events.
arXiv Detail & Related papers (2024-02-26T06:28:54Z)
Detecting value-expressive text posts in Russian social media [0.0]
We aimed to find a model that can accurately detect value-expressive posts in Russian social media VKontakte. A training dataset of 5,035 posts was annotated by three experts, 304 crowd-workers and ChatGPT. ChatGPT was more consistent but struggled with spam detection.
arXiv Detail & Related papers (2023-12-14T14:18:27Z)
Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics. Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
Incorporating Heterogeneous User Behaviors and Social Influences for Predictive Analysis [32.31161268928372]
We aim to incorporate heterogeneous user behaviors and social influences for behavior predictions. This paper proposes a variant of Long Short-Term Memory (LSTM) which can consider context while modeling a behavior sequence. A residual learning-based decoder is designed to automatically construct multiple high-order cross features based on social behavior representation.
arXiv Detail & Related papers (2022-07-24T17:05:37Z)
Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions. We propose two knowledge-based data-driven methods to effectively capture these social interactions. We develop a large scale interaction-centric benchmark TrajNet++, a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.