Limits to Predicting Online Speech Using Large Language Models
- URL: http://arxiv.org/abs/2407.12850v2
- Date: Mon, 02 Dec 2024 15:46:35 GMT
- Title: Limits to Predicting Online Speech Using Large Language Models
- Authors: Mina Remeli, Moritz Hardt, Robert C. Williamson
- Abstract summary: Recent theoretical results suggest that posts from a user's social circle are as predictive of the user's future posts as the user's own past posts. We define predictability as a measure of the model's uncertainty, i.e., its negative log-likelihood on future tokens given context. Across four large language models ranging in size from 1.5 billion to 70 billion parameters, we find that predicting a user's posts from their peers' posts performs poorly.
- Score: 20.215414802169967
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study the predictability of online speech on social media, and whether predictability improves with information outside a user's own posts. Recent theoretical results suggest that posts from a user's social circle are as predictive of the user's future posts as the user's own past posts. Motivated by the success of large language models, we empirically test this hypothesis. We define predictability as a measure of the model's uncertainty, i.e., its negative log-likelihood on future tokens given context. As the basis of our study, we collect 10M tweets for "tweet-tuning" base models and a further 6.25M posts from more than five thousand X (previously Twitter) users and their peers. Across four large language models ranging in size from 1.5 billion to 70 billion parameters, we find that predicting a user's posts from their peers' posts performs poorly. Moreover, the value of the user's own posts for prediction is consistently higher than that of their peers'. We extend our investigation with a detailed analysis of what is learned in-context and of the robustness of our findings. From context, base models learn to correctly predict @-mentions and hashtags. Moreover, our results replicate if, instead of prompting the model with additional context, we finetune on it. Across the board, we find that predicting the posts of individual users remains hard.
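The predictability metric the abstract describes, negative log-likelihood on future tokens given context, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the example probabilities are hypothetical, and in the paper the per-token probabilities would come from a large language model conditioned on the chosen context (the user's own posts or their peers' posts).

```python
import math

def negative_log_likelihood(token_probs):
    """Average negative log-likelihood (in nats) over future tokens.

    token_probs: the probability the model assigned to each realized
    future token, given its context. Lower values mean the text was
    more predictable to the model.
    """
    return sum(-math.log(p) for p in token_probs) / len(token_probs)

# A perfectly confident model (p = 1 on every token) scores 0;
# uniform guessing over a 4-symbol vocabulary scores log(4) ≈ 1.386.
print(negative_log_likelihood([1.0, 1.0]))
print(negative_log_likelihood([0.25, 0.25]))
```

Under this metric, comparing the NLL obtained with a user's own posts as context against the NLL obtained with their peers' posts as context is what operationalizes the paper's hypothesis test.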
Related papers
- Will I Get Hate Speech? Predicting the Volume of Abusive Replies before Posting in Social Media [0.0]
We look at four types of features, namely text, text metadata, tweet metadata, and account features.
This helps us understand the extent to which the user or the content helps predict the number of abusive replies.
One of our objectives is to determine the extent to which the volume of abusive replies a tweet receives is driven by the content of the tweet or by the identity of the user posting it.
arXiv Detail & Related papers (2025-03-04T21:04:21Z) - Eliciting Uncertainty in Chain-of-Thought to Mitigate Bias against Forecasting Harmful User Behaviors [29.892041865029803]
Conversation forecasting tasks a model with predicting the outcome of an unfolding conversation.
It can be applied in social media moderation to predict harmful user behaviors before they occur.
This paper explores to what extent model uncertainty can be used as a tool to mitigate potential biases.
arXiv Detail & Related papers (2024-10-17T15:07:53Z) - Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering [8.20929362102942]
Author profiling is the task of inferring characteristics about individuals by analyzing content they share.
We propose a new method for author profiling which aims at distinguishing relevant from irrelevant content first, followed by the actual user profiling only with relevant data.
We evaluate our method for Big Five personality trait prediction on two Twitter corpora.
arXiv Detail & Related papers (2024-09-06T08:43:10Z) - Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy.
As predictions shape collective outcomes, social welfare arises naturally as a metric of concern.
We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z) - Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"? [15.773775387121097]
We show that calibration of large language models typically improves with model size.
We find that temperature-scaling probabilities lead to a systematically better fit to reading times.
arXiv Detail & Related papers (2023-11-15T19:34:06Z) - Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existing social network, along with graph-based propagation to capture social dynamics.
Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z) - Humans and language models diverge when predicting repeating text [52.03471802608112]
We present a scenario in which the performance of humans and LMs diverges.
Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory begins to play a role.
We hope that this scenario will spur future work in bringing LMs closer to human behavior.
arXiv Detail & Related papers (2023-10-10T08:24:28Z) - Context-Based Tweet Engagement Prediction [0.0]
This thesis investigates how well context alone may be used to predict tweet engagement likelihood.
We employed the Spark engine on TU Wien's Little Big Data Cluster to create scalable data preprocessing, feature engineering, feature selection, and machine learning pipelines.
We also found that factors such as the prediction algorithm, training dataset size, training dataset sampling method, and feature selection significantly affect the results.
arXiv Detail & Related papers (2023-09-28T08:36:57Z) - Measuring the Effect of Influential Messages on Varying Personas [67.1149173905004]
We present a new task, Response Forecasting on Personas for News Media, to estimate the response a persona might have upon seeing a news message.
The proposed task not only introduces personalization in the modeling but also predicts the sentiment polarity and intensity of each response.
This enables more accurate and comprehensive inference on the mental state of the persona.
arXiv Detail & Related papers (2023-05-25T21:01:00Z) - Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z) - You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction [52.442129609979794]
Recent deep learning approaches for trajectory prediction show promising performance.
It remains unclear which features such black-box models actually learn to use for making predictions.
This paper proposes a procedure that quantifies the contributions of different cues to model performance.
arXiv Detail & Related papers (2021-10-11T14:24:15Z) - Predicting the Popularity of Reddit Posts with AI [0.30458514384586405]
This study aims to develop a machine learning model capable of accurately predicting the popularity of a Reddit post.
Specifically, the model predicts the number of upvotes a post will receive based on its textual content.
I collected Reddit post data from an online data set and analyzed the model's performance when trained on a single subreddit and a collection of subreddits.
arXiv Detail & Related papers (2021-06-08T23:30:25Z) - Adversarial Generative Grammars for Human Activity Prediction [141.43526239537502]
We propose an adversarial generative grammar model for future prediction.
Our grammar is designed so that it can learn production rules from the data distribution.
Being able to select multiple production rules during inference leads to different predicted outcomes.
arXiv Detail & Related papers (2020-08-11T17:47:53Z) - Explainable Depression Detection with Multi-Modalities Using a Hybrid Deep Learning Model on Social Media [21.619614611039257]
We propose MDHAN, an interpretable Multi-Modal Depression Detection model with a Hierarchical Attention Network.
Our model helps improve predictive performance when detecting depression in users who are posting messages publicly on social media.
arXiv Detail & Related papers (2020-07-03T12:11:22Z) - Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.