Related papers: Look-Ahead Reasoning on Learning Platforms

Look-Ahead Reasoning on Learning Platforms

URL: http://arxiv.org/abs/2511.14745v1
Date: Tue, 18 Nov 2025 18:45:32 GMT
Title: Look-Ahead Reasoning on Learning Platforms
Authors: Haiqing Zhu, Tijana Zrnic, Celestine Mendler-Dünner
Abstract summary: We show that look-ahead reasoning takes into account that user actions are coupled, and -- at scale -- impact future predictions. We first formalize level-$k$ thinking, a concept from behavioral economics, where users aim to outsmart their peers by looking one step ahead. Then, we focus on collective reasoning, where users take coordinated actions by optimizing through their joint impact on the model.
Score: 24.349139933086132
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: On many learning platforms, the optimization criteria guiding model training reflect the priorities of the designer rather than those of the individuals they affect. Consequently, users may act strategically to obtain more favorable outcomes, effectively contesting the platform's predictions. While past work has studied strategic user behavior on learning platforms, the focus has largely been on strategic responses to a deployed model, without considering the behavior of other users. In contrast, look-ahead reasoning takes into account that user actions are coupled, and -- at scale -- impact future predictions. Within this framework, we first formalize level-$k$ thinking, a concept from behavioral economics, where users aim to outsmart their peers by looking one step ahead. We show that, while convergence to an equilibrium is accelerated, the equilibrium remains the same, providing no benefit of higher-level reasoning for individuals in the long run. Then, we focus on collective reasoning, where users take coordinated actions by optimizing through their joint impact on the model. By contrasting collective with selfish behavior, we characterize the benefits and limits of coordination; a new notion of alignment between the learner's and the users' utilities emerges as a key concept. We discuss connections to several related mathematical frameworks, including strategic classification, performative prediction, and algorithmic collective action.
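The abstract's claim about level-$k$ thinking (faster convergence, same equilibrium) can be illustrated with a toy sketch. This is not the paper's model: it assumes a hypothetical one-dimensional retraining map `retrain(theta) = base + eps * theta` with `0 < eps < 1`, and it assumes that level-$k$ users respond to the model they expect after `k - 1` further retraining rounds, so that one round of interaction applies the map `k` times. The names `retrain`, `level_k_step`, and `run`, and the values of `base` and `eps`, are illustrative choices only.

```python
def retrain(theta, base=1.0, eps=0.5):
    """Hypothetical retraining map (an assumption, not from the paper):
    the model fitted after users best-respond to the model `theta`."""
    return base + eps * theta


def level_k_step(theta, k):
    """One round of interaction when all users reason at level k.

    Toy assumption: level-k users anticipate k - 1 retraining rounds,
    so the deployed model after this round is retrain applied k times.
    """
    for _ in range(k):
        theta = retrain(theta)
    return theta


def run(k, theta0=0.0, tol=1e-9, max_rounds=1000):
    """Iterate until the model stops moving; return (model, rounds used)."""
    theta = theta0
    for t in range(1, max_rounds + 1):
        new_theta = level_k_step(theta, k)
        if abs(new_theta - theta) < tol:
            return new_theta, t
        theta = new_theta
    return theta, max_rounds


if __name__ == "__main__":
    for k in (1, 2, 3):
        theta, rounds = run(k)
        print(f"level {k}: theta ~ {theta:.6f} after {rounds} rounds")
```

In this toy setting every level converges to the same fixed point `base / (1 - eps)`, while higher levels need fewer rounds, because the composed map contracts by `eps ** k` per round instead of `eps`.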

Related papers

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning [104.63494870852894]
Reinforcement learning (RL) has shown promise in training agentic models that engage in dynamic, multi-turn interactions. We propose UserRL, a unified framework for training and evaluating user-centric abilities through standardized gym environments.
arXiv Detail & Related papers (2025-09-24T03:33:20Z)
Classification Under Strategic Self-Selection [13.168262355330299]
We study the effects of self-selection on learning and the implications of learning on the composition of the self-selected population. We propose a differentiable framework for learning under self-selective behavior, which can be optimized effectively.
arXiv Detail & Related papers (2024-02-23T11:37:56Z)
Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning. We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures. We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z)
Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics. Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes. This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people. We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
A Framework for Understanding Model Extraction Attack and Defense [48.421636548746704]
We study tradeoffs between model utility from a benign user's view and privacy from an adversary's view. We develop new metrics to quantify such tradeoffs, analyze their theoretical properties, and develop an optimization problem to understand the optimal adversarial attack and defense strategies.
arXiv Detail & Related papers (2022-06-23T05:24:52Z)
Strategic Classification with Graph Neural Networks [10.131895986034316]
Using a graph for learning introduces inter-user dependencies in prediction. We propose a differentiable framework for strategically-robust learning of graph-based classifiers.
arXiv Detail & Related papers (2022-05-31T13:11:25Z)
Generalized Strategic Classification and the Case of Aligned Incentives [16.607142366834015]
We argue for a broader perspective on what accounts for strategic user behavior. Our model subsumes most current models, but includes other novel settings. We show how our results and approach can extend to the most general case.
arXiv Detail & Related papers (2022-02-09T09:36:09Z)
Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions. We show that the strategy enables the agents to learn consistently under this highly heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
Reinforcement Learning Beyond Expectation [11.428014000851535]
Cumulative prospect theory (CPT) is a paradigm that has been empirically shown to model a tendency of humans to view gains and losses differently. In this paper, we consider a setting where an autonomous agent has to learn behaviors in an unknown environment. In order to endow the agent with the ability to closely mimic the behavior of human users, we optimize a CPT-based cost.
arXiv Detail & Related papers (2021-03-29T20:35:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.