The Burden of Interactive Alignment with Inconsistent Preferences
- URL: http://arxiv.org/abs/2510.16368v1
- Date: Sat, 18 Oct 2025 06:25:57 GMT
- Title: The Burden of Interactive Alignment with Inconsistent Preferences
- Authors: Ali Shirali
- Abstract summary: We show how users with inconsistent preferences can align an engagement-driven algorithm with their interests in a Stackelberg equilibrium. Users who are sufficiently foresighted can achieve alignment, while those who are not are instead aligned to the algorithm's objective.
- Score: 5.499453986105878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: From media platforms to chatbots, algorithms shape how people interact, learn, and discover information. Such interactions between users and an algorithm often unfold over multiple steps, during which strategic users can guide the algorithm to better align with their true interests by selectively engaging with content. However, users frequently exhibit inconsistent preferences: they may spend considerable time on content that offers little long-term value, inadvertently signaling that such content is desirable. Focusing on the user side, this raises a key question: what does it take for such users to align the algorithm with their true interests? To investigate these dynamics, we model the user's decision process as split between a rational system 2 that decides whether to engage and an impulsive system 1 that determines how long engagement lasts. We then study a multi-leader, single-follower extensive Stackelberg game, where users, specifically system 2, lead by committing to engagement strategies and the algorithm best-responds based on observed interactions. We define the burden of alignment as the minimum horizon over which users must optimize to effectively steer the algorithm. We show that a critical horizon exists: users who are sufficiently foresighted can achieve alignment, while those who are not are instead aligned to the algorithm's objective. This critical horizon can be long, imposing a substantial burden. However, even a small, costly signal (e.g., an extra click) can significantly reduce it. Overall, our framework explains how users with inconsistent preferences can align an engagement-driven algorithm with their interests in a Stackelberg equilibrium, highlighting both the challenges and potential remedies for achieving alignment.
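To make the abstract's interaction loop concrete, here is a minimal, self-contained sketch. Everything in it is an illustrative assumption of ours (the two content types, the numbers, and the recommender's greedy average-dwell rule), not the paper's formal Stackelberg model; it only shows how a committed "system 2" that abstains from tempting content can flip an engagement-driven recommender once the horizon is long enough.

```python
"""Toy sketch of the interaction loop described in the abstract. All numbers,
the recommender's greedy rule, and the two content types are illustrative
assumptions, not the paper's formal model."""

# Hypothetical content types: dwell time the algorithm observes vs. true value.
CONTENT = {
    "tempting": {"dwell": 5.0, "value": 0.2},   # captivating but low long-term value
    "valuable": {"dwell": 1.0, "value": 1.0},   # modest engagement, high true value
}

def run(horizon, abstain):
    """Simulate `horizon` steps. The recommender shows each type once, then
    greedily picks the larger average dwell per impression. System 2 decides
    whether to engage; system 1 fixes the dwell time once engagement starts."""
    shows = {k: 0 for k in CONTENT}
    dwell = {k: 0.0 for k in CONTENT}
    total_value = 0.0
    for _ in range(horizon):
        unseen = [k for k in CONTENT if shows[k] == 0]
        shown = unseen[0] if unseen else max(CONTENT, key=lambda k: dwell[k] / shows[k])
        shows[shown] += 1
        if abstain and shown == "tempting":
            continue                                # system 2 declines: the signal is starved
        dwell[shown] += CONTENT[shown]["dwell"]     # system 1: impulsive dwell, not chosen
        total_value += CONTENT[shown]["value"]
    return total_value

# The burden of alignment: the smallest horizon at which abstaining pays off.
for T in range(1, 20):
    if run(T, abstain=True) > run(T, abstain=False):
        print(f"critical horizon under these toy numbers: {T}")  # breaks at T = 3 here
        break
```

Under these toy numbers the critical horizon is only three steps; a stickier recommender that forgets old engagement signals more slowly would push it out, which is the burden the paper quantifies.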
Related papers
- Beyond Match Maximization and Fairness: Retention-Optimized Two-Sided Matching [22.731829414580847]
We introduce a dynamic learning-to-rank algorithm called Matching for Retention (MRet).
Unlike conventional algorithms for two-sided matching, our approach models user retention by learning retention curves from each user's profile and interaction history.
MRet achieves higher user retention than conventional methods, which optimize matches or fairness rather than retention.
arXiv Detail & Related papers (2026-02-17T17:30:53Z)
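As a concrete reading of "learning retention curves" in the entry above, here is a hedged sketch under an assumed exponential return-time model; the function names, the model choice, and the numbers are ours, not details from the paper.

```python
"""Hedged sketch of one way a per-user retention curve could be learned:
fit an exponential return-time model S(t) = exp(-t / tau) to the observed
gaps between a user's sessions. Illustrative only."""
import math

def fit_retention_tau(return_gaps_days):
    """MLE for the exponential return-time model: tau is the mean observed gap."""
    return sum(return_gaps_days) / len(return_gaps_days)

def prob_return_within(tau, days):
    """P(user comes back within `days`) under the fitted curve."""
    return 1.0 - math.exp(-days / tau)

# Hypothetical interaction history: days between consecutive sessions.
gaps = [1.0, 2.0, 1.5, 4.0]
tau = fit_retention_tau(gaps)
print(f"tau = {tau:.2f} days, P(return within 7 days) = {prob_return_within(tau, 7):.2f}")
```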
- Retrieval Augmentation via User Interest Clustering [57.63883506013693]
Industrial recommender systems are sensitive to the patterns of user-item engagement.
We propose a novel approach that efficiently constructs user interest representations and enables inference at low computational cost.
Our approach has been deployed in multiple products at Meta, facilitating short-form video related recommendation.
arXiv Detail & Related papers (2024-08-07T16:35:10Z)
- Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content [66.71102704873185]
We test for user strategization by conducting a lab experiment and survey.
We find strong evidence of strategization across outcome metrics, including participants' dwell time and use of "likes".
Our findings suggest that platforms cannot ignore the effect of their algorithms on user behavior.
arXiv Detail & Related papers (2024-05-09T07:36:08Z)
- Can Probabilistic Feedback Drive User Impacts in Online Platforms? [26.052963782865294]
A common explanation for negative user impacts of content recommender systems is misalignment between the platform's objective and user welfare.
In this work, we show that misalignment in the platform's objective is not the only potential cause of unintended impacts on users.
The source of these user impacts is that different pieces of content may generate observable user reactions (feedback information) at different rates.
arXiv Detail & Related papers (2024-01-10T18:12:31Z)
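The mechanism in the entry above is easy to demonstrate in miniature. In the assumed setup below (our own numbers and a plain epsilon-greedy learner, not the paper's model), two items are equally valuable to the user, but one emits an observable reaction on 90% of views and the other on 10%; a learner that can only see value through feedback over-serves the chattier item even though the objective itself is aligned.

```python
"""Assumed toy setup: equal-value items with unequal observable-feedback
rates skew an epsilon-greedy learner that estimates value from feedback."""
import random

random.seed(0)
TRUE_VALUE = {"A": 1.0, "B": 1.0}       # the user values both items equally
FEEDBACK_RATE = {"A": 0.9, "B": 0.1}    # ...but A's value is observable far more often

observed = {"A": 0.0, "B": 0.0}
shows = {"A": 0, "B": 0}
for _ in range(10_000):
    if random.random() < 0.1 or 0 in shows.values():
        item = random.choice(list(TRUE_VALUE))                    # explore
    else:
        item = max(shows, key=lambda k: observed[k] / shows[k])   # exploit the estimate
    shows[item] += 1
    if random.random() < FEEDBACK_RATE[item]:
        observed[item] += TRUE_VALUE[item]    # value is logged only when feedback fires

print(shows)   # A gets the bulk of impressions despite identical true value
```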
- Matching of Users and Creators in Two-Sided Markets with Departures [0.6649753747542209]
We propose a model of content recommendation that focuses on the dynamics of user-content matching.
We show that a user-centric greedy algorithm that does not consider creator departures can result in arbitrarily poor total engagement.
We present two practical algorithms, one with performance guarantees under mild assumptions on user preferences, and another that tends to outperform algorithms that ignore two-sided departures in practice.
arXiv Detail & Related papers (2023-12-30T20:13:28Z)
- User Strategization and Trustworthy Algorithms [81.82279667028423]
We show that user strategization can actually help platforms in the short term.
We then show that it corrupts platforms' data and ultimately hurts their ability to make counterfactual decisions.
arXiv Detail & Related papers (2023-12-29T16:09:42Z)
- Engagement, User Satisfaction, and the Amplification of Divisive Content on Social Media [22.206581957044513]
We find that Twitter's engagement-based ranking algorithm amplifies emotionally charged, out-group hostile content.
We explore the implications of an alternative approach that ranks content based on users' stated preferences.
arXiv Detail & Related papers (2023-05-26T13:57:30Z)
- Modeling Content Creator Incentives on Algorithm-Curated Platforms [76.53541575455978]
We study how algorithmic choices affect the existence and character of (Nash) equilibria in exposure games.
We propose tools for numerically finding equilibria in exposure games, and illustrate results of an audit on the MovieLens and LastFM datasets.
arXiv Detail & Related papers (2022-06-27T08:16:59Z)
- Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system.
Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations.
While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z)
- Online Learning Demands in Max-min Fairness [91.37280766977923]
We describe mechanisms for the allocation of a scarce resource among multiple users in a way that is efficient, fair, and strategy-proof.
The mechanism is repeated for multiple rounds and a user's requirements can change on each round.
At the end of each round, users provide feedback about the allocation they received, enabling the mechanism to learn user preferences over time.
arXiv Detail & Related papers (2020-12-15T22:15:20Z)
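For a concrete anchor on the allocation objective in the entry above, here is a hedged sketch of classic max-min fair "water-filling" under per-user demand caps; the surrounding online loop (feedback, changing requirements, strategy-proofness) is omitted, and the function names and numbers are ours.

```python
def max_min_fair(capacity, demands):
    """Max-min fair water-filling: fully satisfy the smallest demands first,
    then split what remains equally among users who still want more."""
    alloc = {u: 0.0 for u in demands}
    active = dict(demands)
    remaining = capacity
    while active:
        share = remaining / len(active)
        satisfied = {u: d for u, d in active.items() if d <= share}
        if not satisfied:                 # nobody fits under an equal share:
            for u in active:              # everyone still active gets the share
                alloc[u] = share
            break
        for u, d in satisfied.items():    # small demands are met in full
            alloc[u] = d
            remaining -= d
            del active[u]
    return alloc

# Example round: 10 units among three users with stated requirements.
print(max_min_fair(10.0, {"u1": 2.0, "u2": 8.0, "u3": 5.0}))
# -> {'u1': 2.0, 'u2': 4.0, 'u3': 4.0}
```

In the repeated setting the entry describes, the demands passed into each round would be estimates refined from the feedback users report after every allocation.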
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretic error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
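To illustrate the flavor of the problem in this last entry (our construction, not the authors' algorithm): accumulate noisy "same group?" answers into a symmetric vote matrix and read two clusters off the sign of its leading eigenvector.

```python
"""Illustration only: recover two hidden item clusters from noisy binary
'same group?' feedback via a vote matrix and its leading eigenvector."""
import numpy as np

rng = np.random.default_rng(0)
n, flip = 20, 0.2                            # 20 items; each answer flipped w.p. 0.2
truth = np.array([0] * 10 + [1] * 10)        # hidden ground-truth clusters

votes = np.zeros((n, n))
for _ in range(2000):                        # random pairwise feedback queries
    i, j = rng.choice(n, size=2, replace=False)
    same = truth[i] == truth[j]
    answer = same if rng.random() > flip else not same
    v = 1.0 if answer else -1.0
    votes[i, j] += v
    votes[j, i] += v

# The top eigenvector of the symmetric vote matrix separates the two clusters.
top = np.linalg.eigh(votes)[1][:, -1]
labels = (top > 0).astype(int)
error = min(np.mean(labels != truth), np.mean(labels == truth))  # up to relabeling
print(f"cluster recovery error: {error:.2f}")
```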
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all the information) and is not responsible for any consequences.