Modelling the Recommender Alignment Problem
- URL: http://arxiv.org/abs/2208.12299v1
- Date: Thu, 25 Aug 2022 18:37:49 GMT
- Title: Modelling the Recommender Alignment Problem
- Authors: Francisco Carvalho
- Abstract summary: This work aims to shed light on how an end-to-end study of reward functions for recommender systems might be done.
We learn recommender policies that optimize reward functions by controlling graph dynamics on a toy environment.
Based on the effects that trained recommenders have on their environment, we conclude that engagement maximizers generally, though not always, lead to worse outcomes than aligned recommenders.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recommender systems (RS) mediate human experience online. Most RS act to
optimize metrics that are imperfectly aligned with the best interests of users
but are easy to measure, like ad-clicks and user engagement. This has resulted
in a host of hard-to-measure side-effects: political polarization, addiction,
fake news. RS design faces a recommender alignment problem: that of aligning
recommendations with the goals of users, system designers, and society as a
whole. But how do we test and compare potential solutions to align RS? Their
massive scale makes them costly and risky to test in deployment. We synthesized
a simple abstract modelling framework to guide future work.
To illustrate it, we construct a toy experiment where we ask: "How can we
evaluate the consequences of using user retention as a reward function?" To
answer the question, we learn recommender policies that optimize reward
functions by controlling graph dynamics on a toy environment. Based on the
effects that trained recommenders have on their environment, we conclude that
engagement maximizers generally, though not always, lead to worse outcomes than
aligned recommenders. After learning, we examine competition between RS
as a potential solution to RS alignment. We find that it generally makes our
toy society better off than it would be with no recommendation or with
engagement maximizers.
In this work, we aimed for a broad scope, touching superficially on many
different points to shed light on how an end-to-end study of reward functions
for recommender systems might be done. Recommender alignment is a pressing and
important problem. Attempted solutions are sure to have far-reaching impacts.
Here, we take a first step toward developing methods for evaluating and comparing
solutions with respect to their impacts on society.
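The setup the abstract describes, a recommender policy acting on a toy environment, scored under both an engagement reward and a welfare-based "aligned" reward, can be illustrated with a minimal sketch. This is not the paper's actual environment or code; the dynamics (user interests drifting toward shown content), the click model, and the welfare signal are all simplified assumptions chosen only to show the shape of such an experiment:

```python
import random

# Hypothetical toy environment (an assumption, not the paper's model): users
# hold a scalar "interest" that drifts toward whatever content they click on.
# A recommender policy maps a user's interest to one recommended item.
N_USERS, N_ITEMS, STEPS = 20, 5, 50

def click_prob(interest, item):
    # Assumed engagement model: clicks are likelier when the item
    # matches the user's current interest.
    return max(0.0, 1.0 - abs(interest - item) / N_ITEMS)

def welfare(interest):
    # Assumed "aligned" signal: users are best off near a moderate interest,
    # standing in for whatever ground-truth welfare measure one cares about.
    return 1.0 - abs(interest - N_ITEMS / 2) / N_ITEMS

def run(policy, seed=0):
    random.seed(seed)
    interests = [random.uniform(0, N_ITEMS) for _ in range(N_USERS)]
    total_clicks, welfare_sum = 0, 0.0
    for _ in range(STEPS):
        for u in range(N_USERS):
            item = policy(interests[u])
            if random.random() < click_prob(interests[u], item):
                total_clicks += 1
                # Shown content pulls the user's interest toward itself.
                interests[u] += 0.2 * (item - interests[u])
        welfare_sum += sum(welfare(x) for x in interests)
    return total_clicks, welfare_sum / (STEPS * N_USERS)

# An engagement maximizer recommends whatever is closest to current interest;
# an "aligned" policy steers interests toward the high-welfare region.
engagement_policy = lambda x: round(x) % N_ITEMS
aligned_policy = lambda x: N_ITEMS // 2

print("engagement maximizer:", run(engagement_policy))
print("aligned recommender: ", run(aligned_policy))
```

Even this crude version reproduces the qualitative tension the paper studies: the engagement maximizer keeps users where they are and harvests clicks, while the aligned policy sacrifices some early engagement to move users toward higher-welfare states.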
Related papers
- Harm Mitigation in Recommender Systems under User Preference Dynamics [16.213153879446796]
We consider a recommender system that takes into account the interplay between recommendations, user interests, and harmful content.
We seek recommendation policies that establish a tradeoff between maximizing click-through rate (CTR) and mitigating harm.
arXiv Detail & Related papers (2024-06-14T09:52:47Z) - REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and user intentions, values, or social norms can be catastrophic in the real world.
Current methods to mitigate this misalignment work by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z) - RAH! RecSys-Assistant-Human: A Human-Centered Recommendation Framework
with LLM Agents [30.250555783628762]
This research argues that addressing these issues is not solely the recommender systems' responsibility.
We introduce the RAH (Recommender system, Assistant, Human) framework, emphasizing alignment with user personalities.
Our contributions provide a human-centered recommendation framework that partners effectively with various recommendation models.
arXiv Detail & Related papers (2023-08-19T04:46:01Z) - Breaking Feedback Loops in Recommender Systems with Causal Inference [99.22185950608838]
Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior.
We propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference.
We show that CAFL improves recommendation quality when compared to prior correction methods.
arXiv Detail & Related papers (2022-07-04T17:58:39Z) - Recommendation Systems with Distribution-Free Reliability Guarantees [83.80644194980042]
We show how to return a set of items rigorously guaranteed to contain mostly good items.
Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate.
We evaluate our methods on the Yahoo! Learning to Rank and MSMarco datasets.
arXiv Detail & Related papers (2022-07-04T17:49:25Z) - Meta Policy Learning for Cold-Start Conversational Recommendation [71.13044166814186]
We study CRS policy learning for cold-start users via meta reinforcement learning.
To facilitate policy adaptation, we design three synergetic components.
arXiv Detail & Related papers (2022-05-24T05:06:52Z) - ELIXIR: Learning from User Feedback on Explanations to Improve
Recommender Models [26.11434743591804]
We devise a human-in-the-loop framework, called ELIXIR, where user feedback on explanations is leveraged for pairwise learning of user preferences.
ELIXIR leverages feedback on pairs of recommendations and explanations to learn user-specific latent preference vectors.
Our framework is instantiated using generalized graph recommendation via Random Walk with Restart.
arXiv Detail & Related papers (2021-02-15T13:43:49Z) - Measuring Recommender System Effects with Simulated Users [19.09065424910035]
Popularity bias and filter bubbles are two of the most well-studied recommender system biases.
We offer a simulation framework for measuring the impact of a recommender system under different types of user behavior.
arXiv Detail & Related papers (2021-01-12T14:51:11Z) - Do Offline Metrics Predict Online Performance in Recommender Systems? [79.48653445643865]
We investigate the extent to which offline metrics predict online performance by evaluating recommenders across six simulated environments.
We observe that offline metrics are correlated with online performance over a range of environments.
We study the impact of adding exploration strategies, and observe that their effectiveness, when compared to greedy recommendation, is highly dependent on the recommendation algorithm.
arXiv Detail & Related papers (2020-11-07T01:41:13Z) - Optimizing Long-term Social Welfare in Recommender Systems: A
Constrained Matching Approach [36.54379845220444]
We study settings in which content providers cannot remain viable unless they receive a certain level of user engagement.
Our model ensures the system reaches an equilibrium with maximal social welfare supported by a sufficiently diverse set of viable providers.
We draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.
arXiv Detail & Related papers (2020-07-31T22:40:47Z) - Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on such an approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.