Related papers: Incentive-Aware Recommender Systems in Two-Sided Markets

Incentive-Aware Recommender Systems in Two-Sided Markets

URL: http://arxiv.org/abs/2211.15381v2
Date: Tue, 18 Jun 2024 05:45:41 GMT
Title: Incentive-Aware Recommender Systems in Two-Sided Markets
Authors: Xiaowu Dai, Wenlu Xu, Yuan Qi, Michael I. Jordan,
Abstract summary: We propose a novel recommender system that aligns with agents' incentives while achieving myopically optimal performance. Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets. Both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation.
Score: 49.692453629365204
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Online platforms in the Internet Economy commonly incorporate recommender systems that recommend products (or "arms") to users (or "agents"). A key challenge in this domain arises from myopic agents who are naturally incentivized to exploit by choosing the optimal arm based on current information, rather than exploring various alternatives to gather information that benefits the collective. We propose a novel recommender system that aligns with agents' incentives while achieving asymptotically optimal performance, as measured by regret in repeated interactions. Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets, where the interactions of agents and arms are facilitated by recommender systems on online platforms. This model incorporates incentive constraints induced by agents' opportunity costs. In scenarios where opportunity costs are known to the platform, we show the existence of an incentive-compatible recommendation algorithm. This algorithm pools recommendations between a genuinely good arm and an unknown arm using a randomized and adaptive strategy. Moreover, when these opportunity costs are unknown, we introduce an algorithm that randomly pools recommendations across all arms, utilizing the cumulative loss from each arm as feedback for strategic exploration. We demonstrate that both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation. All code for using the proposed algorithms and reproducing results is made available on GitHub.

Related papers

COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents [41.57721032039409]
Existing work in contextual bandits assumes that agents truthfully report their arms, which is unrealistic in many real-life applications.<n>We propose an algorithm, COBRA, for contextual bandit problems involving strategic agents that disincentivize their strategic behavior without using any monetary incentives.
arXiv Detail & Related papers (2025-05-29T17:53:12Z)
Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [52.75161794035767]
We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously. We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
arXiv Detail & Related papers (2023-12-13T06:54:49Z)
Algorithmic Collusion or Competition: the Role of Platforms' Recommender Systems [2.551933457838182]
This study examines how recommendation algorithms can determine the competitive or collusive dynamics of AI-based pricing algorithms. Experimental results reveal that a profit-based recommender system intensifies algorithmic collusion among sellers due to its congruence with sellers' profit-maximizing objectives. A demand-based recommender system fosters price competition among sellers and results in a lower price, owing to its misalignment with sellers' goals.
arXiv Detail & Related papers (2023-09-25T21:45:30Z)
Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a reward. We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to a and possibly unknown policy. We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner.
arXiv Detail & Related papers (2023-08-29T18:18:21Z)
Interactive Recommendations for Optimal Allocations in Markets with Constraints [12.580391999838128]
We propose an interactive framework where the system provider can enhance the quality of recommendations to the users. We employ an integrated approach using techniques from collaborative filtering, bandits, and optimal resource allocation. Empirical studies on synthetic matrix and real-world data also demonstrate the effectiveness and performance of our approach.
arXiv Detail & Related papers (2022-07-08T22:16:51Z)
Information-Gathering in Latent Bandits [79.6953033727455]
We present a method for information-gathering in latent bandits. We show that choosing the best arm given the agent's beliefs about the states incurs higher regret. We also show that by choosing arms carefully, we obtain an improved estimation of the state distribution.
arXiv Detail & Related papers (2022-07-08T01:15:12Z)
Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system. Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations. While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z)
Modeling Attrition in Recommender Systems with Departing Bandits [84.85560764274399]
We propose a novel multi-armed bandit setup that captures policy-dependent horizons. We first address the case where all users share the same type, demonstrating that a recent UCB-based algorithm is optimal. We then move forward to the more challenging case, where users are divided among two types.
arXiv Detail & Related papers (2022-03-25T02:30:54Z)
GHRS: Graph-based Hybrid Recommendation System with Application to Movie Recommendation [0.0]
We propose a recommender system method using a graph-based model associated with the similarity of users' ratings. By utilizing the advantages of Autoencoder feature extraction, we extract new features based on all combined attributes. The experimental results on the MovieLens dataset show that the proposed algorithm outperforms many existing recommendation algorithms on recommendation accuracy.
arXiv Detail & Related papers (2021-11-06T10:47:45Z)
Learning the Optimal Recommendation from Explorative Users [38.332330484187395]
We study the sequential interactions between a recommender system and a user. We show that efficient system learning is still possible but is more difficult.
arXiv Detail & Related papers (2021-10-06T21:01:18Z)
Robust Multi-Agent Multi-Armed Bandits [26.26185074977412]
Recent works have shown that agents facing independent instances of a $K$-armed bandit can collaborate to decrease regret. We show that collaboration indeed decreases regret for this algorithm, assuming $m$ is small compared to $K$ but without assumptions on malicious agents' behavior.
arXiv Detail & Related papers (2020-07-07T22:27:30Z)
Incentivizing Exploration with Selective Data Disclosure [70.11902902106014]
We propose and design recommendation systems that incentivize efficient exploration. Agents arrive sequentially, choose actions and receive rewards, drawn from fixed but unknown action-specific distributions. We attain optimal regret rate for exploration using a flexible frequentist behavioral model.
arXiv Detail & Related papers (2018-11-14T19:29:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.