Related papers: Incentivized Exploration via Filtered Posterior Sampling

Incentivized Exploration via Filtered Posterior Sampling

URL: http://arxiv.org/abs/2402.13338v1
Date: Tue, 20 Feb 2024 19:30:55 GMT
Title: Incentivized Exploration via Filtered Posterior Sampling
Authors: Anand Kalvit, Aleksandrs Slivkins, Yonatan Gur
Abstract summary: We study "incentivized exploration" (IE) in social learning problems where the principal can leverage information asymmetry to incentivize agents to take exploratory actions. We identify posterior sampling, an algorithmic approach that is well known in the multi-armed bandits literature, as a general-purpose solution for IE.
Score: 51.32577788466152
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study "incentivized exploration" (IE) in social learning problems where the principal (a recommendation algorithm) can leverage information asymmetry to incentivize sequentially-arriving agents to take exploratory actions. We identify posterior sampling, an algorithmic approach that is well known in the multi-armed bandits literature, as a general-purpose solution for IE. In particular, we expand the existing scope of IE in several practically-relevant dimensions, from private agent types to informative recommendations to correlated Bayesian priors. We obtain a general analysis of posterior sampling in IE which allows us to subsume these extended settings as corollaries, while also recovering existing results as special cases.

Related papers

Exploitation Over Exploration: Unmasking the Bias in Linear Bandit Recommender Offline Evaluation [0.8213829427624406]
Multi-Armed Bandit (MAB) algorithms are widely used in recommender systems that require continuous, incremental learning.<n>This study conducts an extensive offline empirical comparison of several linear MABs.<n>Strikingly, across over 90% of various datasets, a greedy linear model, with no type of exploration, consistently achieves top-tier performance.
arXiv Detail & Related papers (2025-07-24T19:14:39Z)
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge.<n>It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts.<n>This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents [96.65646344634524]
Large Language Models (LLMs), endowed with reasoning and agentic capabilities, are ushering in a new paradigm termed Agentic Deep Research.<n>We trace the evolution from static web search to interactive, agent-based systems that plan, explore, and learn.<n>We demonstrate that Agentic Deep Research not only significantly outperforms existing approaches, but is also poised to become the dominant paradigm for future information seeking.
arXiv Detail & Related papers (2025-06-23T17:27:19Z)
Insight-RAG: Enhancing LLMs with Insight-Driven Augmentation [4.390998479503661]
We propose Insight-RAG, a novel framework designed to retrieve documents based on insights. In the initial stage of Insight-RAG, instead of using traditional retrieval methods, we employ an LLM to analyze the input query and task. By integrating the original query with the retrieved insights, similar to conventional RAG approaches, we employ a final LLM to generate a contextually enriched and accurate response.
arXiv Detail & Related papers (2025-03-31T19:50:27Z)
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction [20.6787276745193]
We introduce an automatic evaluation method that measures retrieval quality through the lens of information gain within the RAG framework. We quantify the utility of retrieval by the extent to which it reduces semantic perplexity post-retrieval.
arXiv Detail & Related papers (2025-03-03T12:37:34Z)
A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications [52.42860559005861]
Direct Preference Optimization (DPO) has emerged as a promising approach for alignment. Despite DPO's various advancements and inherent limitations, an in-depth review of these aspects is currently lacking in the literature.
arXiv Detail & Related papers (2024-10-21T02:27:24Z)
Data Augmentation for Sequential Recommendation: A Survey [9.913317029557588]
sequential recommendation (SR) has received much attention due to its well-consistency with real-world situations. We provide a comprehensive review of data augmentation (DA) methods for SR.
arXiv Detail & Related papers (2024-09-20T14:39:42Z)
Unified Domain Adaptive Semantic Segmentation [96.74199626935294]
Unsupervised Adaptive Domain Semantic (UDA-SS) aims to transfer the supervision from a labeled source domain to an unlabeled target domain. We propose a Quad-directional Mixup (QuadMix) method, characterized by tackling distinct point attributes and feature inconsistencies. Our method outperforms the state-of-the-art works by large margins on four challenging UDA-SS benchmarks.
arXiv Detail & Related papers (2023-11-22T09:18:49Z)
On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty. Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models [33.43062232461652]
Video Anomaly Detection (VAD) serves as a pivotal technology in the intelligent surveillance systems. This survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED)
arXiv Detail & Related papers (2023-02-10T07:11:37Z)
Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data. Main aim of the identified model is to predict new data from previous observations. We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
Back-to-Bones: Rediscovering the Role of Backbones in Domain Generalization [1.6799377888527687]
Domain Generalization studies the capability of a deep learning model to generalize to out-of-training distributions. Recent research has provided a reproducible benchmark for DG, pointing out the effectiveness of naive empirical risk minimization (ERM) over existing algorithms. In this paper, we evaluate the backbones proposing a comprehensive analysis of their intrinsic generalization capabilities.
arXiv Detail & Related papers (2022-09-02T15:30:17Z)
Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems. We derive an evaluation metric to measure the quality of a ranking of exposing queries, as well as conducting an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing, that aims at encouraging exploration only when it is needed. We perform an illustrative case study showing that it has potential to both accelerate training and obtain a better policy.
arXiv Detail & Related papers (2020-09-29T20:40:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.