Towards the D-Optimal Online Experiment Design for Recommender Selection
- URL: http://arxiv.org/abs/2110.12132v1
- Date: Sat, 23 Oct 2021 04:30:27 GMT
- Title: Towards the D-Optimal Online Experiment Design for Recommender Selection
- Authors: Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan
- Abstract summary: Finding the optimal online experiment is nontrivial since both the users and displayed recommendations carry contextual features that are informative to the reward.
We leverage the D-optimal design from the classical statistics literature to achieve the maximum information gain during exploration.
We then use our deployment example on Walmart.com to fully illustrate the practical insights and effectiveness of the proposed methods.
- Score: 18.204325860752768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Selecting the optimal recommender via online exploration-exploitation is
attracting increasing attention, as traditional A/B testing can be slow and
costly, and offline evaluations are prone to the bias of historical data. Finding
the optimal online experiment is nontrivial since both the users and displayed
recommendations carry contextual features that are informative to the reward.
While the problem can be formalized via the lens of multi-armed bandits, the
existing solutions are found less satisfactory because the general
methodologies do not account for the case-specific structures, particularly for
the e-commerce recommendation setting we study. To fill this gap, we leverage the
D-optimal design from the classical statistics literature to achieve the
maximum information gain during exploration, and reveal how it fits seamlessly
with the modern infrastructure of online inference. To demonstrate the
effectiveness of the optimal designs, we provide semi-synthetic simulation
studies with published code and data for reproducibility purposes. We then use
our deployment example on Walmart.com to fully illustrate the practical
insights and effectiveness of the proposed methods.
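For intuition, the greedy D-optimal exploration step in a linear-reward bandit reduces to picking the candidate whose feature vector most increases the log-determinant of the accumulated information matrix. Below is a minimal sketch of that step, assuming a linear reward model and one context feature vector per candidate recommender; the function and variable names are illustrative, not the paper's exact deployment algorithm.

```python
import numpy as np

def d_optimal_next_arm(candidate_features, info_matrix, reg=1e-6):
    """Greedy D-optimal step (illustrative sketch): choose the candidate
    x that maximizes log det(A + x x^T), i.e. the information gain.
    By the matrix determinant lemma,
        det(A + x x^T) = det(A) * (1 + x^T A^{-1} x),
    so it suffices to maximize the quadratic form x^T A^{-1} x."""
    A = info_matrix + reg * np.eye(info_matrix.shape[0])  # regularize for invertibility
    A_inv = np.linalg.inv(A)
    gains = [x @ A_inv @ x for x in candidate_features]
    return int(np.argmax(gains))

# Hypothetical usage: 5 candidate recommenders with 3-dim context features
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))       # one feature vector per candidate
A = np.eye(3)                     # information accumulated so far
print(d_optimal_next_arm(X, A))   # index of the most informative candidate
```

Using the determinant lemma keeps each selection a single quadratic-form evaluation rather than a full determinant recomputation, which is what makes this style of exploration cheap enough for online serving.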
Related papers
- Preference Elicitation for Offline Reinforcement Learning [59.136381500967744]
We propose Sim-OPRL, an offline preference-based reinforcement learning algorithm.
Our algorithm employs a pessimistic approach for out-of-distribution data, and an optimistic approach for acquiring informative preferences about the optimal policy.
arXiv Detail & Related papers (2024-06-26T15:59:13Z)
- ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems [40.838320650137625]
This paper presents ERASE, a comprehensive bEnchmaRk for feAture SElection for Deep Recommender Systems (DRS).
ERASE comprises a thorough evaluation of eleven feature selection methods, covering both traditional and deep learning approaches.
Our code is available online for ease of reproduction.
arXiv Detail & Related papers (2024-03-19T11:49:35Z)
- Enhanced Bayesian Optimization via Preferential Modeling of Abstract Properties [49.351577714596544]
We propose a human-AI collaborative Bayesian framework to incorporate expert preferences about unmeasured abstract properties into surrogate modeling.
We provide an efficient strategy that can also handle any incorrect/misleading expert bias in preferential judgments.
arXiv Detail & Related papers (2024-02-27T09:23:13Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design [31.247108087199095]
CO-BED is a model-agnostic framework for designing contextual experiments using information-theoretic principles.
As a result, CO-BED provides a general and automated solution to a wide range of contextual optimization problems.
arXiv Detail & Related papers (2023-02-27T18:14:13Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a 2.5× improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- Efficient Real-world Testing of Causal Decision Making via Bayesian Experimental Design for Contextual Optimisation [12.37745209793872]
We introduce a model-agnostic framework for gathering data to evaluate and improve contextual decision making.
Our method is used for the data-efficient evaluation of the regret of past treatment assignments.
arXiv Detail & Related papers (2022-07-12T01:20:11Z)
- A Field Guide to Federated Optimization [161.3779046812383]
Federated learning and analytics are distributed approaches for collaboratively learning models (or statistics) from decentralized data.
This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms.
arXiv Detail & Related papers (2021-07-14T18:09:08Z)
- Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round.
Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams.
arXiv Detail & Related papers (2020-10-19T19:53:15Z)
- Robust Active Preference Elicitation [10.961537256186498]
We study the problem of eliciting the preferences of a decision-maker through a moderate number of pairwise comparison queries.
We are motivated by applications in high stakes domains, such as when choosing a policy for allocating scarce resources.
arXiv Detail & Related papers (2020-03-04T05:24:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.