Diffusion Approximations for a Class of Sequential Testing Problems
- URL: http://arxiv.org/abs/2102.07030v1
- Date: Sat, 13 Feb 2021 23:21:29 GMT
- Title: Diffusion Approximations for a Class of Sequential Testing Problems
- Authors: Victor F. Araman, Rene Caldentey
- Abstract summary: We study the problem of a seller who wants to select an optimal assortment of products to launch into the marketplace.
Motivated by emerging practices in e-commerce, we assume that the seller is able to use a crowdvoting system to learn consumers' preferences.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a decision maker who must choose an action in order to maximize a
reward function that depends also on an unknown parameter $\Theta$. The
decision maker can delay taking the action in order to experiment and gather
additional information on $\Theta$. We model the decision maker's problem using
a Bayesian sequential experimentation framework and use dynamic programming and
diffusion-asymptotic analysis to solve it. For that, we scale our problem in a
way that both the average number of experiments conducted per unit of
time is large and the informativeness of each individual experiment is low.
Under such a regime, we derive a diffusion approximation for the sequential
experimentation problem, which provides a number of important insights about
the nature of the problem and its solution. Our solution method also shows that
the complexity of the problem grows only quadratically with the cardinality of
the set of actions from which the decision maker can choose. We illustrate our
methodology and results using a concrete application in the context of
assortment selection and new product introduction. Specifically, we study the
problem of a seller who wants to select an optimal assortment of products to
launch into the marketplace and is uncertain about consumers' preferences.
Motivated by emerging practices in e-commerce, we assume that the seller is
able to use a crowdvoting system to learn these preferences before a final
assortment decision is made. In this context, we undertake an extensive
numerical analysis to assess the value of learning and demonstrate the
effectiveness and robustness of the heuristics derived from the diffusion
approximation.
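The scaling described in the abstract (many experiments per unit of time, each only weakly informative) can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a two-valued parameter, Bernoulli "votes", and a signal strength that shrinks like 1/sqrt(n); the function name `simulate_posterior` and all of its parameters are hypothetical. Under these assumptions the posterior log-odds behave roughly like a random walk with small steps, which is the kind of process a diffusion approximation replaces with a Brownian motion with drift.

```python
import math
import random


def simulate_posterior(n_per_unit_time, horizon, signal, theta, prior=0.5, seed=0):
    """Return the posterior P(Theta = 1) after each weak Bernoulli experiment.

    Assumed toy model (not taken from the paper): with n experiments per unit
    of time, each experiment succeeds with probability
    0.5 + signal / (2 * sqrt(n)) if Theta = 1 and
    0.5 - signal / (2 * sqrt(n)) if Theta = 0.
    """
    eps = signal / (2.0 * math.sqrt(n_per_unit_time))
    if not 0.0 < eps < 0.5:
        raise ValueError("signal is too strong for this n; need 0 < eps < 0.5")

    rng = random.Random(seed)
    p1, p0 = 0.5 + eps, 0.5 - eps          # success probabilities under Theta = 1 / 0
    p_true = p1 if theta == 1 else p0

    log_odds = math.log(prior / (1.0 - prior))
    step = math.log(p1 / p0)               # log-likelihood ratio of one success

    path = []
    for _ in range(int(n_per_unit_time * horizon)):
        success = rng.random() < p_true
        # By symmetry, a failure contributes log((1 - p1) / (1 - p0)) = -step.
        log_odds += step if success else -step
        path.append(1.0 / (1.0 + math.exp(-log_odds)))
    return path


if __name__ == "__main__":
    beliefs = simulate_posterior(n_per_unit_time=400, horizon=4, signal=3.0, theta=1)
    print(f"experiments run: {len(beliefs)}, final belief: {beliefs[-1]:.4f}")
```

Increasing `n_per_unit_time` while holding `signal` and `horizon` fixed makes each step smaller and more frequent, so the simulated belief path looks increasingly like a continuous diffusion; the stopping rule and reward structure studied in the paper are deliberately left out of this sketch.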
Related papers
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- Decision-Focused Forecasting: Decision Losses for Multistage Optimisation [0.0]
We propose decision-focused forecasting, a multiple-implicit-layer model which in its training accounts for the intertemporal decision effects of forecasts using differentiable optimisation.
We present an analysis of the gradients produced by this model showing the adjustments made to account for the state-path caused by forecasting.
We demonstrate an application of the model to an energy storage arbitrage task and report that our model outperforms existing approaches.
arXiv Detail & Related papers (2024-05-23T15:48:46Z)
- Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation [1.1530723302736279]
We study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data.
The goal is to design experiment selection rules for more accurate model estimation.
We propose a class of greedy experiment selection methods and provide statistical analysis for the maximum likelihood.
arXiv Detail & Related papers (2024-02-13T17:09:29Z)
- Learning Fair Policies for Multi-stage Selection Problems from Observational Data [4.282745020665833]
We consider the problem of learning fair policies for multi-stage selection problems from observational data.
This problem arises in several high-stakes domains such as company hiring, loan approval, or bail decisions where outcomes are only observed for those selected.
We propose a multi-stage framework that can be augmented with various fairness constraints, such as demographic parity or equal opportunity.
arXiv Detail & Related papers (2023-12-20T16:33:15Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring [13.62951379287041]
In this paper, we introduce a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost.
We formulate the problem as a Bayesian optimal sequential decision making problem that has a unified utility function.
We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis of experiments at Amazon.
arXiv Detail & Related papers (2023-04-02T00:59:10Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- Efficient Real-world Testing of Causal Decision Making via Bayesian Experimental Design for Contextual Optimisation [12.37745209793872]
We introduce a model-agnostic framework for gathering data to evaluate and improve contextual decision making.
Our method is used for the data-efficient evaluation of the regret of past treatment assignments.
arXiv Detail & Related papers (2022-07-12T01:20:11Z)
- Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly-convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Inverse Active Sensing: Modeling and Understanding Timely Decision-Making [111.07204912245841]
We develop a framework for the general setting of evidence-based decision-making under endogenous, context-dependent time pressure.
We demonstrate how it enables modeling intuitive notions of surprise, suspense, and optimality in decision strategies.
arXiv Detail & Related papers (2020-06-25T02:30:45Z)