Computing the Performance of A New Adaptive Sampling Algorithm Based on
The Gittins Index in Experiments with Exponential Rewards
- URL: http://arxiv.org/abs/2301.01107v1
- Date: Tue, 3 Jan 2023 14:04:13 GMT
- Title: Computing the Performance of A New Adaptive Sampling Algorithm Based on
The Gittins Index in Experiments with Exponential Rewards
- Authors: James K. He, Sofía S. Villar, and Lida Mavrogonatou
- Abstract summary: The Gittins Index (GI) is a solution to the Multi-Armed Bandit Problem that can simultaneously attain optimality and computational-efficiency goals.
We present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards.
Compared to traditional non-adaptive designs, our novel GI modified design shows operating characteristics comparable in learning but substantially better in earning.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing experiments often requires balancing between learning about the
true treatment effects and earning from allocating more samples to the superior
treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP)
provide allocation policies that optimally balance learning and earning, they
tend to be computationally expensive. The Gittins Index (GI) is a solution to
the MABP that can simultaneously attain optimality and computational-efficiency
goals, and it has recently been used in experiments with Bernoulli
and Gaussian rewards. For the first time, we present a modification of the GI
rule that can be used in experiments with exponentially-distributed rewards. We
report its performance in simulated 2-armed and 3-armed experiments. Compared
to traditional non-adaptive designs, our novel GI modified design shows
operating characteristics comparable in learning (e.g. statistical power) but
substantially better in earning (e.g. direct benefits). This illustrates the
potential that designs using a GI approach to allocate participants have to
improve participant benefits, increase efficiencies, and reduce experimental
costs in adaptive multi-armed experiments with exponential rewards.
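The kind of adaptive allocation the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's GI modification: it keeps a conjugate Gamma posterior over each exponential rate and allocates by a simple optimistic index (plug-in posterior-mean reward plus a UCB-style exploration bonus), which is a hypothetical stand-in introduced here, not the Gittins Index itself.

```python
import math
import random

def simulate_adaptive_trial(true_rates, horizon=500, seed=0):
    """Simulate an adaptive experiment with exponential rewards.

    Rewards on arm k are Exponential(rate=true_rates[k]), so a smaller
    rate means a larger mean reward 1/rate.  Each arm keeps a conjugate
    Gamma(a, b) posterior over its rate; the allocation index below is a
    crude optimistic stand-in for the Gittins Index.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    a = [1.0] * n_arms  # Gamma shape parameter per arm
    b = [1.0] * n_arms  # Gamma rate parameter per arm
    n = [0] * n_arms    # allocation counts
    total_reward = 0.0
    for t in range(1, horizon + 1):
        # b/a is a plug-in estimate of the mean reward 1/rate;
        # add a UCB-style bonus so under-sampled arms get explored.
        index = [b[k] / a[k] + math.sqrt(2 * math.log(t) / max(n[k], 1))
                 for k in range(n_arms)]
        k = max(range(n_arms), key=lambda i: index[i])
        reward = rng.expovariate(true_rates[k])
        a[k] += 1.0       # conjugate Gamma update for exponential data
        b[k] += reward
        n[k] += 1
        total_reward += reward
    return n, total_reward
```

With arm rates (1.0, 0.5), i.e. mean rewards 1 and 2, the adaptive design allocates most participants to the superior second arm, which is the "earning" advantage the abstract contrasts with fixed non-adaptive designs.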
Related papers
- Optimizing Adaptive Experiments: A Unified Approach to Regret
Minimization and Best-Arm Identification [10.66863856524397]
This paper proposes a unified model that accounts for both within-experiment performance and post-experiment outcomes.
We then provide a theory of optimal performance in large populations that unifies canonical results in the literature.
arXiv Detail & Related papers (2024-02-16T11:27:48Z)
- Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables.
In this paper we take the initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z)
- Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
- UGAE: A Novel Approach to Non-exponential Discounting [9.358303424584902]
Non-exponential discounting methods that align with human behavior are often desirable for creating human-like agents.
We propose Universal Generalized Advantage Estimation (UGAE) which allows for the computation of GAE advantage values with arbitrary discounting.
We show experimentally that agents with non-exponential discounting trained via UGAE outperform variants trained with Monte Carlo advantage estimation.
arXiv Detail & Related papers (2023-02-11T16:41:05Z)
- Advancing Model Pruning via Bi-level Optimization [89.88761425199598]
Iterative magnitude pruning (IMP) is the predominant pruning method for successfully finding 'winning tickets'.
One-shot pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP's.
We show that the proposed bi-level optimization-oriented pruning method (termed BiP) is a special class of BLO problems with a bi-linear problem structure.
arXiv Detail & Related papers (2022-10-08T19:19:29Z)
- Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
arXiv Detail & Related papers (2022-10-07T02:12:34Z)
- Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser trading-off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z)
- Test-time Batch Normalization [61.292862024903584]
Deep neural networks often suffer from data distribution shift between training and testing.
We revisit the batch normalization (BN) in the training process and reveal two key insights benefiting test-time optimization.
We propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing Entropy loss.
arXiv Detail & Related papers (2022-05-20T14:33:39Z)
- Challenges in Statistical Analysis of Data Collected by a Bandit Algorithm: An Empirical Exploration in Applications to Adaptively Randomized Experiments [11.464963616709671]
Multi-armed bandit algorithms have been argued for decades as useful for adaptively randomized experiments.
We applied the bandit algorithm Thompson Sampling (TS) to run adaptive experiments in three university classes.
We show that collecting data with TS can as much as double the False Positive Rate (FPR) and the False Negative Rate (FNR).
arXiv Detail & Related papers (2021-03-22T22:05:18Z)
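The Thompson Sampling algorithm used in the classroom experiments above can be sketched generically for Bernoulli rewards. This is a minimal textbook version, not the paper's specific experimental setup; the arm probabilities and horizon below are illustrative assumptions.

```python
import random

def thompson_sampling(true_probs, horizon=1000, seed=0):
    """Generic Thompson Sampling for Bernoulli-reward arms.

    Each arm keeps a Beta(successes + 1, failures + 1) posterior; at
    every step we draw one sample from each posterior and play the arm
    with the largest draw.  Returns per-arm allocation counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(horizon):
        # Posterior sampling: one Beta draw per arm.
        draws = [rng.betavariate(wins[k] + 1, losses[k] + 1)
                 for k in range(n_arms)]
        k = max(range(n_arms), key=lambda i: draws[i])
        # Simulate a Bernoulli reward from the chosen arm.
        if rng.random() < true_probs[k]:
            wins[k] += 1
        else:
            losses[k] += 1
    return [wins[k] + losses[k] for k in range(n_arms)]
```

Because allocation concentrates on the apparently better arm, the two treatment groups end up with very unequal sample sizes, which is exactly what complicates the standard hypothesis tests (inflating FPR and FNR) discussed in the paper.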
This list is automatically generated from the titles and abstracts of the papers in this site.