On the Optimality of Tracking Fisher Information in Adaptive Testing with Stochastic Binary Responses
- URL: http://arxiv.org/abs/2510.07862v1
- Date: Thu, 09 Oct 2025 07:10:00 GMT
- Title: On the Optimality of Tracking Fisher Information in Adaptive Testing with Stochastic Binary Responses
- Authors: Sanghwa Kim, Dohyun Ahn, Seungki Min
- Abstract summary: We study the problem of estimating a continuous ability parameter from sequential binary responses. We propose a simple algorithm that adaptively selects questions to maximize Fisher information. We prove that this Fisher-tracking strategy achieves optimal performance in both fixed-confidence and fixed-budget regimes.
- Score: 3.491999371287298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of estimating a continuous ability parameter from sequential binary responses by actively asking questions with varying difficulties, a setting that arises naturally in adaptive testing and online preference learning. Our goal is to certify that the estimate lies within a desired margin of error, using as few queries as possible. We propose a simple algorithm that adaptively selects questions to maximize Fisher information and updates the estimate using a method-of-moments approach, paired with a novel test statistic to decide when the estimate is accurate enough. We prove that this Fisher-tracking strategy achieves optimal performance in both fixed-confidence and fixed-budget regimes, which are commonly investigated in the best-arm identification literature. Our analysis overcomes a key technical challenge in the fixed-budget setting -- handling the dependence between the evolving estimate and the query distribution -- by exploiting a structural symmetry in the model and combining large deviation tools with Ville's inequality. Our results provide rigorous theoretical support for simple and efficient adaptive testing procedures.
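The query-selection and stopping loop described in the abstract can be sketched under a standard logistic response model. Everything below is an illustrative assumption rather than the paper's exact construction: the `respond` interface, the one-Newton-step update (standing in for the method-of-moments update), the Wald-style stopping rule (standing in for the paper's test statistic), and the clipping range are all choices made for the sketch.

```python
import math
import random

def fisher_tracking_estimate(respond, difficulties, margin=0.3,
                             z=1.96, max_queries=2000, theta0=0.0):
    """Estimate an ability parameter theta by tracking Fisher information.

    respond(b) -> 0/1 is a stochastic binary answer to a question of
    difficulty b, assumed here to follow a logistic model
    P(correct) = 1 / (1 + exp(-(theta - b))).
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    theta, history = theta0, []
    for t in range(1, max_queries + 1):
        # The Fisher information of a logistic item is p(1 - p), which is
        # maximized at p = 1/2, i.e. when difficulty matches the estimate.
        b = min(difficulties, key=lambda d: abs(d - theta))
        history.append((b, respond(b)))
        # One Newton step on the log-likelihood over the full history
        # (a stand-in for the paper's method-of-moments update).
        grad = sum(y - sigmoid(theta - d) for d, y in history)
        info = sum(sigmoid(theta - d) * (1.0 - sigmoid(theta - d))
                   for d, _ in history)
        theta = max(-5.0, min(5.0, theta + grad / info))
        # Stop when the asymptotic half-width z / sqrt(I_n) drops below the
        # target margin (a Wald-style stand-in for the paper's test statistic).
        if z / math.sqrt(info) < margin:
            break
    return theta, t

# Simulated respondent with true ability 1.0 on a bounded difficulty grid.
rng = random.Random(0)
true_theta = 1.0
grid = [i / 2.0 for i in range(-6, 7)]  # difficulties -3.0 .. 3.0
est, n_used = fisher_tracking_estimate(
    lambda b: int(rng.random() < 1.0 / (1.0 + math.exp(-(true_theta - b)))),
    grid)
```

Picking the difficulty nearest the current estimate is exactly the Fisher-maximizing choice for the logistic model, since item information p(1-p) peaks at p = 1/2.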
Related papers
- Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems [54.916243942641444]
Large language models (LLMs) are emerging as key enablers of automation in domains such as telecommunications. We study an edge-cloud-expert cascaded LLM-based knowledge system that supports decision-making through a question-and-answer pipeline.
arXiv Detail & Related papers (2025-12-23T03:10:09Z)
- Machine learning to optimize precision in the analysis of randomized trials: A journey in pre-specified, yet data-adaptive learning [2.6827221447298406]
We tell our story of developing, evaluating, and implementing a machine learning-based approach for covariate adjustment. We provide the rationale for, as well as the practical concerns with, such an approach for estimating marginal effects. We present the results from applying our approach in the primary, pre-specified analysis of 8 recently published trials.
arXiv Detail & Related papers (2025-12-15T18:05:45Z)
- Statistical Inference for Misspecified Contextual Bandits [6.178061357164435]
Contextual bandit algorithms have transformed modern experimentation by enabling real-time adaptation for personalized treatment. Yet these advantages create challenges for statistical inference due to adaptivity. Convergence ensures replicability of adaptive experiments and stability of online algorithms.
arXiv Detail & Related papers (2025-09-08T02:19:37Z)
- Semiparametric Counterfactual Regression [2.356908851188234]
We propose a doubly robust-style estimator for counterfactual regression within a generalizable framework. Our approach uses incremental interventions to enhance adaptability while maintaining compatibility with standard methods. Our analysis shows that the proposed estimators can achieve $\sqrt{n}$-consistency and normality for a broad class of problems.
arXiv Detail & Related papers (2025-04-03T15:32:26Z)
- Optimal Adaptive Experimental Design for Estimating Treatment Effect [14.088972921434761]
This paper addresses the fundamental question of determining the optimal accuracy in estimating the treatment effect.
By incorporating doubly robust estimation into sequential experimental design, we frame the optimal estimation problem as an online bandit learning problem.
Using tools and ideas from both bandit algorithm design and adaptive statistical estimation, we propose a general low-switching adaptive experiment framework.
arXiv Detail & Related papers (2024-10-07T23:22:51Z)
- Pattern based learning and optimisation through pricing for bin packing problem [50.83768979636913]
We argue that when problem conditions such as the distributions of random variables change, the patterns that performed well in previous circumstances may become less effective.
We propose a novel scheme to efficiently identify patterns and dynamically quantify their values for each specific condition.
Our method quantifies the value of patterns based on their ability to satisfy constraints and their effects on the objective value.
arXiv Detail & Related papers (2024-08-27T17:03:48Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation [1.1530723302736279]
We study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data.
The goal is to design experiment selection rules for more accurate model estimation.
We propose a class of greedy experiment selection methods and provide statistical analysis for the maximum likelihood estimator.
arXiv Detail & Related papers (2024-02-13T17:09:29Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
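A likelihood-ratio confidence sequence of this flavor can be sketched for the simplest well-specified case, a Bernoulli mean. The sketch below uses a predictable plug-in alternative so that each ratio term is a fair bet and Ville's inequality applies; the grid discretization, the smoothing constant, and the once-rejected-stays-rejected bookkeeping are illustrative choices, not the paper's construction.

```python
import math
import random

def lr_confidence_sequence(xs, alpha=0.05, grid_size=200):
    """Anytime-valid confidence sequence for a Bernoulli mean built from a
    likelihood-ratio test martingale and Ville's inequality.

    The alternative uses a predictable plug-in estimate (a smoothed running
    mean), so each ratio term has conditional expectation 1 under the null;
    Ville's inequality then bounds the crossing probability by alpha.
    """
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]
    log_m = [0.0] * grid_size        # running log martingale per candidate p0
    rejected = [False] * grid_size   # once rejected, a candidate stays out
    threshold = math.log(1.0 / alpha)
    s = n = 0
    for x in xs:
        q = (s + 0.5) / (n + 1.0)    # predictable: uses only past data
        s, n = s + x, n + 1
        for k, p0 in enumerate(grid):
            num = q if x == 1 else 1.0 - q
            den = p0 if x == 1 else 1.0 - p0
            log_m[k] += math.log(num) - math.log(den)
            if log_m[k] >= threshold:
                rejected[k] = True
    kept = [p0 for k, p0 in enumerate(grid) if not rejected[k]]
    return (min(kept), max(kept)) if kept else (None, None)

# 500 draws from Bernoulli(0.6); validity holds at every sample size.
rng = random.Random(3)
lo, hi = lr_confidence_sequence(
    [int(rng.random() < 0.6) for _ in range(500)])
```

Because the plug-in forecast is computed before each observation arrives, the product of ratios is a nonnegative martingale with mean 1 under any candidate null, which is what makes the set valid uniformly over time rather than at a single fixed sample size.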
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
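One plausible instantiation of this idea is a split-conformal procedure whose nonconformity score is rescaled by an auxiliary error signal. Everything below is an illustrative assumption rather than the paper's exact method: the `aux_err` interface, the `1 + aux_err(x)` scaling form, and the synthetic heteroscedastic data are all choices made for the sketch.

```python
import math
import random

def split_conformal_interval(cal_pairs, aux_err, predict, x_new, alpha=0.1):
    """Split-conformal interval whose nonconformity score is rescaled by an
    auxiliary error signal (standing in for the self-supervised error).

    cal_pairs: (x, y) calibration points held out from model training.
    aux_err(x): nonnegative auxiliary difficulty estimate for input x.
    predict(x): point prediction of the underlying model.
    """
    scores = sorted(abs(y - predict(x)) / (1.0 + aux_err(x))
                    for x, y in cal_pairs)
    n = len(scores)
    # Finite-sample quantile: the ceil((n+1)(1-alpha))-th order statistic.
    k = min(n - 1, math.ceil((n + 1) * (1.0 - alpha)) - 1)
    half = scores[k] * (1.0 + aux_err(x_new))  # wider where aux error is high
    yhat = predict(x_new)
    return yhat - half, yhat + half

# Synthetic heteroscedastic data: noise grows with |x|, and the auxiliary
# signal |x| happens to track that difficulty.
rng = random.Random(1)

def sample(m):
    xs = [rng.uniform(-2.0, 2.0) for _ in range(m)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.5) * (1.0 + abs(x))) for x in xs]

cal = sample(300)
lo, hi = split_conformal_interval(cal, abs, lambda x: 2.0 * x, x_new=1.5)
```

When the auxiliary signal tracks the true noise level, the normalized scores become nearly exchangeable across inputs, so the intervals adapt their width to local difficulty while retaining the usual marginal coverage guarantee.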
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- Online Statistical Inference in Decision-Making with Matrix Context [5.2071564436846245]
We propose an online procedure to conduct statistical inference with adaptively collected data. Standard low-rank estimators are biased and cannot be obtained in a sequential manner. Existing approaches in sequential decision-making algorithms fail to account for the low-rankness and are also biased.
arXiv Detail & Related papers (2022-12-21T22:03:06Z)
- Post-Contextual-Bandit Inference [57.88785630755165]
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking.
They can both improve outcomes for study participants and increase the chance of identifying good or even best policies.
To support credible inference on novel interventions at the end of the study, we still want to construct valid confidence intervals on average treatment effects, subgroup effects, or value of new policies.
arXiv Detail & Related papers (2021-06-01T12:01:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.