Adaptive Off-Policy Inference for M-Estimators Under Model Misspecification
- URL: http://arxiv.org/abs/2509.14218v1
- Date: Wed, 17 Sep 2025 17:51:40 GMT
- Title: Adaptive Off-Policy Inference for M-Estimators Under Model Misspecification
- Authors: James Leiner, Robin Dunn, Aaditya Ramdas,
- Abstract summary: We propose a method that provides valid inference for M-estimators that use adaptively collected bandit data.<n>A key ingredient in our approach is the use of flexible machine learning approaches to stabilize the variance induced by adaptive data collection.
- Score: 32.7750904494144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When data are collected adaptively, such as in bandit algorithms, classical statistical approaches such as ordinary least squares and $M$-estimation will often fail to achieve asymptotic normality. Although recent lines of work have modified the classical approaches to ensure valid inference on adaptively collected data, most of these works assume that the model is correctly specified. We propose a method that provides valid inference for M-estimators that use adaptively collected bandit data with a (possibly) misspecified working model. A key ingredient in our approach is the use of flexible machine learning approaches to stabilize the variance induced by adaptive data collection. A major novelty is that our procedure enables the construction of valid confidence sets even in settings where treatment policies are unstable and non-converging, such as when there is no unique optimal arm and standard bandit algorithms are used. Empirical results on semi-synthetic datasets constructed from the Osteoarthritis Initiative demonstrate that the method maintains type I error control, while existing methods for inference in adaptive settings do not cover in the misspecified case.
Related papers
- DANCE: Doubly Adaptive Neighborhood Conformal Estimation [12.643121779828526]
We propose a doubly locally adaptive nearest-neighbor based conformal algorithm combining two novel nonconformity scores directly using the data's embedded representation.<n>We test against state-of-the-art local, task-adapted and zero-shot conformal baselines, demonstrating DANCE's superior blend of set size efficiency and robustness across various datasets.
arXiv Detail & Related papers (2026-02-24T07:54:53Z) - Statistical Inference for Misspecified Contextual Bandits [6.178061357164435]
Contextual bandit algorithms have transformed modern experimentation by enabling real-time adaptation for personalized treatment.<n>Yet these advantages create challenges for statistical inference due to adaptivity.<n> Convergence ensures replicability of adaptive experiments and stability of online algorithms.
arXiv Detail & Related papers (2025-09-08T02:19:37Z) - Conformal and kNN Predictive Uncertainty Quantification Algorithms in Metric Spaces [3.637162892228131]
We develop a conformal prediction algorithm that offers finite-sample coverage guarantees and fast convergence rates of the oracle estimator.<n>In heteroscedastic settings, we forgo these non-asymptotic guarantees to gain statistical efficiency.<n>We demonstrate the practical utility of our approach in personalized--medicine applications involving random response objects.
arXiv Detail & Related papers (2025-07-21T15:54:13Z) - Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise.<n>We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z) - Adaptive Conformal Inference by Betting [51.272991377903274]
We consider the problem of adaptive conformal inference without any assumptions about the data generating process.<n>Existing approaches for adaptive conformal inference are based on optimizing the pinball loss using variants of online gradient descent.<n>We propose a different approach for adaptive conformal inference that leverages parameter-free online convex optimization techniques.
arXiv Detail & Related papers (2024-12-26T18:42:08Z) - Multi-model Ensemble Conformal Prediction in Dynamic Environments [14.188004615463742]
We introduce a novel adaptive conformal prediction framework, where the model used for creating prediction sets is selected on the fly from multiple candidate models.
The proposed algorithm is proven to achieve strongly adaptive regret over all intervals while maintaining valid coverage.
arXiv Detail & Related papers (2024-11-06T05:57:28Z) - Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Statistical Inference After Adaptive Sampling for Longitudinal Data [9.468593929311867]
We develop novel methods to perform a variety of statistical analyses on adaptively sampled data via Z-estimation.
We develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest.
arXiv Detail & Related papers (2022-02-14T23:48:13Z) - Post-Contextual-Bandit Inference [57.88785630755165]
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking.
They can both improve outcomes for study participants and increase the chance of identifying good or even best policies.
To support credible inference on novel interventions at the end of the study, we still want to construct valid confidence intervals on average treatment effects, subgroup effects, or value of new policies.
arXiv Detail & Related papers (2021-06-01T12:01:51Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep
Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.