Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
- URL: http://arxiv.org/abs/2411.14341v1
- Date: Thu, 21 Nov 2024 17:38:49 GMT
- Title: Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
- Authors: Ojash Neopane, Aaditya Ramdas, Aarti Singh
- Abstract summary: Estimation of the Average Treatment Effect (ATE) is a core problem in causal inference with strong connections to Off-Policy Evaluation in Reinforcement Learning.
This paper considers the problem of adaptively selecting the treatment allocation probability in order to improve estimation of the ATE.
- Score: 36.25361703897723
- License:
- Abstract: Estimation of the Average Treatment Effect (ATE) is a core problem in causal inference with strong connections to Off-Policy Evaluation in Reinforcement Learning. This paper considers the problem of adaptively selecting the treatment allocation probability in order to improve estimation of the ATE. Most prior work on adaptive ATE estimation focuses on asymptotic guarantees and, in turn, overlooks important practical considerations such as the difficulty of learning the optimal treatment allocation and hyperparameter selection. Existing non-asymptotic methods are limited by poor empirical performance and exponential scaling of the Neyman regret with respect to problem parameters. To address these gaps, we propose and analyze the Clipped Second Moment Tracking (ClipSMT) algorithm, a variant of an existing algorithm with strong asymptotic optimality guarantees, and provide finite-sample bounds on its Neyman regret. Our analysis shows that ClipSMT achieves exponential improvements in Neyman regret on two fronts: improving the dependence on $T$ from $O(\sqrt{T})$ to $O(\log T)$, and reducing the exponential dependence on problem parameters to a polynomial dependence. Finally, we conclude with simulations that show the marked improvement of ClipSMT over existing approaches.
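The mechanism the abstract describes is easy to sketch: track plug-in estimates of the square-root second moments of outcomes under each arm, allocate treatment with the resulting Neyman probability, and clip that probability away from 0 and 1 with a shrinking schedule. Below is a minimal Python sketch of this idea; the function names, the IPW estimator, and the clipping schedule `delta_t` are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def clipsmt_ate(sample_outcome, T, alpha=0.5, rng=None):
    """Hedged sketch of Clipped Second Moment Tracking (ClipSMT) for ATE
    estimation. sample_outcome(arm) draws an outcome Y(arm), arm in {0, 1}.
    The allocation probability tracks the plug-in Neyman allocation
    pi* = s1 / (s1 + s0), where s_k = sqrt(E[Y(k)^2]), clipped to
    [delta_t, 1 - delta_t] with an assumed shrinking schedule."""
    rng = np.random.default_rng(rng)
    sq_sums = np.zeros(2)   # running sums of Y^2 per arm
    counts = np.zeros(2)    # pulls per arm
    ipw_terms = []
    for t in range(1, T + 1):
        s = np.sqrt(sq_sums / np.maximum(counts, 1.0))  # plug-in sqrt second moments
        pi = 0.5 if s.sum() == 0 else s[1] / s.sum()    # Neyman allocation estimate
        delta = 0.5 * t ** (-alpha)                     # assumed clipping schedule
        pi = float(np.clip(pi, delta, 1 - delta))
        arm = int(rng.random() < pi)                    # treat with probability pi
        y = sample_outcome(arm)
        sq_sums[arm] += y ** 2
        counts[arm] += 1
        # Inverse-propensity-weighted contribution to the ATE estimate.
        ipw_terms.append(y / pi if arm == 1 else -y / (1 - pi))
    return float(np.mean(ipw_terms))
```

For instance, `clipsmt_ate(lambda a: np.random.normal(1.0 if a else 0.0, 1.0 + a), T=10_000)` should return roughly 1.0 while allocating more samples to the higher-variance treated arm.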
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO)
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
- Online Learning Approach for Survival Analysis [1.0499611180329806]
We introduce an online mathematical framework for survival analysis, allowing real time adaptation to dynamic environments and censored data.
This framework enables the estimation of event time distributions through an optimal second-order online convex optimization algorithm, the Online Newton Step (ONS).
arXiv Detail & Related papers (2024-02-07T08:15:30Z)
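Since the entry above centers on the Online Newton Step, here is a minimal sketch of the generic ONS update under assumed inputs; it is not the paper's survival-analysis instantiation, and the projection onto the feasible set that full ONS performs is omitted for brevity.

```python
import numpy as np

def online_newton_step(grad_fn, dim, T, gamma=1.0, eps=1.0):
    """Sketch of the Online Newton Step (ONS) update.
    grad_fn(t, theta) returns the gradient of the round-t loss at theta;
    gamma and eps are assumed hyperparameters."""
    theta = np.zeros(dim)
    A = eps * np.eye(dim)            # regularized second-order matrix
    for t in range(T):
        g = grad_fn(t, theta)
        A += np.outer(g, g)          # rank-one accumulation of gradient outer products
        theta -= (1.0 / gamma) * np.linalg.solve(A, g)  # Newton-style step
    return theta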
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), which can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
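The uncertainty Bellman equation mentioned above has a simple tabular form: propagated value uncertainty satisfies a Bellman-style recursion with discount gamma squared. A hedged sketch, with the local-uncertainty term and the tabular setting assumed purely for illustration:

```python
import numpy as np

def solve_ube(P_pi, local_uncertainty, gamma):
    """Sketch of a tabular uncertainty Bellman equation (UBE) solve:
    U(s) = u(s) + gamma^2 * sum_s' P_pi[s, s'] * U(s').
    P_pi: (S, S) state transition matrix under the evaluated policy;
    local_uncertainty: per-state term u(s), whose exact (model-based)
    form in the paper differs from this illustration."""
    S = P_pi.shape[0]
    # Solve the linear system (I - gamma^2 * P_pi) U = u directly.
    return np.linalg.solve(np.eye(S) - gamma ** 2 * P_pi, local_uncertainty)
```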
- Parameter-Agnostic Optimization under Relaxed Smoothness [25.608968462899316]
We show that Normalized Stochastic Gradient Descent with Momentum (NSGD-M) can achieve a rate-optimal complexity without prior knowledge of any problem parameter.
In deterministic settings, the exponential factor can be neutralized by employing Gradient Descent with a Backtracking Line Search.
arXiv Detail & Related papers (2023-11-06T16:39:53Z)
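NSGD-M's parameter-agnostic behavior comes from normalizing the update direction, so step lengths do not depend on unknown smoothness constants. A minimal sketch with assumed hyperparameters:

```python
import numpy as np

def nsgd_m(grad_fn, x0, T, lr=0.01, beta=0.9):
    """Sketch of Normalized SGD with Momentum (NSGD-M).
    grad_fn(x) returns a stochastic gradient at x; lr and beta are
    illustrative choices, not the paper's tuned values."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    for _ in range(T):
        g = grad_fn(x)
        m = beta * m + (1 - beta) * g                     # momentum buffer
        x = x - lr * m / (np.linalg.norm(m) + 1e-12)      # unit-length normalized step
    return x
```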
- Inference on Optimal Dynamic Policies via Softmax Approximation [27.396891119011215]
We show that a simple softmax approximation to the optimal treatment regime can achieve valid inference on the truly optimal regime.
Our work combines techniques from semi-parametric inference and $g$-estimation, together with an appropriate triangular array central limit theorem.
arXiv Detail & Related papers (2023-03-08T07:42:47Z)
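The softmax idea above replaces the non-smooth argmax treatment rule with a temperature-controlled softmax, making the regime's value a smooth functional. A toy sketch, assuming estimated outcome values `q_values` and a fixed illustrative temperature (the paper couples the temperature to the sample size and pairs the plug-in with semi-parametric corrections):

```python
import numpy as np

def softmax_policy_value(q_values, beta=10.0):
    """Sketch of softmax smoothing of an optimal treatment regime.
    q_values: (n, K) estimated outcomes for n units under K treatments;
    beta: assumed temperature controlling how close the softmax policy
    is to the hard argmax rule."""
    z = beta * q_values
    z -= z.max(axis=1, keepdims=True)                       # numerical stability
    pi = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)   # softmax policy
    return float(np.mean((pi * q_values).sum(axis=1)))      # smoothed regime value
```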
- Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
- META-STORM: Generalized Fully-Adaptive Variance Reduced SGD for Unbounded Functions [23.746620619512573]
Recent work overcomes the need to compute gradients of "megabatches" at the start of each epoch.
The proposed fully-adaptive algorithms perform competitively on deep learning tasks.
arXiv Detail & Related papers (2022-09-29T15:12:54Z)
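For context on the "megabatch" remark above: STORM-style estimators, the family META-STORM generalizes, apply a recursive correction using the same fresh sample at consecutive iterates, so no large batch gradients are ever required. A sketch of the standard STORM recursion with a fixed correction weight (META-STORM itself adapts this weight; the form below is an assumption about the family, not the paper's algorithm):

```python
import numpy as np

def storm_style_sgd(grad_fn, x0, T, lr=0.01, a=0.1):
    """Sketch of a STORM-style recursive variance-reduced gradient estimator.
    grad_fn(x, seed) returns a stochastic gradient at x for a given sample;
    lr and a are assumed constants."""
    x = np.asarray(x0, dtype=float).copy()
    d = grad_fn(x, 0)                        # initial gradient estimator
    for t in range(1, T):
        x_new = x - lr * d
        g_new = grad_fn(x_new, t)            # the same sample (seed t) ...
        g_old = grad_fn(x, t)                # ... evaluated at both iterates
        d = g_new + (1 - a) * (d - g_old)    # recursive variance-reduction update
        x = x_new
    return x
```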
- AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization [14.531550983885772]
We propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all associated statistics.
This provides a unified treatment of the optimization process, offering a comprehensive framework under the statistical model of the t-distribution for the first time.
arXiv Detail & Related papers (2022-01-18T03:13:19Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
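To make the "abandoning Metropolis-Hastings" remark concrete, here is a hedged sketch of AIS with unadjusted Langevin transitions; the annealing path, step size, and interfaces are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def ais_langevin(log_p0, grad_log_p0, log_p1, grad_log_p1, x0,
                 n_steps=100, step=1e-2, rng=None):
    """Sketch of annealed importance sampling with unadjusted Langevin
    transitions in place of Metropolis-Hastings corrections (a simplified,
    assumed reading of the differentiable variant above). Anneals along
    log p_b = (1 - b) * log p0 + b * log p1 from b = 0 to b = 1."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    log_w = 0.0
    betas = np.linspace(0.0, 1.0, n_steps + 1)

    def log_pb(y, b):
        return (1 - b) * log_p0(y) + b * log_p1(y)

    def grad_pb(y, b):
        return (1 - b) * grad_log_p0(y) + b * grad_log_p1(y)

    for b_prev, b in zip(betas[:-1], betas[1:]):
        log_w += log_pb(x, b) - log_pb(x, b_prev)   # accumulate importance weight
        noise = rng.standard_normal(x.shape)
        x = x + step * grad_pb(x, b) + np.sqrt(2 * step) * noise  # ULA transition
    return x, log_w   # exp(log_w) estimates the ratio of normalizing constants
```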
- ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel algorithm, referred to as Recursive One-Over-T SGD (ROOT-SGD), based on an easily implementable, recursive averaging of past stochastic gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an asymptotic sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis [102.29671176698373]
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$ error under a generative model.
We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms.
arXiv Detail & Related papers (2020-03-16T17:15:28Z)
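For reference, the algorithm whose instance-dependent optimality that entry studies is plain TD(0) policy evaluation; a minimal tabular sketch follows, with the environment interface assumed for illustration.

```python
import numpy as np

def td0_policy_evaluation(env_step, n_states, T, gamma=0.9, lr=0.1, s0=0):
    """Sketch of tabular TD(0) policy evaluation under a fixed policy.
    env_step(s) returns (reward, next_state) when following the evaluated
    policy from state s; gamma and lr are assumed constants."""
    V = np.zeros(n_states)
    s = s0
    for _ in range(T):
        r, s_next = env_step(s)
        V[s] += lr * (r + gamma * V[s_next] - V[s])  # TD(0) update toward the bootstrap target
        s = s_next
    return V
```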
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.