Automatic Doubly Robust Forests
- URL: http://arxiv.org/abs/2412.07184v2
- Date: Sun, 08 Jun 2025 05:30:17 GMT
- Title: Automatic Doubly Robust Forests
- Authors: Zhaomeng Chen, Junting Duan, Victor Chernozhukov, Vasilis Syrgkanis
- Abstract summary: This paper proposes the automatic Doubly Robust Random Forest (DRRF) algorithm for estimating the conditional expectation of a moment functional in the presence of high-dimensional nuisance functions. DRRF does not require prior knowledge of the form of the debiasing term or impose restrictive parametric assumptions on the target quantity. We demonstrate the superior performance of DRRF over benchmark approaches in terms of estimation accuracy, robustness, and computational efficiency.
- Score: 18.700557484394544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes the automatic Doubly Robust Random Forest (DRRF) algorithm for estimating the conditional expectation of a moment functional in the presence of high-dimensional nuisance functions. DRRF extends the automatic debiasing framework based on the Riesz representer to the conditional setting and enables nonparametric, forest-based estimation (Athey et al., 2019; Oprescu et al., 2019). In contrast to existing methods, DRRF does not require prior knowledge of the form of the debiasing term or impose restrictive parametric or semi-parametric assumptions on the target quantity. Additionally, it is computationally efficient in making predictions at multiple query points. We establish consistency and asymptotic normality results for the DRRF estimator under general assumptions, allowing for the construction of valid confidence intervals. Through extensive simulations in heterogeneous treatment effect (HTE) estimation, we demonstrate the superior performance of DRRF over benchmark approaches in terms of estimation accuracy, robustness, and computational efficiency.
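The doubly robust idea behind the abstract can be illustrated with a minimal sketch. It is not the DRRF algorithm: DRRF learns the Riesz representer automatically and localizes the estimate with forest weights, whereas this example assumes the textbook closed-form representer for the average treatment effect, a known propensity score, and a plain global average. All function and variable names here are hypothetical and chosen for illustration only.

```python
import numpy as np

def riesz_ate(t, e):
    """Closed-form Riesz representer for the ATE functional
    m(Z; g) = g(1, X) - g(0, X), given propensity e(X) = P(T=1 | X)."""
    return t / e - (1 - t) / (1 - e)

def dr_score(y, t, g1, g0, e):
    """Doubly robust score: m(Z; g) + alpha(T, X) * (Y - g(T, X))."""
    g_t = np.where(t == 1, g1, g0)  # outcome-model prediction at the observed arm
    return (g1 - g0) + riesz_ate(t, e) * (y - g_t)

# Simulated data (assumed design): tau(x) = 1 + x, so the true ATE is 1.5.
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(0.0, 1.0, size=n)
e = np.full(n, 0.5)                      # known, constant propensity
t = rng.binomial(1, e)
y = x + t * (1.0 + x) + rng.normal(size=n)

# Case 1: correct outcome model and correct propensity.
est_true = dr_score(y, t, g1=1.0 + 2.0 * x, g0=x, e=e).mean()

# Case 2: deliberately wrong outcome model (all zeros), correct propensity.
# The correction term alpha * (Y - g) still recovers the target on average.
zeros = np.zeros(n)
est_misspecified = dr_score(y, t, g1=zeros, g0=zeros, e=e).mean()

print(f"correct nuisances:      {est_true:.3f}")
print(f"misspecified outcome:   {est_misspecified:.3f}")
```

Both averages should land close to 1.5, which is the sense in which the score is robust to misspecifying one of the two nuisance functions. Replacing the global mean with a locally weighted mean around a query point is the step that a forest-based method would supply.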
Related papers
- COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question. COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate. We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
arXiv Detail & Related papers (2025-06-25T07:04:49Z) - Principled Input-Output-Conditioned Post-Hoc Uncertainty Estimation for Regression Networks [1.4671424999873808]
Uncertainty is critical in safety-sensitive applications but is often omitted from off-the-shelf neural networks due to adverse effects on predictive performance. We propose a theoretically grounded framework for post-hoc uncertainty estimation in regression tasks by fitting an auxiliary model to both original inputs and frozen model outputs.
arXiv Detail & Related papers (2025-06-01T09:13:27Z) - RF-BayesPhysNet: A Bayesian rPPG Uncertainty Estimation Method for Complex Scenarios [5.349703489635052]
Remote photoplethysmography (rPPG) technology infers heart rate by capturing subtle color changes in facial skin using a camera.
Measurement accuracy decreases significantly in complex scenarios.
Deep learning models often neglect measurement uncertainty, limiting their credibility in dynamic scenes.
arXiv Detail & Related papers (2025-04-04T20:24:57Z) - Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model [4.0148499400442095]
This paper deals with the identification of the Ornstein-Uhlenbeck (OU) process error model.
We put forth a sample-efficient global optimization approach based on the Bayesian optimization framework.
arXiv Detail & Related papers (2025-03-09T01:38:21Z) - Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks [11.218066045459778]
This paper addresses the problems of conditional variance estimation and confidence interval construction in nonparametric regression using dense networks with the Rectified Linear Unit (ReLU) activation function. We present a residual-based framework for conditional variance estimation, deriving nonasymptotic bounds for variance estimation under both heteroscedastic and homoscedastic settings. We develop a ReLU network based robust bootstrap procedure for constructing confidence intervals for the true mean that comes with a theoretical guarantee on the coverage, providing a significant advancement in uncertainty quantification and the construction of reliable confidence intervals in deep learning settings.
arXiv Detail & Related papers (2024-12-29T05:17:58Z) - Semiparametric inference for impulse response functions using double/debiased machine learning [49.1574468325115]
We introduce a machine learning estimator for the impulse response function (IRF) in settings where a time series of interest is subjected to multiple discrete treatments.
The proposed estimator can rely on fully nonparametric relations between treatment and outcome variables, opening up the possibility to use flexible machine learning approaches to estimate IRFs.
arXiv Detail & Related papers (2024-11-15T07:42:02Z) - Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Mitigating LLM Hallucinations via Conformal Abstention [70.83870602967625]
We develop a principled procedure for determining when a large language model should abstain from responding in a general domain.
We leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate).
Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets.
arXiv Detail & Related papers (2024-04-04T11:32:03Z) - Doubly Robust Proximal Causal Learning for Continuous Treatments [56.05592840537398]
We propose a kernel-based doubly robust causal learning estimator for continuous treatments.
We show that its oracle form is a consistent approximation of the influence function.
We then provide a comprehensive convergence analysis in terms of the mean square error.
arXiv Detail & Related papers (2023-09-22T12:18:53Z) - Generalized Random Forests using Fixed-Point Trees [2.5944208050492183]
We propose a computationally efficient alternative to generalized random forests (GRFs; arXiv:1610.01271) for estimating heterogeneous effects in large dimensions.
While GRFs rely on a gradient-based splitting criterion, our method introduces a fixed-point approximation that eliminates the need for Jacobian estimation.
Our findings suggest that the proposed method is a scalable alternative for localized effect estimation in machine learning and causal inference applications.
arXiv Detail & Related papers (2023-06-20T21:45:35Z) - Semi-Parametric Inference for Doubly Stochastic Spatial Point Processes: An Approximate Penalized Poisson Likelihood Approach [3.085995273374333]
Doubly-stochastic point processes model the occurrence of events over a spatial domain as an inhomogeneous process conditioned on the realization of a random intensity function.
Existing implementations of doubly-stochastic spatial models are computationally demanding, often have limited theoretical guarantee, and/or rely on restrictive assumptions.
arXiv Detail & Related papers (2023-06-11T19:48:39Z) - Adaptive Conformal Prediction by Reweighting Nonconformity Score [0.0]
We use a Quantile Regression Forest (QRF) to learn the distribution of nonconformity scores and utilize the QRF's weights to assign more importance to samples with residuals similar to the test point.
Our approach enjoys an assumption-free finite sample marginal and training-conditional coverage, and under suitable assumptions, it also ensures conditional coverage.
arXiv Detail & Related papers (2023-03-22T16:42:19Z) - Confidence and Uncertainty Assessment for Distributional Random Forests [1.2767281330110625]
The Distributional Random Forest (DRF) is a recently introduced Random Forest to estimate conditional distributions.
It can be employed to estimate a wide range of targets such as conditional average treatment effects, conditional quantiles, and conditional correlations.
We characterize the algorithm of DRF and develop a bootstrap approximation of it.
arXiv Detail & Related papers (2023-02-11T19:10:01Z) - Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Only minimal prior assumptions on the parameters are needed, thanks to plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z) - Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The DJ satisfies (1) and (2), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z)