Stochastic Tree Ensembles for Estimating Heterogeneous Effects
- URL: http://arxiv.org/abs/2209.06998v1
- Date: Thu, 15 Sep 2022 01:58:03 GMT
- Title: Stochastic Tree Ensembles for Estimating Heterogeneous Effects
- Authors: Nikolay Krantsevich, Jingyu He, P. Richard Hahn
- Abstract summary: This paper develops a novel algorithm for fitting the BCF model, which is more efficient than the previously available Gibbs sampler.
The new algorithm is compared to related approaches via simulation studies as well as an empirical analysis.
- Score: 0.7734726150561088
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Determining subgroups that respond especially well (or poorly) to specific
interventions (medical or policy) requires new supervised learning methods
tailored specifically for causal inference. Bayesian Causal Forest (BCF) is a
recent method that has been documented to perform well on data generating
processes with strong confounding of the sort that is plausible in many
applications. This paper develops a novel algorithm for fitting the BCF model,
which is more efficient than the previously available Gibbs sampler. The new
algorithm can be used to initialize independent chains of the existing Gibbs
sampler leading to better posterior exploration and coverage of the associated
interval estimates in simulation studies. The new algorithm is compared to
related approaches via simulation studies as well as an empirical analysis.
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference ( SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an varepsilon-optimal randomized hypothesis with a sample complexity on the order of (d+k)/varepsilon2.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Optimal Sample Selection Through Uncertainty Estimation and Its
Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely Cascaded Forward (CaFo) algorithm, which does not rely on BP optimization as that in FF.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data [0.0]
This research develops an approximate Gibbs sampler (AGS) to efficiently learn the HBPRMs while maintaining the inference accuracy.
Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS.
arXiv Detail & Related papers (2022-11-28T21:00:55Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - On Uncertainty Estimation by Tree-based Surrogate Models in Sequential
Model-based Optimization [13.52611859628841]
We revisit various ensembles of randomized trees to investigate their behavior in the perspective of prediction uncertainty estimation.
We propose a new way of constructing an ensemble of randomized trees, referred to as BwO forest, where bagging with oversampling is employed to construct bootstrapped samples.
Experimental results demonstrate the validity and good performance of BwO forest over existing tree-based models in various circumstances.
arXiv Detail & Related papers (2022-02-22T04:50:37Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Binary Classification of Gaussian Mixtures: Abundance of Support
Vectors, Benign Overfitting and Regularization [39.35822033674126]
We study binary linear classification under a generative Gaussian mixture model.
We derive novel non-asymptotic bounds on the classification error of the latter.
Our results extend to a noisy model with constant probability noise flips.
arXiv Detail & Related papers (2020-11-18T07:59:55Z) - Minimax Quasi-Bayesian estimation in sparse canonical correlation
analysis via a Rayleigh quotient function [1.0878040851638]
Existing rate-optimal estimators for sparse canonical vectors have high computational cost.
We propose a quasi-Bayesian estimation procedure that achieves the minimax estimation rate.
We use the proposed methodology to maximally correlate clinical variables and proteomic data for better understanding the Covid-19 disease.
arXiv Detail & Related papers (2020-10-16T21:00:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.