A Mixing Time Lower Bound for a Simplified Version of BART
- URL: http://arxiv.org/abs/2210.09352v1
- Date: Mon, 17 Oct 2022 18:45:36 GMT
- Title: A Mixing Time Lower Bound for a Simplified Version of BART
- Authors: Omer Ronen, Theo Saarinen, Yan Shuo Tan, James Duncan and Bin Yu
- Abstract summary: We provide the first lower bound on the mixing time for a simplified version of BART.
Inspired by this new connection between the mixing time and the number of data points, we perform rigorous simulations on BART.
We show qualitatively that BART's mixing time increases with the number of data points.
- Score: 5.149859291357858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian Additive Regression Trees (BART) is a popular Bayesian
non-parametric regression algorithm. The posterior is a distribution over sums
of decision trees, and predictions are made by averaging approximate samples
from the posterior.
The combination of strong predictive performance and the ability to provide
uncertainty measures has led BART to be commonly used in the social sciences,
biostatistics, and causal inference.
BART uses Markov Chain Monte Carlo (MCMC) to obtain approximate posterior
samples over a parameterized space of sums of trees, but it has often been
observed that the chains are slow to mix.
In this paper, we provide the first lower bound on the mixing time for a
simplified version of BART in which we reduce the sum to a single tree and use
a subset of the possible moves for the MCMC proposal distribution. Our lower
bound for the mixing time grows exponentially with the number of data points.
Inspired by this new connection between the mixing time and the number of
data points, we perform rigorous simulations on BART. We show qualitatively
that BART's mixing time increases with the number of data points.
The slow mixing of the simplified BART suggests a large variation between
different runs of the simplified BART algorithm, and a similarly large
variation is known for BART in the literature. This large variation could
result in a lack of stability in the models, predictions, and posterior
intervals obtained from the BART MCMC samples.
Our lower bound and simulations suggest increasing the number of chains with
the number of data points.
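As a minimal sketch of what that recommendation could look like in practice, the snippet below averages predictions pooled from several independent chains; `sample_tree_sum_posterior` is a hypothetical stand-in for one BART MCMC chain (not the paper's implementation), and the rule for choosing the number of chains is an illustrative assumption, not a prescription from the paper.

```python
import numpy as np

def sample_tree_sum_posterior(X_train, y_train, X_test, n_samples, seed):
    """Placeholder for a single BART MCMC chain.

    Returns an array of shape (n_samples, len(X_test)) of posterior-draw
    predictions. A real sampler would draw sums of decision trees; here we
    use noisy constants around the training mean purely for illustration.
    """
    rng = np.random.default_rng(seed)
    return y_train.mean() + 0.1 * rng.standard_normal((n_samples, len(X_test)))

def bart_predict_multichain(X_train, y_train, X_test, n_chains, n_samples=1000):
    """Pool draws from several independent chains and average them.

    Running more chains is one way to reduce the run-to-run variation that
    slow mixing causes in predictions and posterior intervals.
    """
    draws = [
        sample_tree_sum_posterior(X_train, y_train, X_test, n_samples, seed=c)
        for c in range(n_chains)
    ]
    all_draws = np.concatenate(draws, axis=0)                 # pool posterior draws
    point = all_draws.mean(axis=0)                            # posterior mean prediction
    lo, hi = np.percentile(all_draws, [2.5, 97.5], axis=0)    # 95% posterior interval
    return point, (lo, hi)

# Hypothetical rule of thumb (an assumption, not the paper's result):
# grow the number of chains with the number of data points n,
# e.g. n_chains = max(4, int(np.log(n))).
```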
Related papers
- The Computational Curse of Big Data for Bayesian Additive Regression Trees: A Hitting Time Analysis [8.36826153664925]
We show that the BART sampler often converges slowly, confirming empirical observations by other researchers.
As $n$ increases, the approximate BART posterior becomes increasingly different from the exact posterior.
arXiv Detail & Related papers (2024-06-28T14:45:29Z) - ASBART:Accelerated Soft Bayes Additive Regression Trees [8.476756500467689]
Soft BART improves both practically and theoretically on existing Bayesian sum-of-trees models.
Compared to BART, it takes about 20 times as long to complete the calculation with the default settings.
We propose a variant of BART named Accelerated Soft BART (ASBART).
arXiv Detail & Related papers (2023-10-21T11:27:42Z) - On Mixing Rates for Bayesian CART [0.0]
This work focuses on the Bayesian CART algorithm which forms a building block of Bayesian Additive Regression Trees (BART)
We derive upper bounds on mixing times for typical posteriors under various proposal distributions.
A thorough simulation study highlights discrepancies between spike-and-slab priors and Bayesian CART under a variety of proposals.
arXiv Detail & Related papers (2023-05-31T19:04:28Z) - Bayesian Decision Trees Inspired from Evolutionary Algorithms [64.80360020499555]
We propose replacing Markov Chain Monte Carlo (MCMC) with an inherently parallel algorithm, Sequential Monte Carlo (SMC).
Experiments show that SMC combined with Evolutionary Algorithms (EA) can produce more accurate results than MCMC in 100 times fewer iterations.
arXiv Detail & Related papers (2023-05-30T06:17:35Z) - An analysis of over-sampling labeled data in semi-supervised learning with FixMatch [66.34968300128631]
Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches.
This paper studies whether this common practice improves learning and how.
We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not.
arXiv Detail & Related papers (2022-01-03T12:22:26Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Accounting for shared covariates in semi-parametric Bayesian additive regression trees [0.0]
We propose some extensions to semi-parametric models based on Bayesian additive regression trees (BART)
The main novelty in our approach lies in the way we change the tree-generation moves in BART to deal with this bias.
We show competitive performance when compared to regression models, alternative formulations of semi-parametric BART, and other tree-based methods.
arXiv Detail & Related papers (2021-08-17T13:58:44Z) - Covariance-Free Sparse Bayesian Learning [62.24008859844098]
We introduce a new SBL inference algorithm that avoids explicit inversions of the covariance matrix.
Our method can be up to thousands of times faster than existing baselines.
We showcase how our new algorithm enables SBL to tractably tackle high-dimensional signal recovery problems.
arXiv Detail & Related papers (2021-05-21T16:20:07Z) - Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs)
We propose a memorized batch normalization (MBN) which considers multiple recent batches to obtain more accurate and robust statistics.
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference.
arXiv Detail & Related papers (2020-10-10T08:48:41Z) - Bayesian Additive Regression Trees with Model Trees [0.0]
We introduce an extension of BART, called Model Trees BART (MOTR-BART)
MOTR-BART considers piecewise linear functions at node levels instead of piecewise constants.
In our approach, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART.
arXiv Detail & Related papers (2020-06-12T22:19:58Z) - Cross-Iteration Batch Normalization [67.83430009388678]
We present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality.
CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique.
arXiv Detail & Related papers (2020-02-13T18:52:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.