Very fast Bayesian Additive Regression Trees on GPU
- URL: http://arxiv.org/abs/2410.23244v1
- Date: Wed, 30 Oct 2024 17:29:03 GMT
- Title: Very fast Bayesian Additive Regression Trees on GPU
- Authors: Giacomo Petrillo
- Abstract summary: I present a GPU-enabled implementation of BART, faster by up to 200x relative to a single CPU core, making BART competitive in running time with XGBoost.
This implementation is available in the Python package bartz.
- Abstract: Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression technique based on an ensemble of decision trees. It is part of the toolbox of many statisticians. The overall statistical quality of the regression is typically higher than that of other generic alternatives, and it requires less manual tuning, making it a good default choice. However, it remains a niche method compared to its natural competitor XGBoost because of its longer running time, which makes sample sizes above 10,000-100,000 a nuisance. I present a GPU-enabled implementation of BART, faster by up to 200x relative to a single CPU core, making BART competitive in running time with XGBoost. This implementation is available in the Python package bartz.
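The speedup hinges on recasting BART's per-tree recursion as fixed-shape array operations that a GPU can batch across trees and observations. As a rough illustration of that idea (not bartz's actual code, which is written in JAX; the heap layout and all names below are illustrative assumptions), here is a minimal numpy sketch of vectorized evaluation of a fixed-depth tree ensemble:

```python
import numpy as np

def eval_trees(var, split, leaf, X, depth):
    """Evaluate an ensemble of fixed-depth binary trees stored as heaps.

    var[t, i], split[t, i]: feature index and threshold at internal heap
    node i of tree t (root = 0, children of i are 2i+1, 2i+2);
    leaf[t, j]: value of the j-th leaf. Every tree has the same shape,
    so descent is flat array arithmetic with no per-tree recursion --
    the property that lets a GPU batch over trees and data points.
    """
    T, n = var.shape[0], X.shape[0]
    t_idx = np.arange(T)[:, None]
    node = np.zeros((T, n), dtype=np.int64)            # current heap index
    for _ in range(depth):
        f = var[t_idx, node]                           # feature per (tree, point)
        go_right = X[np.arange(n)[None, :], f] > split[t_idx, node]
        node = 2 * node + 1 + go_right                 # step to a child
    return leaf[t_idx, node - (2**depth - 1)].sum(axis=0)  # sum over trees

# Smoke test: 3 random depth-2 trees, 5 points, 4 features.
rng = np.random.default_rng(0)
depth, T, n, p = 2, 3, 5, 4
var = rng.integers(p, size=(T, 2**depth - 1))
split = rng.random((T, 2**depth - 1))
leaf = rng.normal(size=(T, 2**depth))
print(eval_trees(var, split, leaf, rng.random((n, p)), depth))
```

bartz applies this kind of batching throughout the MCMC sampler; see the package itself for the real implementation.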
Related papers
- On the Gaussian process limit of Bayesian Additive Regression Trees [0.0]
Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression technique of rising fame.
In the limit of infinite trees, it becomes equivalent to Gaussian process (GP) regression.
This study opens new ways to understand and develop BART and GP regression.
arXiv Detail & Related papers (2024-10-26T23:18:33Z) - SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search [68.66904039405871]
We introduce SoftTreeMax, a generalization of softmax that takes planning into account.
We show for the first time the role of a tree expansion policy in mitigating this variance.
Our differentiable tree-based policy leverages all gradients at the tree leaves in each environment step instead of the traditional single-sample-based gradient.
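A schematic of the idea as described in the abstract (the paper's exact weighting and normalization may differ; `step`, `logit`, and the per-depth discounting here are illustrative assumptions):

```python
import numpy as np

def softtreemax_policy(root, actions, step, logit, depth, beta=1.0, gamma=0.99):
    """Hypothetical sketch: expand all depth-d action sequences, score each
    path by its discounted cumulative reward plus a leaf logit, softmax over
    paths, and aggregate the mass by first action. step(s, a) -> (s', r) and
    logit(s) -> float are assumed model components."""
    paths = [(root, 0.0, None)]            # (state, cum. reward, first action)
    for d in range(depth):
        paths = [(s2, g + gamma**d * r, a if a0 is None else a0)
                 for s, g, a0 in paths
                 for a in actions
                 for s2, r in [step(s, a)]]
    scores = np.array([beta * (g + gamma**depth * logit(s)) for s, g, _ in paths])
    w = np.exp(scores - scores.max())      # stable softmax over all leaves
    probs = np.zeros(len(actions))
    for (_, _, a0), wi in zip(paths, w):
        probs[actions.index(a0)] += wi
    return probs / probs.sum()

# Toy usage: deterministic chain MDP with 2 actions.
step = lambda s, a: (s + a, float(a))
logit = lambda s: 0.1 * s
print(softtreemax_policy(0, [0, 1], step, logit, depth=3))
```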
arXiv Detail & Related papers (2023-01-30T19:03:14Z) - pGMM Kernel Regression and Comparisons with Boosted Trees [21.607059258448594]
In this work, we demonstrate the advantage of the pGMM kernel in the context of (ridge) regression.
Perhaps surprisingly, even without a tuning parameter (i.e., $p=1$ for the power parameter of the pGMM kernel), the pGMM kernel already performs well.
Perhaps also surprisingly, the best performance (in terms of the $L_2$ regression loss) is often attained at $p > 2$, in some cases at $p \gg 2$.
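For context, a sketch of pGMM kernel ridge regression. The kernel form below (nonnegative transform, then a ratio of powered min and max sums) is one common statement of the generalized min-max kernel; treat the exact placement of the power $p$ as an assumption and check the paper for the definitive formula:

```python
import numpy as np

def pgmm_kernel(X, Y, p=1.0):
    """Powered generalized min-max kernel (one common form; the exact
    placement of the power p is an assumption here)."""
    def nn(Z):  # nonnegative transform: concatenate positive and negative parts
        return np.concatenate([np.maximum(Z, 0), np.maximum(-Z, 0)], axis=1)
    U, V = nn(X), nn(Y)
    mins = (np.minimum(U[:, None, :], V[None, :, :]) ** p).sum(-1)
    maxs = (np.maximum(U[:, None, :], V[None, :, :]) ** p).sum(-1)
    return mins / (maxs + 1e-12)

def pgmm_ridge(X_tr, y_tr, X_te, p=1.0, lam=1.0):
    """Standard kernel ridge regression on top of the pGMM kernel."""
    K = pgmm_kernel(X_tr, X_tr, p)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_tr)), y_tr)
    return pgmm_kernel(X_te, X_tr, p) @ alpha

rng = np.random.default_rng(0)
X_tr, X_te = rng.normal(size=(50, 3)), rng.normal(size=(10, 3))
print(pgmm_ridge(X_tr, np.sin(X_tr).sum(1), X_te, p=1.0))
```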
arXiv Detail & Related papers (2022-07-18T15:06:30Z) - Data-Efficient Instance Segmentation with a Single GPU [88.31338435907304]
We introduce a data-efficient segmentation method we used in the 2021 VIPriors Instance Challenge.
Our solution is a modified version of Swin Transformer built on the mmdetection toolbox.
Our method achieved an AP@0.50:0.95 (medium) of 0.592, ranking second among all contestants.
arXiv Detail & Related papers (2021-10-01T07:36:20Z) - Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called Gradient Boosted Binary Histogram Ensemble (GBBHE), based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
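A deliberately simplified caricature of the training loop (random single-feature histograms standing in for the paper's binary histogram partition; illustrative only):

```python
import numpy as np

def gbbhe_fit(X, y, rounds=200, n_bins=8, ensemble=5, lr=0.1, seed=0):
    """Caricature of the GBBHE loop: each boosting round averages several
    histogram learners (random feature, equal-width bins, mean residual
    per bin). The paper's binary histogram partition is more structured;
    this only conveys the histogram-ensemble-as-weak-learner shape."""
    rng = np.random.default_rng(seed)
    pred = np.zeros(len(y))
    for _ in range(rounds):
        resid, upd = y - pred, np.zeros(len(y))
        for _ in range(ensemble):
            j = rng.integers(X.shape[1])
            edges = np.linspace(X[:, j].min(), X[:, j].max(), n_bins + 1)[1:-1]
            idx = np.digitize(X[:, j], edges)          # bin index in 0..n_bins-1
            vals = np.array([resid[idx == b].mean() if (idx == b).any() else 0.0
                             for b in range(n_bins)])
            upd += vals[idx] / ensemble                # average the learners
        pred += lr * upd
    return pred

rng = np.random.default_rng(1)
X = rng.random((500, 4)); y = np.sin(3 * X[:, 0]) + X[:, 1]
fit = gbbhe_fit(X, y)
print("train RMSE:", np.sqrt(np.mean((y - fit) ** 2)))
```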
arXiv Detail & Related papers (2021-06-03T17:05:40Z) - IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Existing methods must trade off high accuracy against load balance and scalability in distributed settings.
We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data.
We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
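Schematically, the iterative re-partitioning alternates between learning a query-to-bucket map and reassigning items, as in this toy sketch (a nearest-centroid rule stands in for the paper's learned classifier, and the load-balancing constraint is omitted):

```python
import numpy as np

def irli_sketch(Q, rel, n_items, K=16, iters=5, seed=0):
    """Toy IRLI-style loop: alternate (1) learn a query -> bucket map from
    the current item assignment and (2) reassign each item to the bucket
    its relevant queries are routed to. rel[q] is the list of items
    relevant to query q (assumed non-empty). Purely schematic."""
    rng = np.random.default_rng(seed)
    item_bucket = rng.integers(K, size=n_items)
    for _ in range(iters):
        # (1) a query's target bucket = bucket of its first relevant item
        q_bucket = np.array([item_bucket[items[0]] for items in rel])
        centroids = np.stack([Q[q_bucket == k].mean(0) if (q_bucket == k).any()
                              else rng.normal(size=Q.shape[1]) for k in range(K)])
        routed = ((Q[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(1)
        # (2) items follow the majority vote of their queries' routed buckets
        votes = np.zeros((n_items, K))
        for q, items in enumerate(rel):
            for i in items:
                votes[i, routed[q]] += 1
        item_bucket = votes.argmax(1)
    return item_bucket

rng = np.random.default_rng(0)
Q = rng.normal(size=(100, 8))
rel = [[int(i)] for i in rng.integers(50, size=100)]
print(np.bincount(irli_sketch(Q, rel, n_items=50), minlength=16))
```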
arXiv Detail & Related papers (2021-03-17T23:13:25Z) - An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
We study adversarial attacks on tree-based ensembles such as gradient boosted decision trees (GBDTs) and random forests (RFs).
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://github.com/chong-z/tree-ensemble-attack.
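The non-MILP speed comes from treating each leaf as an axis-aligned box and searching over leaf combinations instead of solving an exact program. A single-tree sketch of the leaf-box view (the paper's method extends this with a greedy search over the ensemble's leaf tuples):

```python
import numpy as np

def linf_attack_single_tree(leaves, x, y):
    """Leaf-box view behind fast tree attacks: each leaf of a decision tree
    is an axis-aligned box (lo, hi, label), so the cheapest adversarial
    example is the l_inf-nearest box with a different label. Single-tree
    sketch only; the paper searches the ensemble's leaf tuples greedily."""
    best = (np.inf, None)
    for lo, hi, label in leaves:
        if label == y:
            continue
        d = np.maximum(np.maximum(lo - x, x - hi), 0).max()  # l_inf box distance
        if d < best[0]:
            best = (d, np.clip(x, lo, hi))  # boundary point; real attacks nudge inside
    return best

# Depth-1 stump on feature 0: x0 <= 0.5 -> label 0, else label 1.
leaves = [(np.array([-np.inf, -np.inf]), np.array([0.5, np.inf]), 0),
          (np.array([0.5, -np.inf]), np.array([np.inf, np.inf]), 1)]
print(linf_attack_single_tree(leaves, np.array([0.3, 0.0]), y=0))
```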
arXiv Detail & Related papers (2020-10-22T10:59:49Z) - Bayesian Additive Regression Trees with Model Trees [0.0]
We introduce an extension of BART, called Model Trees BART (MOTR-BART).
MOTR-BART considers piecewise linear functions at node levels instead of piecewise constants.
In our approach, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART.
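The leaf-model change in miniature (a plain ridge fit standing in for MOTR-BART's Bayesian prior on the leaf coefficients):

```python
import numpy as np

def leaf_fit_constant(Xl, yl):
    """BART-style leaf: a single constant."""
    return lambda X: np.full(len(X), yl.mean())

def leaf_fit_linear(Xl, yl, ridge=1e-3):
    """MOTR-BART-style leaf: a linear model on the leaf's covariates
    (here a ridge fit; MOTR-BART instead places a Bayesian prior on the
    coefficients inside the MCMC)."""
    A = np.hstack([np.ones((len(Xl), 1)), Xl])
    beta = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ yl)
    return lambda X: np.hstack([np.ones((len(X), 1)), X]) @ beta

rng = np.random.default_rng(0)
Xl = rng.random((30, 2)); yl = 2 * Xl[:, 0] + rng.normal(0, 0.01, 30)
print(leaf_fit_constant(Xl, yl)(Xl[:3]), leaf_fit_linear(Xl, yl)(Xl[:3]))
```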
arXiv Detail & Related papers (2020-06-12T22:19:58Z) - Survival regression with accelerated failure time model in XGBoost [1.5469452301122177]
Survival regression is used to estimate the relation between time-to-event and feature variables.
XGBoost implements loss functions for learning accelerated failure time models.
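Minimal usage, with parameter names per the XGBoost survival-AFT tutorial (synthetic data; right-censoring is encoded as an infinite upper label bound):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((200, 5))
t = rng.exponential(2.0, 200)         # latent event times
cens = rng.random(200) < 0.3          # ~30% right-censored
lower = t                             # observed lower bound on the event time
upper = np.where(cens, np.inf, t)     # inf upper bound = right-censored

dtrain = xgb.DMatrix(X)
dtrain.set_float_info("label_lower_bound", lower)
dtrain.set_float_info("label_upper_bound", upper)

params = {
    "objective": "survival:aft",
    "eval_metric": "aft-nloglik",
    "aft_loss_distribution": "normal",      # also "logistic" or "extreme"
    "aft_loss_distribution_scale": 1.0,
}
bst = xgb.train(params, dtrain, num_boost_round=100)
print(bst.predict(dtrain)[:5])              # predicted event times
```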
arXiv Detail & Related papers (2020-06-08T20:34:20Z) - Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification [119.41129787351092]
We introduce BBKB, the first no-regret GP optimization algorithm that provably runs in near-linear time and selects candidates in batches.
We show that the same bound can be used to adaptively delay costly updates to the sparse GP approximation, achieving a near-constant per-step amortized cost.
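A toy loop that mirrors only the update-delaying control flow (an exact GP in place of BBKB's sparse approximation; the drift proxy and batch cap are illustrative, and none of the paper's guarantees apply to this sketch):

```python
import numpy as np

def rbf(A, B):
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2)

def gp_posterior(Xobs, yobs, Xcand, noise=1e-2):
    K = rbf(Xobs, Xobs) + noise * np.eye(len(Xobs))
    Ks = rbf(Xcand, Xobs)
    sol = np.linalg.solve(K, Ks.T)
    return sol.T @ yobs, np.maximum(1.0 - np.einsum("ij,ji->i", Ks, sol), 1e-12)

def bbkb_style(f, Xcand, rounds=5, beta=2.0, c=2.0, cap=5, seed=0):
    """Toy batched GP-UCB mirroring BBKB's update-delaying idea: within a
    batch the posterior stays frozen; points are added greedily by UCB
    until a crude drift proxy says the frozen posterior is too stale, and
    only then is the posterior recomputed."""
    rng = np.random.default_rng(seed)
    Xobs = list(rng.choice(Xcand, 2)); yobs = [f(x) for x in Xobs]
    for _ in range(rounds):
        mu, var = gp_posterior(np.array(Xobs), np.array(yobs), Xcand)
        drift, picks = 1.0, 0
        while drift < c and picks < cap:        # adaptive batch size
            i = int(np.argmax(mu + beta * np.sqrt(var)))
            Xobs.append(Xcand[i]); yobs.append(f(Xcand[i]))
            drift *= 1.0 + var[i]; var[i] = 0.0; picks += 1
    return np.array(Xobs), np.array(yobs)

Xcand = np.linspace(0.0, 10.0, 200)
f = lambda x: -np.sin(x) - 0.1 * (x - 5.0) ** 2
Xs, ys = bbkb_style(f, Xcand)
print("best x found:", Xs[int(np.argmax(ys))])
```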
arXiv Detail & Related papers (2020-02-23T17:43:29Z) - Stochastic tree ensembles for regularized nonlinear regression [0.913755431537592]
This paper develops a novel tree ensemble method for nonlinear regression, which we refer to as XBART.
By combining regularization and search strategies from Bayesian modeling with computationally efficient techniques, the new method attains state-of-the-art performance.
arXiv Detail & Related papers (2020-02-09T14:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.