agtboost: Adaptive and Automatic Gradient Tree Boosting Computations
- URL: http://arxiv.org/abs/2008.12625v1
- Date: Fri, 28 Aug 2020 12:42:19 GMT
- Title: agtboost: Adaptive and Automatic Gradient Tree Boosting Computations
- Authors: Berent Ånund Strømnes Lunde, Tore Selland Kleppe
- Abstract summary: agtboost implements fast gradient tree boosting computations.
A useful model validation function performs the Kolmogorov-Smirnov test on the learned distribution.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: agtboost is an R package implementing fast gradient tree boosting
computations in a manner similar to other established frameworks such as
xgboost and LightGBM, but with significant decreases in computation time and
required mathematical and technical knowledge. The package automatically takes
care of split/no-split decisions and selects the number of trees in the
gradient tree boosting ensemble, i.e., agtboost adapts the complexity of the
ensemble automatically to the information in the data. All of this is done
during a single training run, which is made possible by utilizing developments
in information theory for tree algorithms (arXiv:2008.05926v1 [stat.ME]).
agtboost also comes with a feature importance function that eliminates the
common practice of inserting noise features. Further, a useful model validation
function performs the Kolmogorov-Smirnov test on the learned distribution.
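A minimal usage sketch of the workflow described in the abstract, written in R since agtboost is an R package. The function names gbt.train, gbt.ksval and gbt.importance are those reported for the package, but the exact argument names, defaults, and the simulated data below are illustrative assumptions rather than a verbatim reproduction of the package documentation.
```r
# Illustrative sketch only: argument names and defaults may differ from the
# installed CRAN version of agtboost; check ?gbt.train before relying on them.
library(agtboost)

# Simulated regression data: x must be a numeric matrix, y a numeric vector.
set.seed(1)
n <- 1000
x <- matrix(rnorm(2 * n), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
y <- 2 * x[, 1] + sin(3 * x[, 2]) + rnorm(n, sd = 0.3)

# Single training run: split/no-split decisions and the number of trees are
# selected automatically, so no cross-validation over the ensemble size is needed.
mod <- gbt.train(y, x, learning_rate = 0.01, loss_function = "mse")

# Predictions from the adaptively sized ensemble.
pred <- predict(mod, x)

# Model validation: Kolmogorov-Smirnov test on the learned distribution.
gbt.ksval(object = mod, y = y, x = x)

# Feature importance, without inserting artificial noise features.
gbt.importance(feature_names = colnames(x), object = mod)
```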
Related papers
- How to Boost Any Loss Function [63.573324901948716]
We show that loss functions can be efficiently optimized with boosting.
We show that boosting can achieve a feat not yet known to be possible in the classical zeroth-order setting, i.e., using only loss evaluations rather than gradients.
arXiv Detail & Related papers (2024-07-02T14:08:23Z) - A generalized decision tree ensemble based on the Neural Networks architecture: Distributed Gradient Boosting Forest (DGBF) [0.0]
We present a graph-structured tree-ensemble algorithm in which a distributed representation-learning process arises naturally between trees.
We call this novel approach Distributed Gradient Boosting Forest (DGBF) and we demonstrate that both RandomForest and GradientBoosting can be expressed as particular graph architectures of DGBF.
Finally, we see that the distributed learning outperforms both RandomForest and GradientBoosting in 7 out of 9 datasets.
arXiv Detail & Related papers (2024-02-04T09:22:52Z) - Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z) - Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ $L^2$ convergence in the large nonparametric space of càdlàg functions of bounded sectional variation.
Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
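For context, early stopping for gradient boosted trees is typically implemented by monitoring loss on a held-out validation set; the sketch below uses the xgboost R interface purely as a generic illustration and is not the specific estimator analysed in the paper above. The matrices x_train/x_valid and response vectors y_train/y_valid are assumed to exist.
```r
# Generic validation-based early stopping for boosted trees (illustration only;
# not the Lassoed Tree Boosting estimator).
library(xgboost)

dtrain <- xgb.DMatrix(data = x_train, label = y_train)
dvalid <- xgb.DMatrix(data = x_valid, label = y_valid)

fit <- xgb.train(
  params = list(objective = "reg:squarederror", eta = 0.05, max_depth = 3),
  data = dtrain,
  nrounds = 5000,                  # generous upper bound on the number of trees
  watchlist = list(valid = dvalid),
  early_stopping_rounds = 50,      # stop once validation loss stops improving
  verbose = 0
)

fit$best_iteration  # number of boosting rounds actually retained
```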
arXiv Detail & Related papers (2022-05-22T00:34:41Z) - Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z) - Structural Optimization Makes Graph Classification Simpler and Better [5.770986723520119]
We investigate the feasibility of improving graph classification performance while simplifying the model learning process.
Inspired by progress in structural information assessment, we optimize the given data sample from graphs to encoding trees.
We present an implementation of the scheme in a tree kernel and a convolutional network to perform graph classification.
arXiv Detail & Related papers (2021-09-05T08:54:38Z) - To Boost or not to Boost: On the Limits of Boosted Neural Networks [67.67776094785363]
Boosting is a method for learning an ensemble of classifiers.
While boosting has been shown to be very effective for decision trees, its impact on neural networks has not been extensively studied.
We find that a single neural network usually generalizes better than a boosted ensemble of smaller neural networks with the same total number of parameters.
arXiv Detail & Related papers (2021-07-28T19:10:03Z) - Relational Boosted Regression Trees [1.14179290793997]
Many tasks use data housed in databases to train boosted regression tree models.
We give an adaptation of the greedy algorithm for training boosted regression trees.
arXiv Detail & Related papers (2021-07-25T20:29:28Z) - Boost-R: Gradient Boosted Trees for Recurrence Data [13.40931458200203]
This paper investigates an additive-tree-based approach, known as Boost-R (Boosting for Recurrence Data), for recurrent event data with both static and dynamic features.
Boost-R constructs an ensemble of gradient boosted additive trees to estimate the cumulative intensity function of the recurrent event process.
arXiv Detail & Related papers (2021-07-03T02:44:09Z) - Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called Gradient Boosted Binary Histogram Ensemble (GBBHE), based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
arXiv Detail & Related papers (2021-06-03T17:05:40Z) - BoostTree and BoostForest for Ensemble Learning [27.911350375268576]
BoostForest is an ensemble learning approach using BoostTree as base learners and can be used for both classification and regression.
It generally outperformed four classical ensemble learning approaches (Random Forest, Extra-Trees, XGBoost and LightGBM) on 35 classification and regression datasets.
arXiv Detail & Related papers (2020-03-21T19:52:13Z)