agtboost: Adaptive and Automatic Gradient Tree Boosting Computations
- URL: http://arxiv.org/abs/2008.12625v1
- Date: Fri, 28 Aug 2020 12:42:19 GMT
- Title: agtboost: Adaptive and Automatic Gradient Tree Boosting Computations
- Authors: Berent Ånund Strømnes Lunde, Tore Selland Kleppe
- Abstract summary: agtboost implements fast gradient tree boosting computations.
A useful model validation function performs the Kolmogorov-Smirnov test on the learned distribution.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: agtboost is an R package implementing fast gradient tree boosting
computations in a manner similar to other established frameworks such as
xgboost and LightGBM, but with significant decreases in computation time and
required mathematical and technical knowledge. The package automatically takes
care of split/no-split decisions and selects the number of trees in the
gradient tree boosting ensemble, i.e., agtboost adapts the complexity of the
ensemble automatically to the information in the data. All of this is done
during a single training run, which is made possible by utilizing developments
in information theory for tree algorithms (arXiv:2008.05926v1 [stat.ME]).
agtboost also comes with a feature importance function that eliminates the
common practice of inserting noise features. Further, a useful model validation
function performs the Kolmogorov-Smirnov test on the learned distribution.
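The idea behind Kolmogorov-Smirnov validation of a learned distribution can be sketched as follows. This is a minimal illustration in Python, not agtboost's implementation or API: it assumes a Gaussian predictive distribution with known standard deviation, and the helper names (`normal_cdf`, `ks_statistic_uniform`, `ks_pvalue`, `validate`) are hypothetical. Each observation is passed through its predicted CDF; if the learned distribution is correct, the transformed values are approximately Uniform(0, 1), which the KS statistic checks.

```python
import math
import random


def normal_cdf(y, mean, sd):
    """CDF of a Normal(mean, sd) distribution evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (sd * math.sqrt(2.0))))


def ks_statistic_uniform(u):
    """Largest gap between the empirical CDF of u and the Uniform(0, 1) CDF."""
    u = sorted(u)
    n = len(u)
    d = 0.0
    for i, v in enumerate(u):
        # Check the gap just after and just before each jump of the empirical CDF.
        d = max(d, (i + 1) / n - v, v - i / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic (large-n) p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The series converges slowly here, and the p-value is numerically 1.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def validate(y, means, sd):
    """Transform observations by the assumed predictive CDF, then KS-test them."""
    u = [normal_cdf(yi, mi, sd) for yi, mi in zip(y, means)]
    d = ks_statistic_uniform(u)
    return d, ks_pvalue(d, len(u))


if __name__ == "__main__":
    rng = random.Random(1)
    # Stand-in for model predictions: the true conditional means.
    means = [rng.uniform(-2.0, 2.0) for _ in range(2000)]
    y = [rng.gauss(m, 1.0) for m in means]
    print("correct sd (statistic, p-value):", validate(y, means, 1.0))
    print("wrong sd   (statistic, p-value):", validate(y, means, 3.0))
```

A small statistic with a large p-value is consistent with a well-calibrated distribution, while a misspecified spread (here a standard deviation of 3 instead of 1) should produce a large statistic and a p-value near zero.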
Related papers
- How to Boost Any Loss Function [63.573324901948716]
We show that any loss function can be optimized with boosting.
We also show that boosting can achieve a feat not yet known to be possible in the classical $0^{th}$ order setting.
arXiv Detail & Related papers (2024-07-02T14:08:23Z) - A generalized decision tree ensemble based on the NeuralNetworks
architecture: Distributed Gradient Boosting Forest (DGBF) [0.0]
We present a graph-structured-tree-ensemble algorithm with a distributed representation learning process between trees naturally.
We call this novel approach Distributed Gradient Boosting Forest (DGBF) and we demonstrate that both RandomForest and GradientBoosting can be expressed as particular graph architectures of DGBF.
Finally, we see that the distributed learning outperforms both RandomForest and GradientBoosting in 7 out of 9 datasets.
arXiv Detail & Related papers (2024-02-04T09:22:52Z) - Performance Embeddings: A Similarity-based Approach to Automatic
Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z) - Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ $L_2$ convergence in the large nonparametric space of cadlag functions of bounded sectional variation.
Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
arXiv Detail & Related papers (2022-05-22T00:34:41Z) - Active-LATHE: An Active Learning Algorithm for Boosting the Error
Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z) - Structural Optimization Makes Graph Classification Simpler and Better [5.770986723520119]
We investigate the feasibility of improving graph classification performance while simplifying the model learning process.
Inspired by progress in structural information assessment, we optimize the given data sample from graphs to encoding trees.
We present an implementation of the scheme in a tree kernel and a convolutional network to perform graph classification.
arXiv Detail & Related papers (2021-09-05T08:54:38Z) - To Boost or not to Boost: On the Limits of Boosted Neural Networks [67.67776094785363]
Boosting is a method for learning an ensemble of classifiers.
While boosting has been shown to be very effective for decision trees, its impact on neural networks has not been extensively studied.
We find that a single neural network usually generalizes better than a boosted ensemble of smaller neural networks with the same total number of parameters.
arXiv Detail & Related papers (2021-07-28T19:10:03Z) - Relational Boosted Regression Trees [1.14179290793997]
Many tasks use data housed in databases to train boosted regression tree models.
We give an adaptation of the greedy algorithm for training boosted regression trees.
arXiv Detail & Related papers (2021-07-25T20:29:28Z) - Boost-R: Gradient Boosted Trees for Recurrence Data [13.40931458200203]
This paper investigates an additive-tree-based approach, known as Boost-R (Boosting for Recurrence Data), for recurrent event data with both static and dynamic features.
Boost-R constructs an ensemble of gradient boosted additive trees to estimate the cumulative intensity function of the recurrent event process.
arXiv Detail & Related papers (2021-07-03T02:44:09Z) - Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called Gradient Boosted Binary Histogram Ensemble (GBBHE) based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
arXiv Detail & Related papers (2021-06-03T17:05:40Z) - BoostTree and BoostForest for Ensemble Learning [27.911350375268576]
BoostForest is an ensemble learning approach using BoostTree as base learners and can be used for both classification and regression.
It generally outperformed four classical ensemble learning approaches (Random Forest, Extra-Trees, XGBoost and LightGBM) on 35 classification and regression datasets.
arXiv Detail & Related papers (2020-03-21T19:52:13Z)