Shrub Ensembles for Online Classification
- URL: http://arxiv.org/abs/2112.03723v1
- Date: Tue, 7 Dec 2021 14:22:43 GMT
- Title: Shrub Ensembles for Online Classification
- Authors: Sebastian Buschjäger, Sibylle Hess, Katharina Morik
- Abstract summary: Decision Tree (DT) ensembles provide excellent performance while adapting to changes in the data, but they are not resource efficient.
We propose a novel memory-efficient online classification ensemble called shrub ensembles for resource-constrained systems.
Our algorithm trains small to medium-sized decision trees on small windows and uses stochastic proximal gradient descent to learn the ensemble weights of these 'shrubs'.
- Score: 7.057937612386993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online learning algorithms have become a ubiquitous tool in the machine
learning toolbox and are frequently used in small, resource-constrained
environments. Among the most successful online learning methods are Decision
Tree (DT) ensembles. DT ensembles provide excellent performance while adapting
to changes in the data, but they are not resource efficient. Incremental tree
learners keep adding new nodes to the tree but never remove old ones, increasing
the memory consumption over time. Gradient-based tree learning, on the other
hand, requires the computation of gradients over the entire tree which is
costly for even moderately sized trees. In this paper, we propose a novel
memory-efficient online classification ensemble called shrub ensembles for
resource-constrained systems. Our algorithm trains small to medium-sized
decision trees on small windows and uses stochastic proximal gradient descent
to learn the ensemble weights of these 'shrubs'. We provide a theoretical
analysis of our algorithm and include an extensive discussion on the behavior
of our approach in the online setting. In a series of 2,959 experiments on 12
different datasets, we compare our method against 8 state-of-the-art methods.
Our Shrub Ensembles (SE) retain excellent performance even when only little
memory is available. We show that SE offers a better accuracy-memory trade-off
in 7 of 12 cases, while performing statistically significantly better than
most other methods. Our implementation is available at
https://github.com/sbuschjaeger/se-online .
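The abstract describes the core loop: train small trees on a sliding window and adapt their ensemble weights with stochastic proximal gradient descent, so that unhelpful shrubs are driven to zero weight and removed. Below is a minimal Python sketch of that idea, assuming integer class labels 0..n_classes-1; the class name, the squared-loss stand-in, and the hard cap on the number of shrubs are illustrative assumptions, not the authors' exact procedure (their implementation is in the linked repository).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class ShrubEnsembleSketch:
    """Sliding-window ensemble of small trees ("shrubs") whose weights are
    updated with a proximal gradient step. Illustrative reading of the
    abstract only, not the authors' implementation."""

    def __init__(self, n_classes, window_size=256, max_depth=5,
                 step_size=0.1, l1_reg=1e-3, max_shrubs=32):
        self.n_classes = n_classes
        self.window_size = window_size
        self.max_depth = max_depth
        self.step_size = step_size
        self.l1_reg = l1_reg
        self.max_shrubs = max_shrubs
        self.window_X, self.window_y = [], []
        self.shrubs = []
        self.weights = np.zeros(0)

    def _proba(self, x):
        # Stack per-shrub class distributions for a single example x.
        P = np.zeros((len(self.shrubs), self.n_classes))
        for i, tree in enumerate(self.shrubs):
            p = tree.predict_proba(x.reshape(1, -1))[0]
            P[i, tree.classes_.astype(int)] = p  # map onto global class ids
        return P

    def predict_proba(self, x):
        if not self.shrubs:
            return np.full(self.n_classes, 1.0 / self.n_classes)
        f = self.weights @ self._proba(x)
        s = f.sum()
        return f / s if s > 0 else np.full(self.n_classes, 1.0 / self.n_classes)

    def partial_fit(self, x, y):
        # 1) Proximal gradient step on the weights (squared loss stand-in).
        if self.shrubs:
            P = self._proba(x)                    # (n_shrubs, n_classes)
            residual = self.weights @ P - np.eye(self.n_classes)[y]
            grad = 2.0 * P @ residual             # d/dw ||w^T P - onehot||^2
            w = self.weights - self.step_size * grad
            # Prox of a nonnegative L1 penalty: shrink and clip at zero, so
            # unhelpful shrubs reach weight 0 and are dropped from memory.
            w = np.maximum(w - self.step_size * self.l1_reg, 0.0)
            keep = w > 0
            self.weights = w[keep]
            self.shrubs = [t for t, k in zip(self.shrubs, keep) if k]
        # 2) Grow the window; train a new shrub whenever it fills up.
        self.window_X.append(x)
        self.window_y.append(y)
        if len(self.window_X) >= self.window_size:
            shrub = DecisionTreeClassifier(max_depth=self.max_depth)
            shrub.fit(np.array(self.window_X), np.array(self.window_y))
            self.shrubs.append(shrub)
            self.weights = np.append(self.weights, 1.0)
            self.window_X, self.window_y = [], []
        while len(self.shrubs) > self.max_shrubs:  # hard memory cap
            drop = int(np.argmin(self.weights))
            del self.shrubs[drop]
            self.weights = np.delete(self.weights, drop)
```

Calling partial_fit once per arriving (x, y) pair mimics the online setting; the proximal step is what lets weights reach exactly zero, so stale shrubs can be freed rather than merely down-weighted.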
Related papers
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent [10.27211960475599]
Decision Trees (DTs) are commonly used for many machine learning tasks.
In this paper, we propose a novel approach to learn DTs using gradient descent.
We propose backpropagation with a straight-through operator on a dense DT representation to jointly optimize all tree parameters (a minimal sketch of such a straight-through split appears after this list).
arXiv Detail & Related papers (2023-05-05T13:24:35Z)
- Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching, a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning (a sketch of a buffered pairwise OGD step appears after this list).
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- Dive into Decision Trees and Forests: A Theoretical Demonstration [0.0]
Decision trees use the strategy of "divide-and-conquer" to break a complex problem of the dependency between input features and labels into smaller sub-problems.
Recent advances have greatly improved their performance in computational advertising, recommender systems, information retrieval, etc.
arXiv Detail & Related papers (2021-01-20T16:47:59Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1], [3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Learning Binary Decision Trees by Argmin Differentiation [34.9154848754842]
We learn binary decision trees that partition data for some downstream task.
We do so by relaxing a mixed-integer program for the discrete parameters.
We derive customized algorithms to efficiently compute the forward and backward passes.
arXiv Detail & Related papers (2020-10-09T15:11:28Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- The Tree Ensemble Layer: Differentiability meets Conditional Computation [8.40843862024745]
We introduce a new layer for neural networks composed of an ensemble of differentiable decision trees (a.k.a. soft trees).
Differentiable trees demonstrate promising results in the literature, but are typically slow in training and inference as they do not support conditional computation.
We develop specialized forward and backward propagation algorithms that exploit sparsity.
arXiv Detail & Related papers (2020-02-18T18:05:31Z)
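Two of the entries above name concrete techniques that a few lines of code make tangible. First, the GradTree summary mentions a straight-through operator: hard split decisions in the forward pass, with gradients taken through a soft surrogate in the backward pass. Below is a minimal PyTorch sketch for a single axis-aligned split; the temperature beta, the squared loss, and the toy 1-D setup are illustrative assumptions, not GradTree's full dense-tree formulation.

```python
import torch

def straight_through_split(x, threshold, beta=10.0):
    """Axis-aligned split: hard routing in the forward pass, sigmoid
    surrogate gradient in the backward pass (straight-through operator)."""
    soft = torch.sigmoid(beta * (x - threshold))  # differentiable surrogate
    hard = (x > threshold).float()                # actual hard decision
    return hard + soft - soft.detach()            # value = hard, grad via soft

# Toy usage: recover the threshold 0.5 from labeled 1-D data.
x = torch.rand(512)
y = (x > 0.5).float()
t = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.SGD([t], lr=0.2)
for _ in range(300):
    opt.zero_grad()
    loss = ((straight_through_split(x, t) - y) ** 2).mean()
    loss.backward()
    opt.step()
print(f"learned threshold: {t.item():.2f}")  # should move toward ~0.5
```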
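Second, the pairwise-learning summary describes OGD over losses defined on pairs of instances; a common device in this line of work is to pair each incoming example with one buffered past example and take a single gradient step. The buffer policy, hinge loss, and function name below are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def pairwise_ogd(stream, dim, lr=0.1, buffer_size=100, seed=0):
    """Online gradient descent for pairwise learning: pair each incoming
    example with one buffered past example and take a single hinge-loss
    step that rewards scoring positives above negatives (AUC-style)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    buffer = []  # past (x, y) examples
    for x, y in stream:
        if buffer:
            xb, yb = buffer[rng.integers(len(buffer))]
            if y != yb:  # this pairwise loss needs opposite labels
                sign = 1.0 if y > yb else -1.0
                if sign * (w @ (x - xb)) < 1.0:  # hinge margin violated
                    w += lr * sign * (x - xb)    # OGD step on the pair
        buffer.append((x, y))
        if len(buffer) > buffer_size:
            buffer.pop(rng.integers(len(buffer)))  # random eviction
    return w
```

Here stream is any iterable of (x, y) with x a length-dim array; processing each pair exactly once mirrors the streaming setting the entry refers to.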