On multivariate randomized classification trees: $l_0$-based sparsity,
VC~dimension and decomposition methods
- URL: http://arxiv.org/abs/2112.05239v1
- Date: Thu, 9 Dec 2021 22:49:08 GMT
- Title: On multivariate randomized classification trees: $l_0$-based sparsity,
VC~dimension and decomposition methods
- Authors: Edoardo Amaldi, Antonio Consolo, Andrea Manno
- Abstract summary: We investigate the nonlinear continuous optimization formulation proposed in Blanquero et al.
We first consider alternative methods to sparsify such trees based on concave approximations of the $l_0$ norm"
We propose a general decomposition scheme and an efficient version of it. Experiments on larger datasets show that the proposed decomposition method is able to significantly reduce the training times without compromising the accuracy.
- Score: 0.9346127431927981
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision trees are widely-used classification and regression models because
of their interpretability and good accuracy. Classical methods such as CART are
based on greedy approaches but a growing attention has recently been devoted to
optimal decision trees. We investigate the nonlinear continuous optimization
formulation proposed in Blanquero et al. (EJOR, vol. 284, 2020; COR, vol. 132,
2021) for (sparse) optimal randomized classification trees. Sparsity is
important not only for feature selection but also to improve interpretability.
We first consider alternative methods to sparsify such trees based on concave
approximations of the $l_{0}$ ``norm". Promising results are obtained on 24
datasets in comparison with $l_1$ and $l_{\infty}$ regularizations. Then, we
derive bounds on the VC dimension of multivariate randomized classification
trees. Finally, since training is computationally challenging for large
datasets, we propose a general decomposition scheme and an efficient version of
it. Experiments on larger datasets show that the proposed decomposition method
is able to significantly reduce the training times without compromising the
accuracy.
Related papers
- Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
arXiv Detail & Related papers (2024-02-17T09:10:40Z) - An improved column-generation-based matheuristic for learning
classification trees [9.07661731728456]
Decision trees are highly interpretable models for solving classification problems in machine learning (ML)
Standard ML algorithms for training decision trees are fast but generate suboptimal trees in terms of accuracy.
citefirat 2020column proposed a column-generation-based approach for learning decision trees.
arXiv Detail & Related papers (2023-08-22T14:43:36Z) - Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement
Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $tildeO(d sqrtsum_k = 1K sigma_k2 + d)$ regret.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
arXiv Detail & Related papers (2023-02-21T00:17:24Z) - bsnsing: A decision tree induction method based on recursive optimal
boolean rule composition [2.28438857884398]
This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process.
It develops an efficient search solver that is able to solve practical instances faster than commercial solvers.
arXiv Detail & Related papers (2022-05-30T17:13:57Z) - Optimal randomized classification trees [0.0]
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning.
CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold.
This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures.
arXiv Detail & Related papers (2021-10-19T11:41:12Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z) - Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier [68.38233199030908]
Long-tail recognition tackles the natural non-uniformly distributed data in realworld scenarios.
While moderns perform well on populated classes, its performance degrades significantly on tail classes.
Deep-RTC is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
arXiv Detail & Related papers (2020-07-20T05:57:42Z) - Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z) - Sparsity in Optimal Randomized Classification Trees [3.441021278275805]
We propose a continuous optimization approach to build sparse optimal classification trees, based on oblique cuts.
Both types of sparsity, namely local and global, are modeled by means of regularizations with polyhedral norms.
Unlike greedy approaches, our ability to easily trade in some of our classification accuracy for a gain in global sparsity is shown.
arXiv Detail & Related papers (2020-02-21T09:09:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.