Strong Optimal Classification Trees
- URL: http://arxiv.org/abs/2103.15965v3
- Date: Tue, 18 Jul 2023 20:59:55 GMT
- Title: Strong Optimal Classification Trees
- Authors: Sina Aghaei, Andrés Gómez, Phebe Vayanos
- Abstract summary: We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees.
Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees.
We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
- Score: 8.10995244893652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision trees are among the most popular machine learning models and are
used routinely in applications ranging from revenue management and medicine to
bioinformatics. In this paper, we consider the problem of learning optimal
binary classification trees with univariate splits. Literature on the topic has
burgeoned in recent years, motivated both by the empirical suboptimality of
heuristic approaches and the tremendous improvements in mixed-integer
optimization (MIO) technology. Yet, existing MIO-based approaches from the
literature do not leverage the power of MIO to its full extent: they rely on
weak formulations, resulting in slow convergence and large optimality gaps. To
fill this gap in the literature, we propose an intuitive flow-based MIO
formulation for learning optimal binary classification trees. Our formulation
can accommodate side constraints to enable the design of interpretable and fair
decision trees. Moreover, we show that our formulation has a stronger linear
optimization relaxation than existing methods in the case of binary data. We
exploit the decomposable structure of our formulation and max-flow/min-cut
duality to derive a Benders' decomposition method to speed-up computation. We
propose a tailored procedure for solving each decomposed subproblem that
provably generates facets of the feasible set of the MIO as constraints to add
to the main problem. We conduct extensive computational experiments on standard
benchmark datasets on which we show that our proposed approaches are 29 times
faster than state-of-the-art MIO-based techniques and improve out-of-sample
performance by up to 8%.
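To make the flow idea concrete, below is a minimal PuLP sketch of a flow-style MIO for a depth-1 tree on binary features. It is a toy simplification for illustration only (the depth-1 restriction, variable names, and toy data are ours; the paper's formulation handles arbitrary depth, side constraints, and Benders cuts): each datapoint is one unit of flow that reaches the sink only if the chosen split routes it to a leaf predicting its own label.
```python
# pip install pulp
import pulp

X = [[0, 1], [1, 1], [0, 0], [1, 0]]   # binary features
y = [0, 1, 0, 1]                        # class labels
n, p, K = len(X), len(X[0]), 2
LEAVES = ("left", "right")
ARCS = ("s_root", "root_left", "root_right")

m = pulp.LpProblem("flow_tree", pulp.LpMaximize)
b = pulp.LpVariable.dicts("b", range(p), cat="Binary")            # split feature at root
w = pulp.LpVariable.dicts("w", (LEAVES, range(K)), cat="Binary")  # label of each leaf
z = pulp.LpVariable.dicts("z", (range(n), ARCS), lowBound=0, upBound=1)  # per-point arc flows

m += pulp.lpSum(z[i]["s_root"] for i in range(n))   # maximize correctly classified points
m += pulp.lpSum(b[f] for f in range(p)) == 1        # root splits on exactly one feature
for leaf in LEAVES:
    m += pulp.lpSum(w[leaf][k] for k in range(K)) == 1  # each leaf predicts one class
for i in range(n):
    # Flow conservation at the root.
    m += z[i]["s_root"] == z[i]["root_left"] + z[i]["root_right"]
    # Arc capacities: the routing direction must agree with the chosen feature's value...
    m += z[i]["root_left"] <= pulp.lpSum(b[f] for f in range(p) if X[i][f] == 0)
    m += z[i]["root_right"] <= pulp.lpSum(b[f] for f in range(p) if X[i][f] == 1)
    # ...and flow may exit only through a leaf predicting the point's own label.
    m += z[i]["root_left"] <= w["left"][y[i]]
    m += z[i]["root_right"] <= w["right"][y[i]]

m.solve(pulp.PULP_CBC_CMD(msg=False))
print("split feature:", [f for f in range(p) if b[f].value() > 0.5])
print("training accuracy:", pulp.value(m.objective) / n)
```
On this toy data the solver picks feature 0 and labels the leaves so that all four points are classified correctly; the point of the paper's formulation is that this flow structure admits a tight LP relaxation and decomposes per datapoint.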
Related papers
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
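As a rough illustration of the layer-wise training view in the entry above (a numpy toy of ours, not DTR's implementation; the function name and stability trick are assumptions), each tree level is trained as one softmax classification whose target class is the ancestor of the ground-truth item at that level:
```python
import numpy as np

def level_softmax_loss(node_scores, target_node):
    # Cross-entropy over ALL nodes at one tree level; the positive class is
    # the ancestor of the ground-truth item at this level.
    m = node_scores.max()
    log_z = m + np.log(np.exp(node_scores - m).sum())   # stable log-sum-exp
    return log_z - node_scores[target_node]

# Usage: 4 nodes at a level; node 2 is the target ancestor.
print(level_softmax_loss(np.array([0.1, 0.5, 2.0, -1.0]), 2))
```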
- Advancing Model Pruning via Bi-level Optimization [89.88761425199598]
Iterative magnitude pruning (IMP) is the predominant pruning method for successfully finding 'winning tickets'.
One-shot pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.
We show that the pruning problem is a special class of BLO problems with a bi-linear structure, which the proposed bi-level optimization-oriented pruning method (termed BiP) exploits.
arXiv Detail & Related papers (2022-10-08T19:19:29Z)
- Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z)
- Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features [5.663538370244174]
We present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees.
Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
arXiv Detail & Related papers (2022-06-23T17:19:29Z)
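As a rough illustration of the branch-and-bound idea in the Quant-BnB entry above (a toy of ours for depth-1 stumps on a single feature; the real algorithm uses quantile-based bounds and handles depth-2 trees): on an interval of candidate thresholds, points at or below the left endpoint are surely routed left and points above the right endpoint surely right, so the minority counts in those two sure groups give a valid optimistic bound that can prune the whole interval.
```python
import numpy as np

def errors(part):
    # Best-case misclassifications if these points end up sharing one leaf.
    return min(np.sum(part == 0), np.sum(part == 1)) if len(part) else 0

def stump_errors(x, y, t):
    return errors(y[x <= t]) + errors(y[x > t])

def bnb_stump(x, y):
    ts = np.unique(x)                      # candidate thresholds
    best_err, best_t = stump_errors(x, y, ts[-1]), ts[-1]  # trivial incumbent
    stack = [(0, len(ts) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo > hi:
            continue
        # Points <= ts[lo] go left and points > ts[hi] go right for EVERY
        # threshold in ts[lo..hi]; uncertain points are optimistically free.
        if errors(y[x <= ts[lo]]) + errors(y[x > ts[hi]]) >= best_err:
            continue                       # prune the whole interval
        mid = (lo + hi) // 2
        e = stump_errors(x, y, ts[mid])    # probe the interval's midpoint
        if e < best_err:
            best_err, best_t = e, ts[mid]
        stack += [(lo, mid - 1), (mid + 1, hi)]
    return best_t, best_err

x = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 1, 1])
print(bnb_stump(x, y))                     # (0.4, 0): a perfect stump
```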
- Mixed integer linear optimization formulations for learning optimal binary classification trees [0.0]
We propose four mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees.
We conduct experiments on 13 publicly available datasets to show the models' ability to scale.
arXiv Detail & Related papers (2022-06-10T03:10:14Z)
- bsnsing: A decision tree induction method based on recursive optimal boolean rule composition [2.28438857884398]
This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process.
It develops an efficient search solver that is able to solve practical instances faster than commercial solvers.
arXiv Detail & Related papers (2022-05-30T17:13:57Z)
- Robust Optimal Classification Trees Against Adversarial Examples [5.254093731341154]
We propose a collection of methods to train decision trees that are optimally robust against user-specified attack models.
We show that the min-max optimization problem that arises in adversarial learning can be solved using a single minimization formulation.
We also present a method that determines the upper bound on adversarial accuracy for any model using bipartite matching.
arXiv Detail & Related papers (2021-09-08T18:10:49Z)
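A minimal sketch of the matching-based bound from the entry above, under simplifying assumptions of ours (an L_inf attack of radius eps; function and variable names are hypothetical): if two differently-labeled points can be perturbed to the same input, no classifier can get both right under attack, so the disjoint conflicting pairs found by a maximum matching bound adversarial accuracy from above.
```python
import numpy as np
import networkx as nx

def adv_accuracy_upper_bound(X, y, eps):
    n = len(y)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            # Perturbation balls overlap iff the points are within 2*eps,
            # so an adversary can send both points to the same input.
            if y[i] != y[j] and np.max(np.abs(X[i] - X[j])) <= 2 * eps:
                G.add_edge(i, j)
    # Each matched pair forces at least one error for ANY classifier.
    matching = nx.max_weight_matching(G, maxcardinality=True)
    return 1.0 - len(matching) / n

# Usage: two 1-D points of opposite labels within 2*eps can't both be safe.
X = np.array([[0.0], [0.3], [1.0]])
y = np.array([0, 1, 1])
print(adv_accuracy_upper_bound(X, y, eps=0.2))  # 1 - 1/3 = 0.667
```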
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization has arisen as a powerful tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer (stocBiO) that features a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
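A hedged sketch of the decision-aware splitting idea from the Stochastic Optimization Forests entry above (the scoring and names below are our toy, not the paper's approximate criteria): instead of an impurity measure, a candidate split is scored by the downstream cost incurred when each resulting leaf commits to its single best action.
```python
import numpy as np

def leaf_cost(costs):
    # costs: (n_samples, n_actions); the leaf commits to one action for all
    # of its samples and pays the total cost of the best such action.
    return costs.sum(axis=0).min() if len(costs) else 0.0

def split_cost(costs, go_left):
    # Downstream decision cost induced by a candidate split (boolean mask).
    return leaf_cost(costs[go_left]) + leaf_cost(costs[~go_left])

def best_decision_split(X, costs):
    best = (np.inf, None, None)            # (cost, feature, threshold)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # keep both children non-empty
            c = split_cost(costs, X[:, j] <= t)
            if c < best[0]:
                best = (c, j, t)
    return best

# Usage: 1 feature, 2 actions; low x prefers action 0, high x prefers action 1.
X = np.array([[0.1], [0.2], [0.8], [0.9]])
costs = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
print(best_decision_split(X, costs))       # splits at 0.2 with zero cost
```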
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
- Learning Optimal Classification Trees: Strong Max-Flow Formulations [8.10995244893652]
We propose a flow-based MIP formulation for optimal binary classification trees with a stronger linear programming relaxation.
Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders' decomposition method.
arXiv Detail & Related papers (2020-02-21T05:58:17Z)
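To illustrate the per-datapoint max-flow/min-cut subproblem that drives this Benders' decomposition, here is a toy networkx sketch (our construction for a fixed depth-1 tree; in the actual formulation the arc capacities are tied to the master problem's structure variables): a point is correctly classified exactly when one unit of flow reaches the sink, and otherwise the minimum cut identifies the arcs, hence structural decisions, that a Benders cut should involve.
```python
import networkx as nx

def point_subproblem(feature, threshold, leaf_labels, x, y):
    # Build the flow network of ONE datapoint for a fixed depth-1 tree.
    G = nx.DiGraph()
    branch = "left" if x[feature] <= threshold else "right"
    G.add_edge("s", "root", capacity=1)
    G.add_edge("root", branch, capacity=1)       # the split routes the point
    for leaf in ("left", "right"):
        # Only a leaf predicting the point's true label lets flow reach t.
        G.add_edge(leaf, "t", capacity=1 if leaf_labels[leaf] == y else 0)
    value, (source_side, _) = nx.minimum_cut(G, "s", "t")
    return value, source_side

# Usage: split on feature 0 at 0.5; left leaf predicts 1, right predicts 0.
val, cut = point_subproblem(0, 0.5, {"left": 1, "right": 0}, x=[0.2], y=1)
print(val, cut)   # value 1 -> correctly classified; 0 -> the cut yields a Benders cut
```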