Learning Optimal Classification Trees: Strong Max-Flow Formulations
- URL: http://arxiv.org/abs/2002.09142v2
- Date: Wed, 13 May 2020 02:24:34 GMT
- Title: Learning Optimal Classification Trees: Strong Max-Flow Formulations
- Authors: Sina Aghaei, Andres Gomez, Phebe Vayanos
- Abstract summary: We propose a flow-based MIP formulation for optimal binary classification trees with a stronger linear programming relaxation.
Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders' decomposition method.
- Score: 8.10995244893652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning optimal binary classification trees.
Literature on the topic has burgeoned in recent years, motivated both by the
empirical suboptimality of heuristic approaches and the tremendous improvements
in mixed-integer programming (MIP) technology. Yet, existing approaches from
the literature do not leverage the power of MIP to its full extent. Indeed,
they rely on weak formulations, resulting in slow convergence and large
optimality gaps. To fill this gap in the literature, we propose a flow-based
MIP formulation for optimal binary classification trees that has a stronger
linear programming relaxation. Our formulation presents an attractive
decomposable structure. We exploit this structure and max-flow/min-cut duality
to derive a Benders' decomposition method, which scales to larger instances. We
conduct extensive computational experiments on standard benchmark datasets, on
which we show that our proposed approaches are 50 times faster than
state-of-the-art MIP-based techniques and improve out-of-sample performance by
up to 13.8%.
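To make the flow idea concrete, below is a minimal sketch of a flow-based MIP in the spirit of the formulation described in the abstract. Everything here is an illustrative assumption rather than the paper's exact model: the depth-1 tree, the toy binary dataset, the variable names b (branching), w (leaf predictions), and z (per-datapoint flows), and the use of the PuLP modeling library with the CBC solver.

```python
# A minimal sketch (not the paper's exact model) of a flow-based MIP for a
# depth-1 classification tree on binary features. The toy data, the variable
# names b/w/z, and the use of PuLP + CBC are illustrative assumptions.
import pulp

# Toy binary dataset: 4 datapoints, 2 features, 2 classes (assumed).
X = [(0, 1), (1, 1), (0, 0), (1, 0)]
y = [0, 1, 0, 1]
n, F, K = len(X), 2, 2

m = pulp.LpProblem("flow_tree_depth1_sketch", pulp.LpMaximize)

# b[f] = 1 iff the root branches on feature f.
b = pulp.LpVariable.dicts("b", range(F), cat="Binary")
# w[(leaf, k)] = 1 iff leaf ('L' or 'R') predicts class k.
w = pulp.LpVariable.dicts("w", [(s, k) for s in "LR" for k in range(K)], cat="Binary")
# z[(i, arc)] = flow of datapoint i on each arc of the tiny source->sink graph.
arcs = ["src_root", "root_L", "root_R", "L_sink", "R_sink"]
z = pulp.LpVariable.dicts("z", [(i, a) for i in range(n) for a in arcs],
                          lowBound=0, upBound=1)

m += pulp.lpSum(b[f] for f in range(F)) == 1            # root picks one feature
for s in "LR":                                          # each leaf picks one class
    m += pulp.lpSum(w[(s, k)] for k in range(K)) == 1

for i in range(n):
    # Flow conservation: source -> root -> one leaf -> sink.
    m += z[(i, "src_root")] == z[(i, "root_L")] + z[(i, "root_R")]
    m += z[(i, "root_L")] == z[(i, "L_sink")]
    m += z[(i, "root_R")] == z[(i, "R_sink")]
    # Routing: i may go left (right) only if the chosen feature is 0 (1) for i.
    m += z[(i, "root_L")] <= pulp.lpSum(b[f] for f in range(F) if X[i][f] == 0)
    m += z[(i, "root_R")] <= pulp.lpSum(b[f] for f in range(F) if X[i][f] == 1)
    # The sink is reachable only through a leaf that predicts i's true label.
    m += z[(i, "L_sink")] <= w[("L", y[i])]
    m += z[(i, "R_sink")] <= w[("R", y[i])]

# Maximize total flow = number of correctly classified datapoints.
m += pulp.lpSum(z[(i, "src_root")] for i in range(n))
m.solve(pulp.PULP_CBC_CMD(msg=False))
print("training accuracy:", pulp.value(m.objective) / n)
```

Each datapoint is a unit of flow from a source through the tree: it can reach the sink only via a leaf predicting its true label, so maximizing total flow maximizes training accuracy. In the paper's Benders' scheme, the per-datapoint flow subproblems are projected out and re-added as cuts derived via max-flow/min-cut duality; the monolithic model above omits that decomposition.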
Related papers
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
- A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models [24.185245582500876]
We introduce FISTAPruner, the first post-training pruner based on convex optimization models and algorithms.
FISTAPruner incorporates an intra-layer cumulative error correction mechanism and supports parallel pruning.
We evaluate FISTAPruner on models such as OPT, LLaMA, LLaMA-2, and LLaMA-3 with 125M to 70B parameters under unstructured and 2:4 semi-structured sparsity.
arXiv Detail & Related papers (2024-08-07T12:33:46Z)
- Rolling Lookahead Learning for Optimal Classification Trees [0.0]
We propose a rolling subtree lookahead algorithm that combines the relative scalability of the myopic approaches with the foresight of the optimal approaches in constructing trees.
We show that our approach outperforms optimal and myopic approaches in 808 out of 1330 problem instances, improving the out-of-sample accuracy by up to 23.6% and 14.4%, respectively.
arXiv Detail & Related papers (2023-04-21T09:17:00Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined via the population loss, that are better suited to active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features [5.663538370244174]
We present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees.
Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
arXiv Detail & Related papers (2022-06-23T17:19:29Z)
- Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we make an early attempt at learning multiclass scoring functions by optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
- Strong Optimal Classification Trees [8.10995244893652]
We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees.
Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees.
We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
arXiv Detail & Related papers (2021-03-29T21:40:58Z)
- Optimal Decision Trees for Nonlinear Metrics [42.18286681448184]
We present a novel algorithm for producing optimal trees for nonlinear metrics.
To the best of our knowledge, this is the first method to compute provably optimal decision trees for nonlinear metrics.
Our approach leads to a trade-off when compared to optimising linear metrics.
arXiv Detail & Related papers (2020-09-15T08:30:56Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state of the art and can handle datasets with tens of thousands of instances (a minimal sketch of the underlying dynamic-programming idea follows this list).
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
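As referenced in the MurTree entry above, below is a minimal sketch of the dynamic-programming idea behind such optimal-tree methods: recursively pick the split that minimizes misclassifications, memoizing on the (instance subset, remaining depth) pair. The toy data, the binary-feature restriction, and the naive frozenset cache key are simplifying assumptions; MurTree itself relies on much more refined bounds, caching, and search.

```python
# A minimal sketch of the dynamic-programming idea behind optimal-tree methods:
# recursively choose the split minimizing misclassifications, memoizing on the
# (instance subset, remaining depth) pair. Toy data and the frozenset cache key
# are simplifying assumptions, not MurTree's actual data structures.
from functools import lru_cache
from collections import Counter

X = [(0, 1), (1, 1), (0, 0), (1, 0)]  # toy binary features (assumed)
y = [0, 1, 0, 1]
F = 2

def leaf_errors(rows):
    """Misclassifications of the best single-class leaf on these rows."""
    counts = Counter(y[i] for i in rows)
    return len(rows) - max(counts.values(), default=0)

@lru_cache(maxsize=None)
def best_tree(rows, depth):
    """Minimum training errors of any tree of depth <= `depth` on `rows`."""
    best = leaf_errors(rows)           # option 1: stop and predict the majority
    if depth == 0 or best == 0:
        return best
    for f in range(F):                 # option 2: split on feature f and recurse
        left = frozenset(i for i in rows if X[i][f] == 0)
        right = rows - left
        if left and right:
            best = min(best, best_tree(left, depth - 1) + best_tree(right, depth - 1))
    return best

print(best_tree(frozenset(range(len(X))), 2))  # -> 0 (a perfect depth-2 tree exists)
```

The subset-keyed cache grows exponentially in general; the practical contribution of methods like MurTree lies precisely in the specialized caching, bounds, and search order that keep this recursion tractable on datasets with tens of thousands of instances.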
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.