Learning Optimal Classification Trees: Strong Max-Flow Formulations
- URL: http://arxiv.org/abs/2002.09142v2
- Date: Wed, 13 May 2020 02:24:34 GMT
- Title: Learning Optimal Classification Trees: Strong Max-Flow Formulations
- Authors: Sina Aghaei, Andres Gomez, Phebe Vayanos
- Abstract summary: We propose a flow-based MIP formulation for optimal binary classification trees with a stronger linear programming relaxation.
Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders' decomposition method.
- Score: 8.10995244893652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning optimal binary classification trees.
Literature on the topic has burgeoned in recent years, motivated both by the
empirical suboptimality of heuristic approaches and the tremendous improvements
in mixed-integer programming (MIP) technology. Yet, existing approaches from
the literature do not leverage the power of MIP to its full extent. Indeed,
they rely on weak formulations, resulting in slow convergence and large
optimality gaps. To fill this gap in the literature, we propose a flow-based
MIP formulation for optimal binary classification trees that has a stronger
linear programming relaxation. Our formulation presents an attractive
decomposable structure. We exploit this structure and max-flow/min-cut duality
to derive a Benders' decomposition method, which scales to larger instances. We
conduct extensive computational experiments on standard benchmark datasets on
which we show that our proposed approaches are 50 times faster than
state-of-the-art MIP-based techniques and improve out-of-sample performance by
up to 13.8%.
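The Benders' decomposition described in the abstract rests on max-flow/min-cut duality: when the subproblem is a max-flow problem, the dual cut needed for a Benders' feasibility/optimality cut can be read off a minimum cut in the residual graph. The paper's actual formulation is not reproduced here; as a self-contained illustration of the underlying duality only, the following pure-Python Edmonds-Karp routine computes a maximum flow and returns the source side of a corresponding minimum cut (the vertices still reachable in the final residual graph):

```python
from collections import deque, defaultdict

def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max flow.

    capacity: dict mapping u -> {v: capacity of arc (u, v)}.
    Returns (max-flow value, source side of a minimum cut).
    """
    # Residual capacities; reverse edges start at 0 and grow as flow is pushed.
    res = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            res[u][v] += c

    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        q = deque([source])
        while q and sink not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            # No augmenting path: by max-flow/min-cut duality, the vertices
            # reachable from the source delimit a minimum cut whose capacity
            # equals the flow value.
            return total, set(parent)
        # Find the bottleneck capacity along the augmenting path.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        # Push the bottleneck flow, updating residual capacities both ways.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        total += bottleneck
```

For example, on a small network with arcs s→a (3), s→b (2), a→b (1), a→t (2), b→t (3), the routine returns a flow of 5 and the cut {s}, since both arcs out of s are saturated.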
Related papers
- Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound [1.3654846342364308]
Computing an optimal classification tree that maximizes training performance within a given size limit is NP-hard, and in practice, most state-of-the-art methods do not scale beyond computing optimal trees of depth three.
We propose a novel algorithm that optimizes trees directly on the continuous feature data using dynamic programming with branch-and-bound.
Our experiments demonstrate that these techniques improve runtime by one or more orders of magnitude over state-of-the-art optimal methods and improve test accuracy by 5% over greedy heuristics.
arXiv Detail & Related papers (2025-01-14T07:46:33Z) - Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z) - Rolling Lookahead Learning for Optimal Classification Trees [0.0]
We propose a rolling subtree lookahead algorithm that combines the relative scalability of the myopic approaches with the foresight of the optimal approaches in constructing trees.
We show that our approach outperforms optimal and myopic approaches in 808 out of 1330 problem instances, improving the out-of-sample accuracy by up to 23.6% and 14.4%, respectively.
arXiv Detail & Related papers (2023-04-21T09:17:00Z) - A Mathematical Programming Approach to Optimal Classification Forests [1.0499611180329806]
This paper introduces Weighted Optimal Classification Forests (WOCFs).
WOCFs take advantage of an optimal ensemble of decision trees to derive accurate and interpretable classifiers.
Overall, WOCFs complement existing methods such as CART, Optimal Classification Trees, Random Forests and XGBoost.
arXiv Detail & Related papers (2022-11-18T20:33:08Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features [5.663538370244174]
We present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees.
Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
arXiv Detail & Related papers (2022-06-23T17:19:29Z) - Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we take an early step toward learning multiclass scoring functions by directly optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z) - Strong Optimal Classification Trees [8.10995244893652]
We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees.
Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees.
We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
arXiv Detail & Related papers (2021-03-29T21:40:58Z) - Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.