Mixed integer linear optimization formulations for learning optimal
binary classification trees
- URL: http://arxiv.org/abs/2206.04857v2
- Date: Sun, 9 Jul 2023 06:34:58 GMT
- Title: Mixed integer linear optimization formulations for learning optimal
binary classification trees
- Authors: Brandon Alston, Hamidreza Validi, Illya V. Hicks
- Abstract summary: We propose four mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees.
We conduct experiments on 13 publicly available datasets to show the models' ability to scale.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision trees are powerful tools for classification and regression that
attract many researchers working in the burgeoning area of machine learning.
One advantage of decision trees over other methods is their interpretability:
they are often preferred over higher-accuracy methods that are relatively
uninterpretable. A binary classification tree has two types of vertices: (i)
branching vertices which have exactly two children and where datapoints are
assessed on a set of discrete features; and (ii) leaf vertices at which
datapoints are given a discrete prediction. An optimal binary classification
tree can be obtained by solving a biobjective optimization problem that seeks
to (i) maximize the number of correctly classified datapoints and (ii) minimize
the number of branching vertices. In this paper, we propose four mixed integer
linear optimization (MILO) formulations for designing optimal binary
classification trees: two flow-based formulations and two cut-based
formulations. We provide theoretical comparisons between our proposed
formulations and the strongest flow-based MILO formulation of Aghaei et al.
(2021). We conduct experiments on 13 publicly available datasets to show the
models' ability to scale and the strength of a biobjective approach using
Pareto frontiers. Our code and data are available on GitHub.
Related papers
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
- Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees [0.0]
We propose two cut-based mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees.
Our models leverage on-the-fly identification of minimal infeasible subsystems (MISs) from which we derive cutting planes that take the form of packing constraints.
arXiv Detail & Related papers (2024-08-02T14:37:28Z)
- Margin Optimal Classification Trees [0.0]
We present a novel mixed-integer formulation for the Optimal Classification Tree (OCT) problem.
Our model, denoted as Margin Optimal Classification Tree (MARGOT), exploits the generalization capabilities of Support Vector Machines for binary classification.
To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT, which include feature selection constraints inducing local sparsity of the hyperplanes.
arXiv Detail & Related papers (2022-10-19T14:08:56Z)
- Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features [5.663538370244174]
We present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees.
Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
arXiv Detail & Related papers (2022-06-23T17:19:29Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies outperform classification based on observed data alone and maintain high accuracy even as the missing-data ratio increases.
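As a rough sketch of the first strategy (impute with k-nearest neighbors, then estimate the covariance), under assumed shapes and parameters; the EM-based second strategy is not shown, and scikit-learn's `KNNImputer` stands in for whatever imputer the paper uses.

```python
# pip install scikit-learn numpy
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical recording: rows are time samples, columns are channels;
# NaN marks a missing value (shapes and rates here are illustrative only).
rng = np.random.default_rng(0)
signal = rng.standard_normal((256, 8))
signal[rng.random(signal.shape) < 0.1] = np.nan   # ~10% of entries missing

# Strategy one from the summary: impute with k-nearest neighbors, then
# estimate the spatial covariance matrix from the completed data.
imputed = KNNImputer(n_neighbors=5).fit_transform(signal)
cov = np.cov(imputed, rowvar=False)               # 8 x 8 channel covariance
print(cov.shape)
```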
arXiv Detail & Related papers (2021-10-19T14:24:50Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold; among them, the consensus similarity graph shared by different views is learned during the feature selection procedure.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- Strong Optimal Classification Trees [8.10995244893652]
We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees.
Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees.
We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
arXiv Detail & Related papers (2021-03-29T21:40:58Z)
- Optimal Decision Trees for Nonlinear Metrics [42.18286681448184]
We show a novel algorithm for producing optimal trees for nonlinear metrics.
To the best of our knowledge, this is the first method to compute provably optimal decision trees for nonlinear metrics.
Our approach leads to a trade-off when compared to optimising linear metrics.
arXiv Detail & Related papers (2020-09-15T08:30:56Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equally reliable workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
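The following is a generic least-squares ranking sketch on a comparison graph, not the paper's specific estimator; the comparison list, the margins, and the zero-sum gauge row are all assumptions.

```python
import numpy as np

# Hypothetical noisy comparisons: (i, j, margin) means object i beat object j
# by `margin`. A least-squares estimate of latent qualities q solves
#   min_q  sum over comparisons of (q_i - q_j - margin)^2
N = 4
comparisons = [(0, 1, 1.2), (1, 2, 0.8), (0, 2, 2.1), (2, 3, 0.5), (1, 3, 1.4)]

A = np.zeros((len(comparisons) + 1, N))
rhs = np.zeros(len(comparisons) + 1)
for row, (i, j, margin) in enumerate(comparisons):
    A[row, i], A[row, j], rhs[row] = 1.0, -1.0, margin
A[-1, :] = 1.0  # gauge row: qualities are identified only up to a shift

q, *_ = np.linalg.lstsq(A, rhs, rcond=None)
print("ranking, best first:", np.argsort(-q))
```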
arXiv Detail & Related papers (2020-02-26T16:19:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.