Margin Optimal Classification Trees
- URL: http://arxiv.org/abs/2210.10567v1
- Date: Wed, 19 Oct 2022 14:08:56 GMT
- Title: Margin Optimal Classification Trees
- Authors: Federico D'Onofrio, Giorgio Grani, Marta Monaci and Laura Palagi
- Abstract summary: We present a novel mixed-integer formulation for the Optimal Classification Tree (OCT) problem.
Our model, denoted as Margin Optimal Classification Tree (MARGOT), exploits the generalization capabilities of Support Vector Machines for binary classification.
To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT, which include feature selection constraints inducing local sparsity of the hyperplanes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years there has been growing attention to interpretable machine
learning models, which can provide explanatory insights into their behavior. Thanks
to their interpretability, decision trees have been intensively studied for
classification tasks, and due to the remarkable advances in mixed-integer
programming (MIP), various approaches have been proposed to formulate the
problem of training an Optimal Classification Tree (OCT) as a MIP model. We
present a novel mixed-integer quadratic formulation for the OCT problem, which
exploits the generalization capabilities of Support Vector Machines for binary
classification. Our model, denoted as Margin Optimal Classification Tree
(MARGOT), encompasses the use of maximum margin multivariate hyperplanes nested
in a binary tree structure. To enhance the interpretability of our approach, we
analyse two alternative versions of MARGOT, which include feature selection
constraints inducing local sparsity of the hyperplanes. First, MARGOT has been
tested on non-linearly separable synthetic datasets in 2-dimensional feature
space to provide a graphical representation of the maximum margin approach.
Then, the proposed models have been tested on benchmark datasets from the
UCI repository. The MARGOT formulation turns out to be easier to solve than
other OCT approaches, and the generated tree generalizes better to new
observations. The two interpretable versions are effective in selecting the
most relevant features while maintaining good prediction quality.
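To make the nested maximum-margin idea concrete, here is a minimal sketch, assuming scikit-learn and NumPy are available: it greedily fits a soft-margin linear SVM at each branch node of a shallow tree. The greedy recursion, function names, and tree encoding are illustrative only; MARGOT itself solves a single mixed-integer quadratic program that optimizes all of the tree's hyperplanes jointly.

```python
# Illustrative sketch only: a greedy "margin tree" with an SVM split at
# each branch node. MARGOT instead optimizes all splits jointly via MIQP.
import numpy as np
from sklearn.svm import LinearSVC

def fit_margin_tree(X, y, depth=2, C=1.0):
    """Greedily fit maximum-margin multivariate splits (hypothetical helper)."""
    classes, counts = np.unique(y, return_counts=True)
    if depth == 0 or len(classes) < 2:
        return {"leaf": classes[np.argmax(counts)]}     # majority-class leaf
    svm = LinearSVC(C=C).fit(X, y)                      # soft-margin hyperplane
    goes_right = svm.decision_function(X) >= 0          # route points by sign
    if goes_right.all() or (~goes_right).all():         # degenerate split
        return {"leaf": classes[np.argmax(counts)]}
    return {"svm": svm,
            "left":  fit_margin_tree(X[~goes_right], y[~goes_right], depth - 1, C),
            "right": fit_margin_tree(X[goes_right],  y[goes_right],  depth - 1, C)}

def predict_one(tree, x):
    while "leaf" not in tree:
        side = "right" if tree["svm"].decision_function([x])[0] >= 0 else "left"
        tree = tree[side]
    return tree["leaf"]
```

Each branch node stores one multivariate hyperplane, which is what makes the local feature-selection constraints of the sparse MARGOT variants meaningful at the node level.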
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (arXiv, 2024-10-03)
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
- Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees (arXiv, 2024-08-02)
We propose two cut-based mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees.
Our models leverage on-the-fly identification of minimal infeasible subsystems (MISs), from which we derive cutting planes that take the form of packing constraints.
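For intuition (a generic illustration, not the paper's exact constraint): if S indexes a minimal infeasible subsystem over binary variables z_i, the standard combinatorial cut states that the choices in S cannot all be active at once, which has the packing form

```latex
\sum_{i \in S} z_i \;\le\; |S| - 1 .
```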
- Hierarchical clustering with dot products recovers hidden tree structure (arXiv, 2023-05-24)
In this paper we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
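A minimal sketch of that merge rule, assuming NumPy (function name and return format are hypothetical): at each step, the pair of clusters with the highest average pairwise dot product is merged.

```python
# Illustrative sketch: agglomerative clustering that merges by maximum
# average dot product, as recommended above. Returns the merge sequence,
# which encodes the estimated hierarchy.
import numpy as np

def merge_by_avg_dot(X):
    clusters = [[i] for i in range(len(X))]
    G = X @ X.T                                   # all pairwise dot products
    merges = []
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = G[np.ix_(clusters[a], clusters[b])].mean()
                if score > best:
                    best, pair = score, (a, b)
        a, b = pair
        merges.append((list(clusters[a]), list(clusters[b]), best))
        clusters[a] += clusters[b]
        del clusters[b]
    return merges
```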
- Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces (arXiv, 2022-07-02)
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
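One common way to turn a tree ensemble into a kernel, sketched below with scikit-learn (this illustrates the general idea, not the paper's exact Gaussian-process construction): two points are similar in proportion to how often they share a leaf across the ensemble.

```python
# Illustrative leaf-agreement kernel: K[i, j] is the fraction of trees in
# which points A[i] and B[j] fall into the same leaf.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def leaf_agreement_kernel(forest, A, B):
    leaves_a = forest.apply(A)   # shape (n_a, n_trees): leaf index per tree
    leaves_b = forest.apply(B)
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).mean(axis=2)

# Usage sketch:
# forest = RandomForestRegressor(n_estimators=100).fit(X_train, y_train)
# K = leaf_agreement_kernel(forest, X_test, X_train)
```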
- Mixed integer linear optimization formulations for learning optimal binary classification trees (arXiv, 2022-06-10)
We propose four mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees.
We conduct experiments on 13 publicly available datasets to show the models' ability to scale.
- Multiclass Optimal Classification Trees with SVM-splits (arXiv, 2021-11-16)
We present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances.
Our approach consists of building Classification Trees in which, except for the leaf nodes, the labels are temporarily left out and grouped into two classes by means of an SVM separating hyperplane.
- Strong Optimal Classification Trees (arXiv, 2021-03-29)
We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees.
Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees.
We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
- Cauchy-Schwarz Regularized Autoencoder (arXiv, 2021-01-06)
Variational autoencoders (VAEs) are a powerful and widely used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
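For reference, the Cauchy-Schwarz divergence between densities p and q is defined as follows; for Gaussian mixtures, every integral reduces to closed-form Gaussian evaluations via the standard product identity, which is what makes the objective analytic:

```latex
D_{CS}(p, q) = -\log \frac{\int p(x)\, q(x)\, dx}
                          {\sqrt{\int p(x)^2\, dx \,\int q(x)^2\, dx}},
\qquad
\int \mathcal{N}(x;\mu_1,\Sigma_1)\, \mathcal{N}(x;\mu_2,\Sigma_2)\, dx
  = \mathcal{N}(\mu_1;\, \mu_2,\ \Sigma_1 + \Sigma_2).
```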
- Robust Optimal Classification Trees under Noisy Labels (arXiv, 2020-12-15)
We propose a novel methodology to construct Optimal Classification Trees that takes into account that noisy labels may occur in the training sample.
Our approach rests on two main elements: (1) the splitting rules for the classification trees are designed to maximize the separation margin between classes, applying the SVM paradigm; and (2) some labels of the training sample are allowed to change during tree construction, in order to detect label noise.
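Element (1) builds on the standard soft-margin SVM problem for labels y_i in {-1, +1}, shown below for reference (the paper's model additionally introduces variables that allow suspected noisy labels to flip during training, which this sketch does not capture):

```latex
\min_{w,\, b,\, \xi} \ \frac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .
```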
- MurTree: Optimal Classification Trees via Dynamic Programming and Search (arXiv, 2020-07-24)
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
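A hedged sketch of the dynamic-programming idea (binary features; the data layout and names are hypothetical, and MurTree adds many further refinements such as caching and specialized depth-2 routines): the optimal misclassification count for a subdataset under a depth budget is the best, over all feature splits, of the sum of the optimal left and right subtrees.

```python
# Illustrative DP for depth-bounded optimal trees on binary features.
from functools import lru_cache

DATA = (((1, 0), 1), ((0, 1), 0), ((1, 1), 1), ((0, 0), 0))  # ((features), label)

def leaf_cost(rows):
    ones = sum(label for _, label in rows)
    return min(ones, len(rows) - ones)  # errors of the best constant prediction

@lru_cache(maxsize=None)
def best_tree_cost(rows, depth):
    best = leaf_cost(rows)
    if depth == 0 or best == 0:
        return best
    for f in range(len(rows[0][0])):
        left = tuple(r for r in rows if r[0][f] == 0)
        right = tuple(r for r in rows if r[0][f] == 1)
        if left and right:              # memoized subproblems below
            best = min(best, best_tree_cost(left, depth - 1)
                             + best_tree_cost(right, depth - 1))
    return best

print(best_tree_cost(DATA, depth=2))    # -> 0 for this toy dataset
```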
- ENTMOOT: A Framework for Optimization over Ensemble Tree Models (arXiv, 2020-03-10)
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.