Related papers: A Mathematical Programming Approach to Optimal Classification Forests

A Mathematical Programming Approach to Optimal Classification Forests

URL: http://arxiv.org/abs/2211.10502v3
Date: Fri, 29 Nov 2024 12:48:55 GMT
Title: A Mathematical Programming Approach to Optimal Classification Forests
Authors: Víctor Blanco, Alberto Japón, Justo Puerto, Peter Zhang,
Abstract summary: This paper introduces Weighted Optimal Classification Forests (WOCFs)<n>WOCFs take advantage of an optimal ensemble of decision trees to derive accurate and interpretable classifiers.<n>Overall, WOCFs complement existing methods such as CART, Optimal Classification Trees, Random Forests and XGBoost.
Score: 1.0499611180329806
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This paper introduces Weighted Optimal Classification Forests (WOCFs), a new family of classifiers that takes advantage of an optimal ensemble of decision trees to derive accurate and interpretable classifiers. We propose a novel mathematical optimization-based methodology which simultaneously constructs a given number of trees, each of them providing a predicted class for the observations in the feature space. The classification rule is derived by assigning to each observation its most frequently predicted class among the trees. We provide a mixed integer linear programming formulation (MIP) for the problem and several novel MIP strengthening / scaling techniques. We report the results of our computational experiments, from which we conclude that our method has equal or superior performance compared with state-of-the-art tree-based classification methods for small to medium-sized instances. We also present three real-world case studies showing that our methodology has very interesting implications in terms of interpretability. Overall, WOCFs complement existing methods such as CART, Optimal Classification Trees, Random Forests and XGBoost. In addition to its Pareto improvement on accuracy and interpretability, we also see unique properties emerging in terms of different trees focusing on different feature variables. This provides nontrivial improvement in interpretability and usability of the trained model in terms of counterfactual explanation. Thus, despite the apparent computational challenge of WOCFs that limit the size of the problems that can be efficiently solved with current MIP, this is an important research direction that can lead to qualitatively different insights for researchers and complement the toolbox of practitioners for high stakes problems.

Related papers

Decision Tree Induction Through LLMs via Semantically-Aware Evolution [53.0367886783772]
We propose an evolutionary optimization method for decision tree induction based on genetic programming (GP) Our key innovation is the integration of semantic priors and domain-specific knowledge about the search space into the algorithm. This is operationalized through novel genetic operators that work with structured natural language prompts.
arXiv Detail & Related papers (2025-03-18T12:52:03Z)
Enriched Functional Tree-Based Classifiers: A Novel Approach Leveraging Derivatives and Geometric Features [0.0]
This study introduces an advanced methodology for supervised classification by integrating Functional Data Analysis (FDA) with tree-based ensemble techniques for classifying high-dimensional time series.
arXiv Detail & Related papers (2024-09-26T12:57:47Z)
Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation. DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level. To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees. We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles. We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior. The proposed model is effective in yielding a shallow interpretable tree approxing the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search. Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function. Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z)
Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules. We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z)
Multiclass Optimal Classification Trees with SVM-splits [1.5039745292757671]
We present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances. Our approach consists of building Classification Trees in which, except for the leaf nodes, the labels are temporarily left out and grouped into two classes by means of a SVM separating hyperplane.
arXiv Detail & Related papers (2021-11-16T18:15:56Z)
Optimal randomized classification trees [0.0]
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures.
arXiv Detail & Related papers (2021-10-19T11:41:12Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning [62.82046926149371]
We propose a generic model transfer scheme to make Convlutional Neural Networks (CNNs) interpretable. We achieve this by building a differentiable decision forest on top of CNNs. We name the transferred model deep Dynamic Sequential Decision Forest (dDSDF)
arXiv Detail & Related papers (2021-06-05T07:41:18Z)
Strong Optimal Classification Trees [8.10995244893652]
We propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. We show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques.
arXiv Detail & Related papers (2021-03-29T21:40:58Z)
Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first modernally precise analysis of linear multiclass classification. Our analysis reveals that the classification accuracy is highly distribution-dependent. The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm. We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
Sparsity in Optimal Randomized Classification Trees [3.441021278275805]
We propose a continuous optimization approach to build sparse optimal classification trees, based on oblique cuts. Both types of sparsity, namely local and global, are modeled by means of regularizations with polyhedral norms. Unlike greedy approaches, our ability to easily trade in some of our classification accuracy for a gain in global sparsity is shown.
arXiv Detail & Related papers (2020-02-21T09:09:59Z)
Learning Optimal Classification Trees: Strong Max-Flow Formulations [8.10995244893652]
We propose a flow-based MIP formulation for optimal binary classification trees with a stronger linear programming relaxation. Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders' decomposition method.
arXiv Detail & Related papers (2020-02-21T05:58:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.