ODTlearn: A Package for Learning Optimal Decision Trees for Prediction
and Prescription
- URL: http://arxiv.org/abs/2307.15691v2
- Date: Mon, 13 Nov 2023 01:56:51 GMT
- Title: ODTlearn: A Package for Learning Optimal Decision Trees for Prediction
and Prescription
- Authors: Patrick Vossler, Sina Aghaei, Nathan Justin, Nathanael Jo, Andrés
Gómez, Phebe Vayanos
- Abstract summary: ODTLearn is an open-source Python package for learning optimal decision trees.
It provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks.
- Score: 3.293021585117505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ODTLearn is an open-source Python package that provides methods for learning
optimal decision trees for high-stakes predictive and prescriptive tasks based
on the mixed-integer optimization (MIO) framework proposed in Aghaei et al.
(2019) and several of its extensions. The current version of the package
provides implementations for learning optimal classification trees, optimal
fair classification trees, optimal classification trees robust to distribution
shifts, and optimal prescriptive trees from observational data. We have
designed the package to be easy to maintain and extend as new optimal decision
tree problem classes, reformulation strategies, and solution algorithms are
introduced. To this end, the package follows object-oriented design principles
and supports both commercial (Gurobi) and open source (COIN-OR branch and cut)
solvers. The package documentation and an extensive user guide can be found at
https://d3m-research-group.github.io/odtlearn/. Additionally, users can view
the package source code and submit feature requests and bug reports by visiting
https://github.com/D3M-Research-Group/odtlearn.
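To make the notion of an "optimal" tree concrete, here is a minimal sketch that finds a depth-2 classification tree with the fewest training errors by exhaustive search over binary features. It is a brute-force stand-in for illustration only: it is not the MIO formulation of Aghaei et al., and it does not use ODTLearn's API. The function names `leaf_errors` and `best_depth2_tree` and the toy XOR data are hypothetical.

```python
from collections import Counter
from itertools import product


def leaf_errors(labels):
    """Misclassifications when a leaf predicts its majority label."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_depth2_tree(X, y):
    """Exhaustively search all depth-2 trees over binary (0/1) features.

    Returns (errors, root_feature, left_feature, right_feature). Rows with
    X[i][root] == 0 go to the left child, all other rows to the right child.
    """
    n_features = len(X[0])
    best = None
    for root, left, right in product(range(n_features), repeat=3):
        # Route every sample to one of the (at most) four leaves.
        leaves = {}
        for row, label in zip(X, y):
            child = left if row[root] == 0 else right
            leaves.setdefault((row[root], row[child]), []).append(label)
        errors = sum(leaf_errors(v) for v in leaves.values())
        # Keep the first tree that strictly improves the error count.
        if best is None or errors < best[0]:
            best = (errors, root, left, right)
    return best


# Toy XOR data: no single split separates the classes, but a depth-2 tree does.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
result = best_depth2_tree(X, y)
print(result)  # (0, 0, 1, 1): zero errors, split on feature 0, then feature 1
```

Exhaustive search scales poorly with the number of features and the tree depth, which is one reason to use a solver-based approach such as the MIO framework the abstract describes.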
Related papers
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- BooleanOCT: Optimal Classification Trees based on multivariate Boolean Rules [14.788278997556606]
We introduce a new mixed-integer programming (MIP) formulation to derive the optimal classification tree.
Our methodology integrates both linear metrics, including accuracy, balanced accuracy, and cost-sensitive cost, as well as nonlinear metrics such as the F1-score.
The proposed models demonstrate practical solvability on real-world datasets, effectively handling sizes in the tens of thousands.
arXiv Detail & Related papers (2024-01-29T12:58:44Z)
- BackboneLearn: A Library for Scaling Mixed-Integer Optimization-Based Machine Learning [0.0]
BackboneLearn is a framework for scaling mixed-integer optimization problems with indicator variables to high-dimensional problems.
BackboneLearn is built in Python and is user-friendly and easily implementable.
The source code of BackboneLearn is available on GitHub.
arXiv Detail & Related papers (2023-11-22T21:07:45Z)
- End-to-end Feature Selection Approach for Learning Skinny Trees [13.388576838688202]
We propose a new optimization-based approach for feature selection in tree ensembles.
Skinny Trees is an end-to-end toolkit for feature selection in tree ensembles.
arXiv Detail & Related papers (2023-10-28T00:15:10Z)
- bsnsing: A decision tree induction method based on recursive optimal boolean rule composition [2.28438857884398]
This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process.
It develops an efficient search solver that is able to solve practical instances faster than commercial solvers.
arXiv Detail & Related papers (2022-05-30T17:13:57Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
- OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.