Interpretable Decision Tree Search as a Markov Decision Process
- URL: http://arxiv.org/abs/2309.12701v4
- Date: Thu, 13 Jun 2024 11:33:26 GMT
- Title: Interpretable Decision Tree Search as a Markov Decision Process
- Authors: Hector Kohler, Riad Akrour, Philippe Preux,
- Abstract summary: Finding an optimal decision tree for a supervised learning task is a challenging problem to solve at scale.
It was recently proposed to frame the problem as a Markov Decision Problem (MDP) and use deep reinforcement learning to tackle scaling.
We propose instead to scale the resolution of such MDPs using an information-theoretic tests generating function that generates for every state.
- Score: 8.530182510074983
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding an optimal decision tree for a supervised learning task is a challenging combinatorial problem to solve at scale. It was recently proposed to frame the problem as a Markov Decision Problem (MDP) and use deep reinforcement learning to tackle scaling. Unfortunately, these methods are not competitive with the current branch-and-bound state-of-the-art. We propose instead to scale the resolution of such MDPs using an information-theoretic tests generating function that heuristically, and dynamically for every state, limits the set of admissible test actions to a few good candidates. As a solver, we show empirically that our algorithm is at the very least competitive with branch-and-bound alternatives. As a machine learning tool, a key advantage of our approach is to solve for multiple complexity-performance trade-offs at virtually no additional cost. With such a set of solutions, a user can then select the tree that generalizes best and which has the interpretability level that best suits their needs, which no current branch-and-bound method allows.
Related papers
- An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem [5.092968949752308]
This work proposes an unsupervised learning framework combined with Maximums for MMCP.
A crucial observation is that each solution corresponds to at least one spanning tree.
We conduct extensive experiments to evaluate our framework and give a specific application.
arXiv Detail & Related papers (2024-08-16T02:07:34Z) - TreeDQN: Learning to minimize Branch-and-Bound tree [78.52895577861327]
Branch-and-Bound is a convenient approach to solving optimization tasks in the form of Mixed Linear Programs.
The efficiency of the solver depends on the branchning used to select a variable for splitting.
We propose a reinforcement learning method that can efficiently learn the branching.
arXiv Detail & Related papers (2023-06-09T14:01:26Z) - Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees
with Continuous Features [5.663538370244174]
We present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees.
Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
arXiv Detail & Related papers (2022-06-23T17:19:29Z) - Reinforcement Learning for Branch-and-Bound Optimisation using
Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching; a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across non-family problems.
We show that for weakly-ized objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - What's Wrong with Deep Learning in Tree Search for Combinatorial
Optimization [8.879790406465556]
We present an open-source benchmark suite for the NP-hard Maximum Independent Set problem, in both its weighted and unweighted variants.
We also conduct an in-depth analysis of the popular guided tree search algorithm by Li et al. [NeurIPS 2018], testing various configurations on small and large synthetic and real-world graphs.
We show that the graph convolution network used in the tree search does not learn a meaningful representation of the solution structure, and can in fact be replaced by random values.
arXiv Detail & Related papers (2022-01-25T17:37:34Z) - Modularity in Reinforcement Learning via Algorithmic Independence in
Credit Assignment [79.5678820246642]
We show that certain action-value methods are more sample efficient than policy-gradient methods on transfer problems that require only sparse changes to a sequence of previously optimal decisions.
We generalize the recently proposed societal decision-making framework as a more granular formalism than the Markov decision process.
arXiv Detail & Related papers (2021-06-28T21:29:13Z) - Learning to Schedule Heuristics in Branch-and-Bound [25.79025327341732]
Real-world applications typically require finding good solutions early in the search to enable fast decision-making.
We propose the first data-driven framework for schedulings in an exact MIP solver.
Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on a class of challenging instances.
arXiv Detail & Related papers (2021-03-18T14:49:52Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.