In Search of Trees: Decision-Tree Policy Synthesis for Black-Box Systems via Search
- URL: http://arxiv.org/abs/2409.03260v1
- Date: Thu, 5 Sep 2024 05:51:42 GMT
- Title: In Search of Trees: Decision-Tree Policy Synthesis for Black-Box Systems via Search
- Authors: Emir Demirović, Christian Schilling, Anna Lukina,
- Abstract summary: We present an approach to synthesise optimal decision-tree policies given a black-box environment and specification.
Our approach is a specialised search algorithm which systematically explores the space of decision trees under the given discretisation.
- Score: 6.74890780471356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision trees, owing to their interpretability, are attractive as control policies for (dynamical) systems. Unfortunately, constructing, or synthesising, such policies is a challenging task. Previous approaches do so by imitating a neural-network policy, approximating a tabular policy obtained via formal synthesis, employing reinforcement learning, or modelling the problem as a mixed-integer linear program. However, these works may require access to a hard-to-obtain accurate policy or a formal model of the environment (within reach of formal synthesis), and may not provide guarantees on the quality or size of the final tree policy. In contrast, we present an approach to synthesise optimal decision-tree policies given a black-box environment and specification, and a discretisation of the tree predicates, where optimality is defined with respect to the number of steps to achieve the goal. Our approach is a specialised search algorithm which systematically explores the (exponentially large) space of decision trees under the given discretisation. The key component is a novel pruning mechanism that significantly reduces the search space. Our approach represents a conceptually novel way of synthesising small decision-tree policies with optimality guarantees even for black-box environments with black-box specifications.
Related papers
- Statistical Analysis of Policy Space Compression Problem [54.1754937830779]
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems.
Reducing the policy space through policy compression emerges as a powerful, reward-free approach to accelerate the learning process.
This technique condenses the policy space into a smaller, representative set while maintaining most of the original effectiveness.
arXiv Detail & Related papers (2024-11-15T02:46:55Z) - SYMPOL: Symbolic Tree-Based On-Policy Reinforcement Learning [9.035959289139102]
We introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL.
SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions.
We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches.
arXiv Detail & Related papers (2024-08-16T14:04:40Z) - Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.
In common practice, convergence (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z) - A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy
Search Over Policy Trees [5.250288418639076]
We propose an online POMDP solver called Lazy Cross-Entropy Search Over Policy Trees (LCEOPT)
At each planning step, our method uses a novel lazy Cross-Entropy method to search the space of policy trees.
Our method is surprisingly simple as compared to existing state-of-the-art methods, yet empirically outperforms them on several continuous-action POMDP problems.
arXiv Detail & Related papers (2023-05-14T03:12:53Z) - Optimal Decision Tree Policies for Markov Decision Processes [7.995360025953931]
We study the optimization of size-limited decision trees for Markov Decision Processes (MPDs)
We show that this is due to an inherent shortcoming of imitation learning, namely that complex policies cannot be represented using size-limited trees.
While there is generally a trade-off between the performance and interpretability of machine learning models, we find that OMDTs limited to a depth of 3 often perform close to the optimal limit.
arXiv Detail & Related papers (2023-01-30T18:51:02Z) - Policy learning for many outcomes of interest: Combining optimal policy
trees with multi-objective Bayesian optimisation [0.0]
Multi-Objective Policy Learning combines optimal decision trees for policy learning with a multi-objective Bayesian optimisation approach.
The method is applied to a real-world case-study of non-price rationing of anti-malarial medication in Kenya.
arXiv Detail & Related papers (2022-12-13T01:39:14Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - Policy Manifold Search: Exploring the Manifold Hypothesis for
Diversity-based Neuroevolution [4.920145245773581]
This paper proposes a novel method for diversity-based policy search via Neuroevolution.
We use the Quality-Diversity framework which provides a principled approach to policy search.
We also use the Jacobian of the inverse-mapping function to guide the search in the representation space.
arXiv Detail & Related papers (2021-04-27T18:52:03Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.