Related papers: From Roots to Rewards: Dynamic Tree Reasoning with RL

URL: http://arxiv.org/abs/2507.13142v2
Date: Fri, 18 Jul 2025 14:38:54 GMT
Title: From Roots to Rewards: Dynamic Tree Reasoning with RL
Authors: Ahmed Bahloul, Simon Malberg
Abstract summary: Tree-structured reasoning methods mitigate error propagation and knowledge-integration problems by decomposing questions into hierarchical structures and selecting answers through confidence-weighted aggregation of parametric and retrieved knowledge. We present a new paradigm for tree-structured reasoning that balances the reliability of probabilistic frameworks with the flexibility required for real-world question answering systems.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Tree-structured reasoning methods, particularly the Probabilistic Tree-of-Thought (ProbTree) (Cao et al., 2023) framework, mitigate these issues by decomposing questions into hierarchical structures and selecting answers through confidence-weighted aggregation of parametric and retrieved knowledge (Yao et al., 2023). However, ProbTree's static implementation introduces two key limitations: (1) the reasoning tree is fixed during the initial construction phase, preventing dynamic adaptation to intermediate results, and (2) each node requires exhaustive evaluation of all possible solution strategies, creating computational inefficiency. We present a dynamic reinforcement learning (Sutton and Barto, 2018) framework that transforms tree-based reasoning into an adaptive process. Our approach incrementally constructs the reasoning tree based on real-time confidence estimates, while learning optimal policies for action selection (decomposition, retrieval, or aggregation). This maintains ProbTree's probabilistic rigor while improving both solution quality and computational efficiency through selective expansion and focused resource allocation. The work establishes a new paradigm for tree-structured reasoning that balances the reliability of probabilistic frameworks with the flexibility required for real-world question answering systems.
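The action-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the tabular Q-learning policy, the bucketing of confidence into discrete states, the reward signal, and every name below (`ActionPolicy`, `confidence_bucket`, `select_answer`) are assumptions made for illustration. Answer selection is simplified to picking the highest-confidence candidate.

```python
import random
from collections import defaultdict

# The three node-level actions named in the abstract.
ACTIONS = ("decompose", "retrieve", "aggregate")


def confidence_bucket(confidence, n_buckets=5):
    """Map a confidence in [0, 1] to a discrete state index (assumed state design)."""
    clipped = min(max(confidence, 0.0), 1.0)
    return min(int(clipped * n_buckets), n_buckets - 1)


class ActionPolicy:
    """Hypothetical tabular epsilon-greedy Q-learning policy over (state, action)."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.q = defaultdict(float)  # Q-values default to 0.0
        self.epsilon = epsilon       # exploration rate
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.rng = random.Random(seed)

    def select(self, state):
        """Explore with probability epsilon, otherwise take the greedy action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state=None):
        """One-step Q-learning update; next_state=None marks a terminal step."""
        future = 0.0
        if next_state is not None:
            future = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def select_answer(candidates):
    """Return the (answer, confidence) pair with the highest confidence."""
    return max(candidates, key=lambda pair: pair[1])
```

In this sketch a node would be expanded only when the policy picks `decompose` for its current confidence bucket, which is one simple way to picture selective expansion; the actual state features and reward used in the paper are not given in the abstract.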

Related papers

Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
Provably optimal decision trees with arbitrary splitting rules in polynomial time [1.9405875431318445]
We provide the first axiomatic definition of decision trees. We refer to decision trees that satisfy the axioms as proper decision trees. We develop the first provably correct polynomial-time algorithm for solving the optimal decision tree problem.
arXiv Detail & Related papers (2025-03-03T12:14:53Z)
Learning accurate and interpretable tree-based models [27.203303726977616]
We develop approaches to design tree-based learning algorithms given repeated access to data from the same domain. We propose novel parameterized classes of node splitting criteria in top-down algorithms, which interpolate between the popular entropy-based and Gini-impurity-based criteria. We extend our results to tuning popular tree-based ensembles, including random forests and gradient-boosted trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
Divide, Conquer, Combine Bayesian Decision Tree Sampling [1.1879716317856945]
Decision trees are commonly used predictive models due to their flexibility and interpretability. This paper is directed at quantifying the uncertainty of decision tree predictions by employing a Bayesian inference approach.
arXiv Detail & Related papers (2024-03-26T23:14:15Z)
Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees. We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation [8.394633341978007]
Decision tree learning is increasingly being used for pointwise inference. We show that adaptive decision trees can fail to achieve polynomial rates of convergence in the uniform norm with non-vanishing probability. We show that random forests can remedy the situation, turning poorly performing trees into nearly optimal procedures.
arXiv Detail & Related papers (2022-11-19T21:28:30Z)
RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees [47.745218107037786]
We propose RLET, a Reinforcement Learning based Entailment Tree generation framework. RLET iteratively performs single-step reasoning with sentence selection and deduction generation modules. Experiments on three settings of the EntailmentBank dataset demonstrate the strength of using an RL framework.
arXiv Detail & Related papers (2022-10-31T06:45:05Z)
Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects. We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions. Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies [76.83991682238666]
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs). We propose a novel imitation learning framework and introduce new input features and architectures to represent branching.
arXiv Detail & Related papers (2020-02-12T17:43:23Z)