Related papers: Discovering State Equivalences in UCT Search Trees By Action Pruning

Discovering State Equivalences in UCT Search Trees By Action Pruning

URL: http://arxiv.org/abs/2510.26346v1
Date: Thu, 30 Oct 2025 10:54:43 GMT
Title: Discovering State Equivalences in UCT Search Trees By Action Pruning
Authors: Robin Schmöcker, Alexander Dockhorn, Bodo Rosenhahn,
Abstract summary: We show that Ideal Pruning Abstractions in UCT (IPA-UCT) outperforms OGA-UCT across a large range of test domains and iteration budgets.<n>We also show that both IPA and ASAP are special cases of a more general framework that we call p-ASAP.
Score: 64.43268969806098
License: http://creativecommons.org/licenses/by/4.0/
Abstract: One approach to enhance Monte Carlo Tree Search (MCTS) is to improve its sample efficiency by grouping/abstracting states or state-action pairs and sharing statistics within a group. Though state-action pair abstractions are mostly easy to find in algorithms such as On the Go Abstractions in Upper Confidence bounds applied to Trees (OGA-UCT), nearly no state abstractions are found in either noisy or large action space settings due to constraining conditions. We provide theoretical and empirical evidence for this claim, and we slightly alleviate this state abstraction problem by proposing a weaker state abstraction condition that trades a minor loss in accuracy for finding many more abstractions. We name this technique Ideal Pruning Abstractions in UCT (IPA-UCT), which outperforms OGA-UCT (and any of its derivatives) across a large range of test domains and iteration budgets as experimentally validated. IPA-UCT uses a different abstraction framework from Abstraction of State-Action Pairs (ASAP) which is the one used by OGA-UCT, which we name IPA. Furthermore, we show that both IPA and ASAP are special cases of a more general framework that we call p-ASAP which itself is a special case of the ASASAP framework.

Related papers

Grouping Nodes With Known Value Differences: A Lossless UCT-based Abstraction Algorithm [64.43268969806098]
A core challenge of Monte Carlo Tree Search (MCTS) is its sample efficiency, which can be improved by grouping state-action pairs.<n>We break with the paradigm of grouping value-equivalent states or state-action pairs and instead group states and state-action pairs with possibly different values.<n>We call this abstraction framework Known Value Difference Abstractions ( KVDA), which infers the value differences by analysis of the immediate rewards.
arXiv Detail & Related papers (2025-10-29T11:03:44Z)
Investigating Intra-Abstraction Policies For Non-exact Abstraction Algorithms [64.43268969806098]
One weakness of Monte Carlo Tree Search (MCTS) is its sample efficiency.<n>We propose and empirically evaluate several alternative intra-abstraction policies.
arXiv Detail & Related papers (2025-10-28T11:00:30Z)
AUPO - Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm [64.43268969806098]
We introduce a novel, drop-in modification to Monte Carlo Tree Search's (MCTS) decision policy that we call AUPO.<n> Comparisons based on a range of IPPC benchmark problems show that AUPO clearly outperforms MCTS.
arXiv Detail & Related papers (2025-10-27T11:04:22Z)
Causal Abstraction Inference under Lossy Representations [53.18851962820361]
We introduce a new type of abstractions called projected abstractions that generalize existing definitions to accommodate lossy representations.<n>We show how to construct a projected abstraction from the low-level model and how it translates equivalent observational, interventional, and counterfactual causal queries from low to high-level.
arXiv Detail & Related papers (2025-09-25T21:20:42Z)
Time-critical and confidence-based abstraction dropping methods [44.99833362998488]
Non-exact abstractions make convergence to the optimal action in the abstract space impossible.<n>We propose two novel abstraction dropping schemes, namely OGA-IAAD and OGA-CAD.
arXiv Detail & Related papers (2025-07-03T15:12:05Z)
Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction [27.53460927687747]
We propose an action abstraction based on the compositional structure between a state and sub-actions. Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition on the current state.
arXiv Detail & Related papers (2024-06-02T04:31:30Z)
Hierarchical State Abstraction Based on Structural Information Principles [70.24495170921075]
We propose a novel mathematical Structural Information principles-based State Abstraction framework, namely SISA, from the information-theoretic perspective. SISA is a general framework that can be flexibly integrated with different representation-learning objectives to improve their performances further.
arXiv Detail & Related papers (2023-04-24T11:06:52Z)
A Direct Approximation of AIXI Using Logical State Abstractions [6.570488724773507]
We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents. We address the problem of selecting the right subset of features to form state abstractions. Exact Bayesian model learning is then achieved using a suitable generalisation of Context Tree Weighting over abstract state sequences.
arXiv Detail & Related papers (2022-10-13T11:30:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.