Investigating Intra-Abstraction Policies For Non-exact Abstraction Algorithms
- URL: http://arxiv.org/abs/2510.24297v1
- Date: Tue, 28 Oct 2025 11:00:30 GMT
- Title: Investigating Intra-Abstraction Policies For Non-exact Abstraction Algorithms
- Authors: Robin Schmöcker, Alexander Dockhorn, Bodo Rosenhahn
- Abstract summary: One weakness of Monte Carlo Tree Search (MCTS) is its sample efficiency. We propose and empirically evaluate several alternative intra-abstraction policies.
- Score: 64.43268969806098
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One weakness of Monte Carlo Tree Search (MCTS) is its sample efficiency, which can be addressed by building and using state and/or action abstractions in parallel to the tree search, so that information can be shared among nodes of the same layer. The primary use of abstractions in MCTS is to enhance the Upper Confidence Bound (UCB) value during the tree policy by aggregating the visits and returns of an abstract node. However, this direct use of abstractions does not account for the case where multiple actions with the same parent fall into the same abstract node: these actions then all share the same UCB value, which requires a tiebreak rule. In state-of-the-art abstraction algorithms such as pruned On the Go Abstractions (pruned OGA), this case has gone unnoticed, and a random tiebreak rule was implicitly chosen. In this paper, we propose and empirically evaluate several alternative intra-abstraction policies, several of which outperform the random policy across a majority of environments and parameter settings.
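The mechanism described in the abstract can be illustrated with a minimal sketch: UCB statistics are aggregated per abstract node, so sibling actions in the same abstract node tie, and an intra-abstraction policy must break the tie. The data layout, function names, and the fewest-individual-visits tiebreak below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def ucb_select(children, abstraction, c=math.sqrt(2)):
    """Select a child action by UCB with abstraction-aggregated statistics.

    `children` maps action -> (visits, total_return); `abstraction` maps
    action -> abstract-node id. All names here are illustrative.
    """
    # Aggregate visits and returns over each abstract node, so that
    # information is shared among actions grouped together.
    agg = {}
    for a, (n, r) in children.items():
        k = abstraction[a]
        v, s = agg.get(k, (0, 0.0))
        agg[k] = (v + n, s + r)

    total = sum(n for n, _ in children.values())

    def ucb(a):
        # Each action's UCB value is computed from its abstract node's
        # aggregated statistics, not its own.
        n, r = agg[abstraction[a]]
        return r / n + c * math.sqrt(math.log(total) / n)

    best = max(ucb(a) for a in children)
    # Actions in the same abstract node share one UCB value, so a tiebreak
    # (intra-abstraction policy) is needed; here: fewest individual visits,
    # rather than the implicit random tiebreak.
    tied = [a for a in children if abs(ucb(a) - best) < 1e-12]
    return min(tied, key=lambda a: children[a][0])
```

With this layout, swapping the final `min(...)` line for a random choice recovers the implicit random policy the abstract refers to, and other intra-abstraction policies amount to different keys in that last step.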
Related papers
- Discovering State Equivalences in UCT Search Trees By Action Pruning [64.43268969806098]
We show that Ideal Pruning Abstractions in UCT (IPA-UCT) outperforms OGA-UCT across a large range of test domains and iteration budgets. We also show that both IPA and ASAP are special cases of a more general framework that we call p-ASAP.
arXiv Detail & Related papers (2025-10-30T10:54:43Z) - Grouping Nodes With Known Value Differences: A Lossless UCT-based Abstraction Algorithm [64.43268969806098]
A core challenge of Monte Carlo Tree Search (MCTS) is its sample efficiency, which can be improved by grouping state-action pairs. We break with the paradigm of grouping value-equivalent states or state-action pairs and instead group states and state-action pairs with possibly different values. We call this abstraction framework Known Value Difference Abstractions (KVDA), which infers the value differences by analysis of the immediate rewards.
arXiv Detail & Related papers (2025-10-29T11:03:44Z) - AUPO - Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm [64.43268969806098]
We introduce a novel, drop-in modification to Monte Carlo Tree Search's (MCTS) decision policy that we call AUPO. Comparisons based on a range of IPPC benchmark problems show that AUPO clearly outperforms MCTS.
arXiv Detail & Related papers (2025-10-27T11:04:22Z) - Using Large Language Models for Abstraction of Planning Domains - Extended Version [6.021787236982658]
We model the agent's concrete behaviors in PDDL and investigate the use of in-context learning with large language models (LLMs). We consider three categories of abstractions: abstraction of choice of alternative concrete actions, abstraction of sequences of concrete actions, and abstraction of action/predicate parameters. The generated abstract PDDL domains and problem instances are then checked by symbolic validation tools as well as human experts.
arXiv Detail & Related papers (2025-10-23T06:27:03Z) - Causal Abstraction Inference under Lossy Representations [53.18851962820361]
We introduce a new type of abstraction, called projected abstractions, that generalizes existing definitions to accommodate lossy representations. We show how to construct a projected abstraction from the low-level model and how it translates equivalent observational, interventional, and counterfactual causal queries from the low to the high level.
arXiv Detail & Related papers (2025-09-25T21:20:42Z) - Time-critical and confidence-based abstraction dropping methods [44.99833362998488]
Non-exact abstractions make convergence to the optimal action in the abstract space impossible. We propose two novel abstraction dropping schemes, namely OGA-IAAD and OGA-CAD.
arXiv Detail & Related papers (2025-07-03T15:12:05Z) - Entity Disambiguation with Entity Definitions [50.01142092276296]
Local models have recently attained astounding performance in Entity Disambiguation (ED).
Previous works limited their studies to using, as the textual representation of each candidate, only its Wikipedia title.
In this paper, we address this limitation and investigate to what extent more expressive textual representations can mitigate it.
We report a new state of the art on 2 out of 6 benchmarks we consider and strongly improve the generalization capability over unseen patterns.
arXiv Detail & Related papers (2022-10-11T17:46:28Z) - Elastic Monte Carlo Tree Search with State Abstraction for Strategy Game Playing [58.720142291102135]
Strategy video games challenge AI agents with their search space caused by complex game elements.
State abstraction is a popular technique that reduces the state space complexity.
We propose Elastic MCTS, an algorithm that uses state abstraction to play strategy games.
arXiv Detail & Related papers (2022-05-30T14:18:45Z) - MDP Abstraction with Successor Features [14.433551477386318]
We study abstraction in the context of reinforcement learning, in which agents may perform state or temporal abstractions.
In this work, we propose successor abstraction, a novel abstraction scheme building on successor features.
Our successor abstraction allows us to learn abstract environment models with semantics that are transferable across different environments.
arXiv Detail & Related papers (2021-10-18T11:35:08Z) - Randomized Value Functions via Posterior State-Abstraction Sampling [21.931580762349096]
We argue that an agent seeking out latent task structure must explicitly represent and maintain its uncertainty over that structure.
We introduce a practical algorithm for doing this using two posterior distributions over state abstractions and abstract-state values.
In empirically validating our approach, we find that substantial performance gains lie in the multi-task setting.
arXiv Detail & Related papers (2020-10-05T23:04:18Z) - Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies [76.83991682238666]
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs).
We propose a novel imitation learning framework, and introduce new input features and architectures to represent branching.
arXiv Detail & Related papers (2020-02-12T17:43:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.