AUPO - Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm
- URL: http://arxiv.org/abs/2510.23214v1
- Date: Mon, 27 Oct 2025 11:04:22 GMT
- Title: AUPO - Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm
- Authors: Robin Schmöcker, Alexander Dockhorn, Bodo Rosenhahn,
- Abstract summary: We introduce a novel, drop-in modification to Monte Carlo Tree Search's (MCTS) decision policy that we call AUPO.<n> Comparisons based on a range of IPPC benchmark problems show that AUPO clearly outperforms MCTS.
- Score: 64.43268969806098
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel, drop-in modification to Monte Carlo Tree Search's (MCTS) decision policy that we call AUPO. Comparisons based on a range of IPPC benchmark problems show that AUPO clearly outperforms MCTS. AUPO is an automatic action abstraction algorithm that solely relies on reward distribution statistics acquired during the MCTS. Thus, unlike other automatic abstraction algorithms, AUPO requires neither access to transition probabilities nor does AUPO require a directed acyclic search graph to build its abstraction, allowing AUPO to detect symmetric actions that state-of-the-art frameworks like ASAP struggle with when the resulting symmetric states are far apart in state space. Furthermore, as AUPO only affects the decision policy, it is not mutually exclusive with other abstraction techniques that only affect the tree search.
Related papers
- Discovering State Equivalences in UCT Search Trees By Action Pruning [64.43268969806098]
We show that Ideal Pruning Abstractions in UCT (IPA-UCT) outperforms OGA-UCT across a large range of test domains and iteration budgets.<n>We also show that both IPA and ASAP are special cases of a more general framework that we call p-ASAP.
arXiv Detail & Related papers (2025-10-30T10:54:43Z) - Grouping Nodes With Known Value Differences: A Lossless UCT-based Abstraction Algorithm [64.43268969806098]
A core challenge of Monte Carlo Tree Search (MCTS) is its sample efficiency, which can be improved by grouping state-action pairs.<n>We break with the paradigm of grouping value-equivalent states or state-action pairs and instead group states and state-action pairs with possibly different values.<n>We call this abstraction framework Known Value Difference Abstractions ( KVDA), which infers the value differences by analysis of the immediate rewards.
arXiv Detail & Related papers (2025-10-29T11:03:44Z) - Investigating Intra-Abstraction Policies For Non-exact Abstraction Algorithms [64.43268969806098]
One weakness of Monte Carlo Tree Search (MCTS) is its sample efficiency.<n>We propose and empirically evaluate several alternative intra-abstraction policies.
arXiv Detail & Related papers (2025-10-28T11:00:30Z) - Time-critical and confidence-based abstraction dropping methods [44.99833362998488]
Non-exact abstractions make convergence to the optimal action in the abstract space impossible.<n>We propose two novel abstraction dropping schemes, namely OGA-IAAD and OGA-CAD.
arXiv Detail & Related papers (2025-07-03T15:12:05Z) - Facial Action Unit Detection by Adaptively Constraining Self-Attention and Causally Deconfounding Sample [53.23474626420103]
Facial action unit (AU) detection remains a challenging task, due to the subtlety, dynamics, and diversity of AUs.
We propose a novel AU detection framework called AC2D by adaptively constraining self-attention weight distribution.
Our method achieves competitive performance compared to state-of-the-art AU detection approaches on challenging benchmarks.
arXiv Detail & Related papers (2024-10-02T05:51:24Z) - Accelerating Monte Carlo Tree Search with Probability Tree State
Abstraction [11.49169644917995]
We propose a novel probability tree state abstraction (PTSA) algorithm to improve the search efficiency of Monte Carlo Tree Search (MCTS)
A general tree state abstraction with path transitivity is defined. In addition, the probability tree state abstraction is proposed for fewer mistakes during the aggregation step.
Experimental results on different tasks demonstrate that our method can accelerate the training process of state-of-the-art algorithms with 10%-45% search space reduction.
arXiv Detail & Related papers (2023-10-10T10:55:12Z) - Counterexample Guided Abstraction Refinement with Non-Refined
Abstractions for Multi-Agent Path Finding [15.99072005190786]
We propose a novel CEGAR-style solver for MAPF based on SAT in which some abstractions are deliberately left non-refined.
Non-refining however yields order-of-magnitude smaller SAT encodings than those of the previous approach.
arXiv Detail & Related papers (2023-01-20T17:18:49Z) - A Direct Approximation of AIXI Using Logical State Abstractions [6.570488724773507]
We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents.
We address the problem of selecting the right subset of features to form state abstractions.
Exact Bayesian model learning is then achieved using a suitable generalisation of Context Tree Weighting over abstract state sequences.
arXiv Detail & Related papers (2022-10-13T11:30:56Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower boundally, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies [76.83991682238666]
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs)
We propose a novel imitation learning framework, and introduce new input features and architectures to represent branching.
arXiv Detail & Related papers (2020-02-12T17:43:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.