Time-critical and confidence-based abstraction dropping methods
- URL: http://arxiv.org/abs/2507.02703v1
- Date: Thu, 03 Jul 2025 15:12:05 GMT
- Title: Time-critical and confidence-based abstraction dropping methods
- Authors: Robin Schmöcker, Lennart Kampmann, Alexander Dockhorn
- Abstract summary: Non-exact abstractions make convergence to the optimal action in the abstract space impossible. We propose two novel abstraction dropping schemes, namely OGA-IAAD and OGA-CAD.
- Score: 44.99833362998488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One paradigm of Monte Carlo Tree Search (MCTS) improvements is to build and use state and/or action abstractions during the tree search. Non-exact abstractions, however, introduce an approximation error that makes convergence to the optimal action in the abstract space impossible. Hence, as proposed as a component of Elastic Monte Carlo Tree Search by Xu et al., abstraction algorithms should eventually drop the abstraction. In this paper, we propose two novel abstraction dropping schemes, OGA-IAAD and OGA-CAD, which can yield clear performance improvements while being safe in the sense that the dropping never causes notable performance degradation, unlike Xu's dropping method. OGA-IAAD is designed for time-critical settings, while OGA-CAD is designed to improve MCTS performance with the same number of iterations.
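The abstract gives no pseudocode, so the following is only a minimal sketch of how two dropping schemes could plug into a UCB-style search loop. The scheme names come from the abstract, but the fixed-fraction trigger for the time-critical variant and the visit-share confidence test for the confidence-based variant are assumptions, not the authors' definitions.
```python
import math

def run_search(actions, simulate, abstraction, budget,
               scheme="IAAD", drop_fraction=0.5, conf_threshold=0.9):
    """Toy UCB loop over root actions, sharing statistics between
    actions that map to the same abstract class until the abstraction
    is dropped. The dropping criteria below are guesses."""
    n = {a: 0 for a in actions}; w = {a: 0.0 for a in actions}  # exact stats
    N, W = {}, {}                       # abstract-class stats
    dropped = abstraction is None
    for t in range(1, budget + 1):
        if not dropped:
            if scheme == "IAAD" and t >= drop_fraction * budget:
                dropped = True          # time-critical: fixed-iteration trigger
            elif scheme == "CAD" and t > len(actions):
                if max(n.values()) / t >= conf_threshold:
                    dropped = True      # confidence trigger: visit share
        def ucb(a):
            if dropped:
                na, wa = n[a], w[a]
            else:
                c = abstraction(a)
                na, wa = N.get(c, 0), W.get(c, 0.0)
            if na == 0:
                return float("inf")
            return wa / na + math.sqrt(2.0 * math.log(t) / na)
        a = max(actions, key=ucb)
        r = simulate(a)                 # rollout return for this action
        n[a] += 1; w[a] += r
        if abstraction is not None:
            c = abstraction(a)
            N[c] = N.get(c, 0) + 1; W[c] = W.get(c, 0.0) + r
    return max(actions, key=lambda a: n[a])
```
Because exact per-action statistics are collected alongside the shared abstract statistics, dropping the abstraction mid-search costs nothing extra in this sketch: selection simply switches to the finer-grained counts.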
Related papers
- Monte Carlo Tree Diffusion for System 2 Planning [57.50512800900167]
We introduce Monte Carlo Tree Diffusion (MCTD), a novel framework that integrates the generative strength of diffusion models with the adaptive search capabilities of Monte Carlo Tree Search (MCTS). Our method reconceptualizes denoising as a tree-structured process, allowing partially denoised plans to be iteratively evaluated, pruned, and refined.
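A loose reading of that summary in toy code; the `denoise_step` and `score` callables and the beam-style pruning are placeholders, not the MCTD algorithm itself.
```python
import heapq

def tree_denoise(init_plan, denoise_step, score, depth, branch=3, beam=4):
    """Tree-structured denoising sketch: each node is a partially
    denoised plan, children are alternative one-step denoisings, and
    weak branches are pruned every round."""
    frontier = [init_plan]
    for level in range(depth):
        children = []
        for plan in frontier:
            for _ in range(branch):             # expand: sample refinements
                children.append(denoise_step(plan, level))
        # evaluate and prune: keep only the highest-scoring partial plans
        frontier = heapq.nlargest(beam, children, key=score)
    return max(frontier, key=score)
```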
arXiv Detail & Related papers (2025-02-11T02:51:42Z)
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework for large vision-language models (LVLMs) for efficient multi-modal inference. We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model and embed this dependency into an optimal quantization strategy. Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x for the 13B LLaVA model without performance degradation.
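For orientation, a generic post-training quantization sketch that searches a clipping scale to minimize reconstruction error; Q-VLM's actual cross-layer-dependency objective is not reproduced here.
```python
import numpy as np

def quantize_weights(w, n_bits=4, grid=50):
    """Uniform symmetric quantization with the scale chosen by a small
    grid search over clipping fractions. Minimizes per-tensor MSE, a
    standard PTQ baseline rather than the paper's method."""
    qmax = 2 ** (n_bits - 1) - 1
    best_scale, best_err = None, float("inf")
    for frac in np.linspace(0.5, 1.0, grid):
        scale = frac * np.abs(w).max() / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax)
        err = np.square(w - q * scale).mean()   # reconstruction error
        if err < best_err:
            best_scale, best_err = scale, err
    q = np.clip(np.round(w / best_scale), -qmax - 1, qmax).astype(np.int8)
    return q, best_scale    # dequantize later as q * best_scale
```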
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and rethink how to improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomaly reconstruction.
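A guess at what a two-layer attenuation module could look like; the bottleneck shape and sigmoid gating below are assumptions, and the real FADeR architecture may differ.
```python
import torch
import torch.nn as nn

class FeatureAttenuator(nn.Module):
    """Two-layer bottleneck that rescales feature channels with a
    learned per-channel gate in (0, 1), i.e. attenuation."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),               # per-channel attenuation factors
        )

    def forward(self, feats):           # feats: (batch, channels)
        return feats * self.net(feats)
```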
arXiv Detail & Related papers (2024-07-05T15:44:53Z)
- Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction [27.53460927687747]
We propose an action abstraction based on the compositional structure between a state and sub-actions.
Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition from the current state.
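An illustrative reading of that sentence in code; the layer sizes, the soft sub-action mask, and the module names are assumptions, not the authors' architecture.
```python
import torch
import torch.nn as nn

class LatentDynamics(nn.Module):
    """Latent transition model plus an auxiliary head that scores which
    sub-actions matter in the current state."""

    def __init__(self, state_dim, action_dim, latent_dim=64):
        super().__init__()
        self.encode = nn.Linear(state_dim, latent_dim)
        self.transition = nn.Linear(latent_dim + action_dim, latent_dim)
        # auxiliary network: per-sub-action relevance in (0, 1)
        self.relevance = nn.Sequential(
            nn.Linear(latent_dim, action_dim), nn.Sigmoid())

    def forward(self, state, action):
        z = torch.relu(self.encode(state))
        mask = self.relevance(z)                     # soft sub-action mask
        z_next = self.transition(torch.cat([z, action * mask], dim=-1))
        return z_next, mask
```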
arXiv Detail & Related papers (2024-06-02T04:31:30Z)
- Accelerating Monte Carlo Tree Search with Probability Tree State Abstraction [11.49169644917995]
We propose a novel probability tree state abstraction (PTSA) algorithm to improve the search efficiency of Monte Carlo Tree Search (MCTS). A general tree state abstraction with path transitivity is defined. In addition, the probability tree state abstraction is proposed to reduce mistakes during the aggregation step.
Experimental results on different tasks demonstrate that our method can accelerate the training process of state-of-the-art algorithms with 10%-45% search space reduction.
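A much-simplified stand-in for the aggregation idea (deterministic rather than probabilistic), shown only to illustrate how merging similar tree nodes shrinks the search space; the tolerance rule is an assumption.
```python
def aggregate_states(nodes, q_values, tol=0.05):
    """Greedily group tree nodes whose Q-estimates lie within tol of a
    class representative; each group is searched as one abstract node."""
    classes = []                        # list of (representative_q, members)
    for node in sorted(nodes, key=lambda s: q_values[s]):
        q = q_values[node]
        if classes and abs(q - classes[-1][0]) <= tol:
            classes[-1][1].append(node) # merge into the current class
        else:
            classes.append((q, [node])) # start a new abstract class
    return [members for _, members in classes]
```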
arXiv Detail & Related papers (2023-10-10T10:55:12Z)
- Hierarchical State Abstraction Based on Structural Information Principles [70.24495170921075]
We propose a novel mathematical Structural Information principles-based State Abstraction framework, SISA, from an information-theoretic perspective. SISA is a general framework that can be flexibly integrated with different representation-learning objectives to further improve their performance.
arXiv Detail & Related papers (2023-04-24T11:06:52Z)
- Model-based Causal Bayesian Optimization [78.120734120667]
We propose model-based causal Bayesian optimization (MCBO).
MCBO learns a full system model instead of only modeling intervention-reward pairs.
Unlike in standard Bayesian optimization, our acquisition function cannot be evaluated in closed form.
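A generic illustration of why such an acquisition function is estimated by sampling rather than evaluated analytically; `sample_system` and `reward` are placeholder callables, not MCBO's model.
```python
import numpy as np

def mc_acquisition(intervention, sample_system, reward, best_so_far,
                   n_samples=256, rng=None):
    """Monte Carlo estimate of expected improvement: roll the sampled
    system model forward under the intervention and average the gains."""
    rng = rng or np.random.default_rng(0)
    gains = []
    for _ in range(n_samples):
        outcome = sample_system(intervention, rng)  # one model rollout
        gains.append(max(reward(outcome) - best_so_far, 0.0))
    return float(np.mean(gains))
```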
arXiv Detail & Related papers (2022-11-18T14:28:21Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL). The bounds we derive reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
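A toy rendering of such a constrained update rule; the shift measure, the bound callable, and the budget are placeholders, whereas the paper derives these quantities theoretically.
```python
def constrained_model_update(old_model, new_model, model_shift,
                             improvement_bound, shift_budget=0.1):
    """Accept the new dynamics model only when the measured model shift
    stays within budget and the performance-improvement lower bound is
    still positive; otherwise keep the old model."""
    shift = model_shift(old_model, new_model)       # e.g. divergence of predictions
    if shift <= shift_budget and improvement_bound(new_model, shift) > 0.0:
        return new_model
    return old_model
```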
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Stochastic Reweighted Gradient Descent [4.355567556995855]
We propose an importance-sampling-based algorithm we call SRG (stochastic reweighted gradient).
We pay particular attention to the time and memory overhead of our proposed method.
We present empirical results to support our findings.
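A minimal sketch of one importance-sampled gradient step with a fixed sampling distribution; SRG adapts the distribution online, which this sketch omits.
```python
import numpy as np

def srg_step(params, grad_fn, weights, lr=0.1, rng=None):
    """One reweighted step: draw example i with probability p_i and
    divide its gradient by n * p_i, so E[g_i / (n p_i)] equals the
    full-batch mean gradient (the update stays unbiased)."""
    rng = rng or np.random.default_rng(0)
    n = len(weights)
    p = weights / weights.sum()          # non-uniform sampling distribution
    i = rng.choice(n, p=p)
    g = grad_fn(params, i)               # gradient of loss_i at params
    return params - lr * g / (n * p[i])
```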
arXiv Detail & Related papers (2021-03-23T04:09:43Z)