Iterative Depth-First Search for Fully Observable Non-Deterministic
Planning
- URL: http://arxiv.org/abs/2204.04322v1
- Date: Fri, 8 Apr 2022 23:10:30 GMT
- Title: Iterative Depth-First Search for Fully Observable Non-Deterministic
Planning
- Authors: Ramon Fraga Pereira, André G. Pereira, Frederico Messa, and Giuseppe
De Giacomo
- Abstract summary: We develop a novel iterative depth-first search algorithm that solves FOND planning tasks and produces strong cyclic policies.
Our algorithm is explicitly designed for FOND planning, addressing more directly the non-deterministic aspect of FOND planning.
- Score: 25.2935633334145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully Observable Non-Deterministic (FOND) planning models uncertainty through
actions with non-deterministic effects. Existing FOND planning algorithms are
effective and employ a wide range of techniques. However, most of the existing
algorithms are not robust for dealing with both non-determinism and task size.
In this paper, we develop a novel iterative depth-first search algorithm that
solves FOND planning tasks and produces strong cyclic policies. Our algorithm
is explicitly designed for FOND planning, addressing more directly the
non-deterministic aspect of FOND planning, and it also exploits the benefits of
heuristic functions to make the algorithm more effective during the iterative
searching process. We compare our proposed algorithm to well-known FOND
planners, and show that it has robust performance over several distinct types
of FOND domains considering different metrics.
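The abstract says the algorithm produces strong cyclic policies. As an illustration of that property only (not of the paper's search algorithm), the sketch below checks whether a given policy is strong cyclic on a toy task: every state reachable under the policy must either be a goal or still be able to reach a goal by following the policy. The names `OUTCOMES`, `GOALS`, `reachable`, `is_strong_cyclic`, and the toy states and actions are all hypothetical and chosen for this example.

```python
from collections import deque

# Hypothetical toy FOND task: (state, action) -> set of possible successor states.
OUTCOMES = {
    ("a", "go"): {"a", "b"},         # "go" may fail and leave us in "a"
    ("a", "risky"): {"g", "dead"},   # "risky" may end in a dead end
    ("b", "finish"): {"g"},
}
GOALS = {"g"}


def reachable(start, policy, outcomes, goals):
    """States reachable from `start` by following `policy`.

    Goal states and states without a policy entry are not expanded.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in goals or state not in policy:
            continue
        for successor in outcomes[(state, policy[state])]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def is_strong_cyclic(init, policy, outcomes, goals):
    """True if every policy-reachable state is a goal or can still reach one."""
    for state in reachable(init, policy, outcomes, goals):
        if state in goals:
            continue
        if state not in policy:
            return False  # non-goal state with no prescribed action
        if not (reachable(state, policy, outcomes, goals) & goals):
            return False  # no goal is reachable from this state any more
    return True


good_policy = {"a": "go", "b": "finish"}   # may loop in "a", but can always reach "g"
bad_policy = {"a": "risky"}                # can fall into "dead" with no way out

print(is_strong_cyclic("a", good_policy, OUTCOMES, GOALS))
print(is_strong_cyclic("a", bad_policy, OUTCOMES, GOALS))
```

The first policy is accepted even though it can cycle in state "a", since the goal stays reachable from every state it visits; the second is rejected because one outcome of "risky" leads to a state from which the goal can no longer be reached.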
Related papers
- Depth-Bounded Epistemic Planning [50.42592219248395]
We present a novel algorithm for planning based on dynamic epistemic logic.
The novelty is that we limit the depth of reasoning of the planning agent to an upper bound b.
We show it to be complete with respect to planning tasks having a solution within bound b of reasoning depth.
arXiv Detail & Related papers (2024-06-03T09:30:28Z)
- On Uncertainty Quantification for Near-Bayes Optimal Algorithms [2.622066970118316]
We show that it is possible to recover the Bayesian posterior defined by the task distribution, which is unknown but optimal in this setting, by building a martingale posterior using the algorithm.
Experiments based on a variety of non-NN and NN algorithms demonstrate the efficacy of our method.
arXiv Detail & Related papers (2024-03-28T12:42:25Z)
- Provably Efficient Representation Learning with Tractable Planning in
Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
- Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning
Problems [0.7614628596146599]
We propose a novel algorithm for adaptive step length selection in the classical SGD framework.
Under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements.
We show that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.
arXiv Detail & Related papers (2023-05-17T06:22:11Z)
- Structural Estimation of Markov Decision Processes in High-Dimensional
State Space with Finite-Time Guarantees [39.287388288477096]
We consider the task of estimating a structural model of dynamic decisions by a human agent based upon the observable history of implemented actions and visited states.
This problem has an inherent nested structure: in the inner problem, an optimal policy for a given reward function is identified while in the outer problem, a measure of fit is maximized.
We propose a single-loop estimation algorithm with finite time guarantees that is equipped to deal with high-dimensional state spaces.
arXiv Detail & Related papers (2022-10-04T00:11:38Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Waypoint Planning Networks [66.72790309889432]
We propose a hybrid algorithm based on LSTMs with a local kernel (a classic algorithm such as A*) and a global kernel using a learned algorithm.
We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value iteration networks (VIN).
It is shown that WPN's search space is considerably smaller than that of A*, while WPN is still able to generate near-optimal results.
arXiv Detail & Related papers (2021-05-01T18:02:01Z)
- A bi-level encoding scheme for the clustered shortest-path tree problem
in multifactorial optimization [1.471992435706872]
The Clustered Shortest-Path Tree Problem (CluSPT) plays an important role in various types of real-life optimization problems.
Recently, some Multifactorial Evolutionary Algorithms (MFEAs) have been introduced to deal with the CluSPT.
This paper describes an MFEA-based approach to solve the CluSPT.
arXiv Detail & Related papers (2021-02-12T13:36:07Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Oracle-Efficient Regret Minimization in Factored MDPs with Unknown
Structure [57.90236104782219]
We study regret in non-episodic factored Markov decision processes (FMDPs).
All existing algorithms make the strong assumption that the factored structure of the FMDP is known to the learner in advance.
We provide the first algorithm that learns the structure of the FMDP while minimizing the regret.
arXiv Detail & Related papers (2020-09-13T12:30:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.