Robust Stochastic Shortest-Path Planning via Risk-Sensitive Incremental Sampling
- URL: http://arxiv.org/abs/2408.08668v1
- Date: Fri, 16 Aug 2024 11:21:52 GMT
- Title: Robust Stochastic Shortest-Path Planning via Risk-Sensitive Incremental Sampling
- Authors: Clinton Enwerem, Erfaun Noorani, John S. Baras, Brian M. Sadler,
- Abstract summary: We propose a risk-aware Rapidly-Exploring Random Trees (RRT*) planning algorithm for Shortest-Path (SSP) problems.
Our motivation rests on the step-wise coherence of the Conditional Value-at-Risk (CVaR) risk measure and the optimal substructure of the SSP problem.
Our simulation results show that incorporating risk into the tree growth process yields paths with lengths that are significantly less sensitive to variations in the noise parameter.
- Score: 9.651071174735804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the pervasiveness of Stochastic Shortest-Path (SSP) problems in high-risk industries, such as last-mile autonomous delivery and supply chain management, robust planning algorithms are crucial for ensuring successful task completion while mitigating hazardous outcomes. Mainstream chance-constrained incremental sampling techniques for solving SSP problems tend to be overly conservative and typically do not consider the likelihood of undesirable tail events. We propose an alternative risk-aware approach inspired by the asymptotically-optimal Rapidly-Exploring Random Trees (RRT*) planning algorithm, which selects nodes along path segments with minimal Conditional Value-at-Risk (CVaR). Our motivation rests on the step-wise coherence of the CVaR risk measure and the optimal substructure of the SSP problem. Thus, optimizing with respect to the CVaR at each sampling iteration necessarily leads to an optimal path in the limit of the sample size. We validate our approach via numerical path planning experiments in a two-dimensional grid world with obstacles and stochastic path-segment lengths. Our simulation results show that incorporating risk into the tree growth process yields paths with lengths that are significantly less sensitive to variations in the noise parameter, or equivalently, paths that are more robust to environmental uncertainty. Algorithmic analyses reveal similar query time and memory space complexity to the baseline RRT* procedure, with only a marginal increase in processing time. This increase is offset by significantly lower noise sensitivity and reduced planner failure rates.
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Risk-averse Learning with Non-Stationary Distributions [18.15046585146849]
In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time.
We minimize risk-averse objective function using the Conditional Value at Risk (CVaR) as risk measure.
We show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions.
arXiv Detail & Related papers (2024-04-03T18:16:47Z) - CBAGAN-RRT: Convolutional Block Attention Generative Adversarial Network
for Sampling-Based Path Planning [0.0]
We propose a novel image-based learning algorithm (CBAGAN-RRT) using a Convolutional Block Attention Generative Adversarial Network.
The probability distribution of the paths generated from our GAN model is used to guide the sampling process for the RRT algorithm.
We train and test our network on the dataset generated by citezhang 2021 and demonstrate that our algorithm outperforms the previous state-of-the-art algorithms.
arXiv Detail & Related papers (2023-05-13T20:06:53Z) - Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z) - Structural Estimation of Markov Decision Processes in High-Dimensional
State Space with Finite-Time Guarantees [39.287388288477096]
We consider the task of estimating a structural model of dynamic decisions by a human agent based upon the observable history of implemented actions and visited states.
This problem has an inherent nested structure: in the inner problem, an optimal policy for a given reward function is identified while in the outer problem, a measure of fit is maximized.
We propose a single-loop estimation algorithm with finite time guarantees that is equipped to deal with high-dimensional state spaces.
arXiv Detail & Related papers (2022-10-04T00:11:38Z) - Risk-aware Stochastic Shortest Path [0.0]
We treat the problem of risk-aware control for shortest path (SSP) on Markov decision processes (MDP)
We present an alternative view, instead optimizing conditional value-at-risk (CVaR), an established risk measure.
arXiv Detail & Related papers (2022-03-03T10:59:54Z) - Online Estimation and Optimization of Utility-Based Shortfall Risk [0.9999629695552195]
We consider the problem of estimating Utility-Based Shortfall Risk (UBSR)
We cast the UBSR estimation problem as a root finding problem, and propose approximation-based estimations schemes.
We derive non-asymptotic bounds on the estimation error in the number of samples.
arXiv Detail & Related papers (2021-11-16T22:16:03Z) - Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
Agent sequentially selects actions and, from the resulting system trajectory, aims at the best as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - MLE-guided parameter search for task loss minimization in neural
sequence modeling [83.83249536279239]
Neural autoregressive sequence models are used to generate sequences in a variety of natural language processing (NLP) tasks.
We propose maximum likelihood guided parameter search (MGS), which samples from a distribution over update directions that is a mixture of random search around the current parameters and around the maximum likelihood gradient.
Our experiments show that MGS is capable of optimizing sequence-level losses, with substantial reductions in repetition and non-termination in sequence completion, and similar improvements to those of minimum risk training in machine translation.
arXiv Detail & Related papers (2020-06-04T22:21:22Z) - Robust Reinforcement Learning with Wasserstein Constraint [49.86490922809473]
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.