Dual Formulation for Chance Constrained Stochastic Shortest Path with
Application to Autonomous Vehicle Behavior Planning
- URL: http://arxiv.org/abs/2302.13115v1
- Date: Sat, 25 Feb 2023 16:40:00 GMT
- Title: Dual Formulation for Chance Constrained Stochastic Shortest Path with
Application to Autonomous Vehicle Behavior Planning
- Authors: Rashid Alyassi and Majid Khonji
- Abstract summary: The Constrained Stochastic Shortest Path problem (C-SSP) is a formalism for planning in stochastic environments under certain types of operating constraints.
This work's first contribution is an exact integer linear programming formulation for Chance-constrained SSP (CC-SSP) that attains deterministic policies. Second, a randomized rounding procedure is presented for stochastic policies.
Third, we show that the CC-SSP formalism can be generalized to account for constraints that span through multiple time steps.
- Score: 3.655021726150368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous vehicles face the problem of optimizing the expected performance
of subsequent maneuvers while bounding the risk of collision with surrounding
dynamic obstacles. These obstacles, such as agent vehicles, often exhibit
stochastic transitions that should be accounted for in a timely and safe
manner. The Constrained Stochastic Shortest Path problem (C-SSP) is a formalism
for planning in stochastic environments under certain types of operating
constraints. While C-SSP allows specifying constraints in the planning problem,
it does not allow for bounding the probability of constraint violation, which
is desired in safety-critical applications. This work's first contribution is
an exact integer linear programming formulation for Chance-constrained SSP
(CC-SSP) that attains deterministic policies. Second, a randomized rounding
procedure is presented for stochastic policies. Third, we show that the CC-SSP
formalism can be generalized to account for constraints that span through
multiple time steps. Evaluation results show the usefulness of our approach in
benchmark problems compared to existing approaches.
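The second contribution, rounding a fractional solution into a policy, can be illustrated with a generic randomized-rounding sketch. This is not the paper's exact procedure; the occupancy variables `x[s][a]` and their values below are hypothetical stand-ins for an LP relaxation's output.

```python
import random

def round_policy(x, rng=None):
    """Randomized rounding sketch: sample a deterministic policy from
    fractional state-action occupancy variables x[s][a], choosing each
    state's action with probability proportional to its occupancy mass."""
    rng = rng or random.Random(0)
    policy = {}
    for s, actions in x.items():
        total = sum(actions.values())
        if total == 0:  # state carries no mass in the fractional solution
            policy[s] = next(iter(actions))
            continue
        r = rng.random() * total
        acc = 0.0
        for a, mass in actions.items():
            acc += mass
            if r <= acc:
                policy[s] = a
                break
    return policy

# Toy fractional solution (values made up for illustration)
x = {"s0": {"left": 0.7, "right": 0.3},
     "s1": {"left": 0.0, "right": 1.0}}
print(round_policy(x))
```

States whose occupancy mass is concentrated on one action (like `s1`) are rounded deterministically; genuinely fractional states are randomized, which is what yields a stochastic policy in expectation.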
Related papers
- Flipping-based Policy for Chance-Constrained Markov Decision Processes [9.404184937255694]
This paper proposes a flipping-based policy for Chance-Constrained Markov Decision Processes (CCMDPs).
The flipping-based policy selects the next action by tossing a potentially distorted coin between two action candidates.
We demonstrate that the flipping-based policy can improve the performance of the existing safe RL algorithms under the same limits of safety constraints.
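The coin-toss mechanism described above can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's construction; the candidate actions and the distortion probability `p_greedy` are hypothetical.

```python
import random

def flipping_policy(a_safe, a_greedy, p_greedy, rng=None):
    """Flipping-based action selection (sketch): toss a distorted coin
    that picks the reward-greedy candidate with probability p_greedy,
    and the safe candidate otherwise."""
    rng = rng or random.Random()
    return a_greedy if rng.random() < p_greedy else a_safe

# Hypothetical candidates chosen with a 0.25-distorted coin
rng = random.Random(1)
choices = [flipping_policy("brake", "accelerate", 0.25, rng)
           for _ in range(1000)]
print(choices.count("accelerate") / 1000)  # close to 0.25
```

Tuning the distortion lets the resulting stochastic policy sit exactly on the safety-constraint boundary, which a single deterministic candidate generally cannot do.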
arXiv Detail & Related papers (2024-10-09T02:00:39Z) - Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z) - ConstrainedZero: Chance-Constrained POMDP Planning using Learned Probabilistic Failure Surrogates and Adaptive Safety Constraints [34.9739641898452]
This work introduces the ConstrainedZero policy algorithm that solves CC-POMDPs in belief space by learning neural network approximations of the optimal value and policy.
Results show that by separating safety constraints from the objective we can achieve a target level of safety without optimizing the balance between rewards and costs.
arXiv Detail & Related papers (2024-05-01T17:17:22Z) - Online Constraint Tightening in Stochastic Model Predictive Control: A
Regression Approach [49.056933332667114]
Chance-constrained optimal control problems generally admit no analytical solution.
We propose a data-driven approach for learning the constraint-tightening parameters online during control.
Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z) - Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems [62.83783246648714]
We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z) - Learning Stochastic Parametric Differentiable Predictive Control
Policies [2.042924346801313]
We present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies.
SP-DPC is formulated as a deterministic approximation to the parametric constrained optimal control problem.
We provide theoretical probabilistic guarantees that policies learned via the SP-DPC method satisfy closed-loop constraints and chance constraints.
arXiv Detail & Related papers (2022-03-02T22:46:32Z) - Learning-based Preference Prediction for Constrained Multi-Criteria
Path-Planning [12.457788665461312]
Constrained multi-criteria path-planning for Autonomous Ground Vehicles (AGVs) is one application of this setting.
We leverage knowledge acquired through offline simulations by training a neural network model to predict the uncertain criterion.
We integrate this model inside a path-planner which can solve problems online.
arXiv Detail & Related papers (2021-08-02T17:13:45Z) - Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks [59.419152768018506]
We show that any optimal policy necessarily satisfies the k-SP constraint.
We propose a novel cost function that penalizes the policy violating SP constraint, instead of completely excluding it.
Our experiments on MiniGrid, DeepMind Lab, Atari, and Fetch show that the proposed method significantly improves proximal policy optimization (PPO).
arXiv Detail & Related papers (2021-07-13T21:39:21Z) - Pointwise Feasibility of Gaussian Process-based Safety-Critical Control
under Model Uncertainty [77.18483084440182]
Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs) are popular tools for enforcing safety and stability of a controlled system, respectively.
We present a Gaussian Process (GP)-based approach to tackle the problem of model uncertainty in safety-critical controllers that use CBFs and CLFs.
arXiv Detail & Related papers (2021-06-13T23:08:49Z) - Trajectory Optimization of Chance-Constrained Nonlinear Stochastic
Systems for Motion Planning and Control [9.35511513240868]
We compute a sub-optimal solution for a continuous-time chance-constrained stochastic nonlinear optimal control (SNOC) problem.
The proposed method enables motion planning and control of robotic systems under uncertainty.
arXiv Detail & Related papers (2021-06-05T05:15:05Z) - Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.