Cooperative Trajectory Planning in Uncertain Environments with Monte
Carlo Tree Search and Risk Metrics
- URL: http://arxiv.org/abs/2203.04452v1
- Date: Wed, 9 Mar 2022 00:14:41 GMT
- Title: Cooperative Trajectory Planning in Uncertain Environments with Monte
Carlo Tree Search and Risk Metrics
- Authors: Philipp Stegmaier, Karl Kurzer, J. Marius Zöllner
- Abstract summary: We extend an existing cooperative trajectory planning approach based on Monte Carlo Tree Search for continuous action spaces.
It does so by explicitly modeling uncertainties in the form of a root belief state, from which start states for trees are sampled.
Integrating risk metrics into the final selection policy consistently outperforms a baseline in uncertain environments.
- Score: 2.658812114255374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated vehicles require the ability to cooperate with humans for a smooth
integration into today's traffic. While the concept of cooperation is well
known, the development of a robust and efficient cooperative trajectory
planning method is still a challenge. One aspect of this challenge is the
uncertainty surrounding the state of the environment due to limited sensor
accuracy. This uncertainty can be represented by a Partially Observable Markov
Decision Process. Our work addresses this problem by extending an existing
cooperative trajectory planning approach based on Monte Carlo Tree Search for
continuous action spaces. It does so by explicitly modeling uncertainties in
the form of a root belief state, from which start states for trees are sampled.
After the trees have been constructed with Monte Carlo Tree Search, their
results are aggregated into return distributions using kernel regression. For
the final selection, we apply two risk metrics, namely a Lower Confidence Bound
and a Conditional Value at Risk. We demonstrate that integrating risk
metrics into the final selection policy consistently outperforms a baseline
in uncertain environments, generating considerably safer trajectories.
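The final selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes per-action return samples are already available (in the paper these come from aggregating multiple MCTS trees via kernel regression), and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: return samples for three candidate actions,
# e.g. collected from trees grown from start states sampled out of
# the root belief state. Action 1 has a high mean but high variance.
actions = np.array([0.0, 0.5, 1.0])
samples = rng.normal(loc=[1.0, 2.0, 1.5],
                     scale=[0.2, 1.5, 0.3],
                     size=(200, 3))

def lcb(returns, c=1.0):
    """Lower Confidence Bound: mean return minus c standard deviations."""
    return returns.mean(axis=0) - c * returns.std(axis=0)

def cvar(returns, alpha=0.1):
    """Conditional Value at Risk: mean of the worst alpha-fraction of returns."""
    k = max(1, int(alpha * returns.shape[0]))
    worst = np.sort(returns, axis=0)[:k]  # lowest k returns per action
    return worst.mean(axis=0)

best_lcb = actions[np.argmax(lcb(samples))]
best_cvar = actions[np.argmax(cvar(samples))]
```

Both metrics penalize actions whose return distribution has a heavy lower tail, so the risky high-mean action is passed over in favor of the one with the better worst-case behavior.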
Related papers
- Anytime Probabilistically Constrained Provably Convergent Online Belief Space Planning [7.081396107231381]
We present an anytime approach employing the Monte Carlo Tree Search (MCTS) method in continuous domains.
We prove convergence in probability with an exponential rate of a version of our algorithms and study proposed techniques via extensive simulations.
arXiv Detail & Related papers (2024-11-11T04:42:18Z)
- Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework [79.088116316919]
Connected Autonomous Vehicles (CAVs) have begun open-road testing around the world, but their safety and efficiency performance in complex scenarios is still not satisfactory.
This paper proposes CoDrivingLLM, an interactive and learnable LLM-driven cooperative driving framework.
arXiv Detail & Related papers (2024-09-19T14:36:00Z)
- Optimizing Agricultural Order Fulfillment Systems: A Hybrid Tree Search Approach [1.1470070927586018]
Efficient order fulfillment is vital in the agricultural industry, particularly due to the seasonal nature of seed supply chains.
This paper addresses the challenge of optimizing seed order fulfillment in a centralized warehouse where orders are processed in waves.
We propose an adaptive hybrid tree search algorithm that combines Monte Carlo tree search with domain-specific knowledge.
arXiv Detail & Related papers (2024-07-19T01:25:39Z)
- Cooperative Probabilistic Trajectory Forecasting under Occlusion [110.4960878651584]
Occlusion-aware planning often requires communicating the information of the occluded object to the ego agent for safe navigation.
In this paper, we design an end-to-end network that cooperatively estimates the current states of the occluded pedestrian in the reference frame of the ego agent.
We show that the ego agent's uncertainty-aware trajectory prediction for the occluded pedestrian closely matches the ground-truth trajectory obtained assuming no occlusion.
arXiv Detail & Related papers (2023-12-06T05:36:52Z)
- Monte Carlo Planning in Hybrid Belief POMDPs [7.928094304325113]
We present Hybrid Belief Monte Carlo Planning (HB-MCP) that utilizes the Monte Carlo Tree Search (MCTS) algorithm to solve a POMDP.
We show how the upper confidence bound (UCB) exploration bonus can be leveraged to guide the growth of hypotheses trees alongside the belief trees.
We then evaluate our approach in highly aliased simulated environments where unresolved data association leads to multi-modal belief hypotheses.
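The UCB exploration bonus mentioned above is the standard UCT-style selection rule used in MCTS. A minimal sketch, with illustrative numbers (not HB-MCP's actual code):

```python
import math

def ucb1(q_mean, n_node, n_parent, c=1.41):
    """UCT-style score: exploitation term plus an exploration bonus
    that grows for rarely visited children of a node."""
    if n_node == 0:
        return float("inf")  # force every child to be tried at least once
    return q_mean + c * math.sqrt(math.log(n_parent) / n_node)

# Two children with equal mean value: the less-visited one scores higher,
# so the search is steered toward under-explored hypotheses.
scores = [ucb1(0.5, n_node=10, n_parent=30),
          ucb1(0.5, n_node=2, n_parent=30)]
```

HB-MCP applies this kind of bonus not only to the belief tree but also to guide the growth of the hypotheses trees.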
arXiv Detail & Related papers (2022-11-14T20:16:51Z)
- Continuous Monte Carlo Graph Search [61.11769232283621]
Continuous Monte Carlo Graph Search (CMCGS) is an extension of Monte Carlo Tree Search (MCTS) to online planning.
CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance.
It can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.
arXiv Detail & Related papers (2022-10-04T07:34:06Z)
- Case Studies for Computing Density of Reachable States for Safe Autonomous Motion Planning [8.220217498103313]
Density of the reachable states can help understand the risk of safety-critical systems.
Recent work provides a data-driven approach to compute the density distribution of autonomous systems' forward reachable states online.
In this paper, we study the use of such approach in combination with model predictive control for verifiable safe path planning under uncertainties.
arXiv Detail & Related papers (2022-09-16T17:38:24Z)
- Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
- Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
The agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
- Latent Bandits Revisited [55.88616813182679]
A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.
We propose general algorithms for this setting, based on both upper confidence bounds (UCBs) and Thompson sampling.
We provide a unified theoretical analysis of our algorithms, which have lower regret than classic bandit policies when the number of latent states is smaller than actions.
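The Thompson-sampling variant of the latent bandit setting can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm: the reward means, posterior, and names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical latent bandit: 2 latent states, 3 arms. The per-state mean
# rewards are known to the agent; only the latent state is unknown.
means = np.array([[0.9, 0.1, 0.5],
                  [0.2, 0.8, 0.5]])
posterior = np.array([0.5, 0.5])  # current belief over latent states

def thompson_arm(posterior, means, rng):
    """Sample a latent state from the posterior, then play the arm
    that is optimal under that sampled state."""
    s = rng.choice(len(posterior), p=posterior)
    return int(np.argmax(means[s]))

arm = thompson_arm(posterior, means, rng)
```

Because each sampled state has a clear best arm, the agent never wastes pulls on arms that are best under no latent state, which is the intuition behind the regret improvement when there are fewer latent states than arms.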
arXiv Detail & Related papers (2020-06-15T19:24:02Z)
- Safe Mission Planning under Dynamical Uncertainties [15.533842336139063]
This paper considers safe robot mission planning in uncertain dynamical environments.
It is a challenging problem due to modeling and integrating dynamical uncertainties into a safe planning framework.
arXiv Detail & Related papers (2020-03-05T20:45:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.