Efficient Sampling in POMDPs with Lipschitz Bandits for Motion Planning in Continuous Spaces
- URL: http://arxiv.org/abs/2106.04206v1
- Date: Tue, 8 Jun 2021 09:31:48 GMT
- Title: Efficient Sampling in POMDPs with Lipschitz Bandits for Motion Planning in Continuous Spaces
- Authors: \"Omer \c{S}ahin Ta\c{s}, Felix Hauser, Martin Lauer
- Abstract summary: Decision making under uncertainty can be framed as a partially observable Markov decision process (POMDP).
Finding exact solutions of POMDPs is generally intractable, but the solution can be approximated by sampling-based approaches.
We demonstrate the effectiveness of this approach in the context of motion planning for automated driving.
- Score: 5.732271870257913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision making under uncertainty can be framed as a partially observable
Markov decision process (POMDP). Finding exact solutions of POMDPs is generally
computationally intractable, but the solution can be approximated by
sampling-based approaches. These sampling-based POMDP solvers rely on
multi-armed bandit (MAB) heuristics, which assume the outcomes of different
actions to be uncorrelated. In some applications, like motion planning in
continuous spaces, similar actions yield similar outcomes. In this paper, we
utilize variants of MAB heuristics that make Lipschitz continuity assumptions
on the outcomes of actions to improve the efficiency of sampling-based planning
approaches. We demonstrate the effectiveness of this approach in the context of
motion planning for automated driving.
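As a hedged illustration of the core idea, the sketch below implements a UCB-style bandit over a discretized one-dimensional action space (e.g., candidate accelerations) in which a Lipschitz continuity assumption lets each arm's confidence bound be tightened by its neighbors. The class name, the Lipschitz constant `L`, and the exploration coefficient `c` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Sketch of a Lipschitz-UCB bandit over discretized continuous actions.
# Assumption: |f(a_i) - f(a_j)| <= L * |a_i - a_j| for the mean outcome f.
class LipschitzUCB:
    def __init__(self, actions, lipschitz_const, c=2.0):
        self.actions = np.asarray(actions, dtype=float)
        self.L = lipschitz_const          # assumed Lipschitz constant
        self.c = c                        # exploration coefficient
        self.counts = np.zeros(len(self.actions))
        self.means = np.zeros(len(self.actions))
        self.t = 0

    def select(self):
        self.t += 1
        untried = np.where(self.counts == 0)[0]
        if untried.size > 0:              # play every arm once first
            return int(untried[0])
        bonus = np.sqrt(self.c * np.log(self.t) / self.counts)
        vanilla = self.means + bonus
        # Lipschitz continuity gives f(a_i) <= f(a_j) + L * |a_i - a_j| for
        # every j, so take the tightest upper bound any arm provides.
        dist = np.abs(self.actions[:, None] - self.actions[None, :])
        ucb = np.min(vanilla[None, :] + self.L * dist, axis=1)
        return int(np.argmax(ucb))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

Inside a sampling-based POMDP solver, `select` would choose the next action to simulate from a belief node and `update` would record the rollout return; the Lipschitz term lets the outcome of one action immediately tighten the bounds of nearby actions, which is what reduces the number of samples needed.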
Related papers
- Amortizing intractable inference in large language models [56.92471123778389] (2023-10-06)
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Provably Efficient UCB-type Algorithms For Learning Predictive State
Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs).
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z) - End-to-End Policy Gradient Method for POMDPs and Explainable Agents [2.1700203922407493]
We propose an RL algorithm that estimates the hidden states by end-to-end training, and visualize the estimation as a state-transition graph.
Experimental results demonstrated that the proposed algorithm can solve simple POMDP problems and that the visualization makes the agent's behavior interpretable to humans.
arXiv Detail & Related papers (2023-04-19T15:45:52Z) - Evaluating Guiding Spaces for Motion Planning [2.384084215091134]
We define the *motion planning guiding space*, which encapsulates many seemingly distinct prior works under the same framework.
We also suggest an information theoretic method to evaluate guided planning which places the focus on the quality of the resulting biased sampling.
arXiv Detail & Related papers (2022-10-16T21:17:51Z) - Optimality Guarantees for Particle Belief Approximation of POMDPs [55.83001584645448]
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems.
POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid.
We propose a theory characterizing the approximation error of the particle filtering techniques that these algorithms use.
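To make the particle-belief approximation above concrete, here is a minimal bootstrap-filter belief update under assumed interfaces: `transition(s, a, rng)` samples a successor state and `obs_likelihood(o, s_next)` returns the observation density. Both are hypothetical stand-ins for a POMDP generative model, not the paper's code.

```python
import numpy as np

# Bootstrap-filter update of a particle belief after taking `action` and
# receiving `observation`; `rng` is a numpy.random.Generator.
def particle_belief_update(particles, action, observation,
                           transition, obs_likelihood, rng):
    # Propagate every particle through the transition model.
    propagated = [transition(s, action, rng) for s in particles]
    # Weight particles by how well they explain the observation.
    weights = np.array([obs_likelihood(observation, s) for s in propagated])
    if weights.sum() == 0.0:
        return propagated                 # particle depletion fallback
    weights = weights / weights.sum()
    # Resample to obtain an unweighted approximation of the posterior belief.
    idx = rng.choice(len(propagated), size=len(propagated), p=weights)
    return [propagated[i] for i in idx]
```

The approximation error such papers analyze comes from representing the belief with finitely many particles and from the resampling step above.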
arXiv Detail & Related papers (2022-10-10T21:11:55Z) - Expert-Guided Symmetry Detection in Markov Decision Processes [0.0]
We propose a paradigm that aims to detect the presence of some transformations of the state-action space for which the MDP dynamics is invariant.
The results show that the model distributional shift is reduced when the dataset is augmented with the data obtained by using the detected symmetries.
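A minimal sketch of the augmentation step described above, assuming a detected symmetry given by hypothetical maps `f` on states and `g` on actions under which the dynamics are invariant:

```python
# Augment an MDP dataset with a detected symmetry. `f` (states) and `g`
# (actions) are assumed maps under which the transition dynamics are
# invariant, so each transformed transition is as likely as the original.
def augment_with_symmetry(transitions, f, g):
    augmented = list(transitions)
    for s, a, r, s_next in transitions:
        augmented.append((f(s), g(a), r, f(s_next)))
    return augmented
```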
arXiv Detail & Related papers (2021-11-19T16:12:30Z) - On Solving a Stochastic Shortest-Path Markov Decision Process as
Probabilistic Inference [5.517104116168873]
We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference.
We discuss online and offline methods for planning under uncertainty.
arXiv Detail & Related papers (2021-09-13T11:07:52Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
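For intuition, the sketch below shows one Stein Variational Gradient Descent step over a population of flattened control sequences, in the spirit of Stein Variational MPC; `grad_log_posterior` is a hypothetical callback (e.g., the gradient of a negative rollout cost), and the RBF bandwidth is a free parameter.

```python
import numpy as np

# One SVGD step over a set of flattened control sequences (n particles, d dims).
def svgd_step(particles, grad_log_posterior, step_size=1e-2, bandwidth=1.0):
    X = np.stack(particles)                       # (n, d)
    n = X.shape[0]
    grads = np.stack([grad_log_posterior(x) for x in X])
    diffs = X[:, None, :] - X[None, :, :]         # diffs[i, j] = x_i - x_j
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2.0 * bandwidth ** 2))
    # Gradient of k(x_j, x_i) w.r.t. x_j is (x_i - x_j) / h^2 * k: a
    # repulsive term that keeps the particle population spread out.
    repulsion = (diffs / bandwidth ** 2 * K[:, :, None]).sum(axis=1)
    phi = (K @ grads + repulsion) / n             # Stein variational direction
    return X + step_size * phi
```

Iterating this step moves the whole population toward low-cost control sequences while the kernel repulsion preserves multiple modes, which is what lets the method handle complex distributions better than a single-trajectory MPC update.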
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - A maximum-entropy approach to off-policy evaluation in average-reward
MDPs [54.967872716145656]
This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs).
We provide the first finite-sample OPE error bound, extending existing results beyond the episodic and discounted cases.
We show that this results in an exponential-family distribution whose sufficient statistics are the features, paralleling maximum-entropy approaches in supervised learning.
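As a hedged illustration of the exponential-family form mentioned above (the notation is assumed, not taken from the paper), the estimated stationary state-action distribution has the shape

$$ d_\theta(s, a) \propto \exp\big(\theta^{\top} \phi(s, a)\big), $$

where $\phi(s,a)$ are the features used for function approximation and $\theta$ is chosen so that feature expectations match, mirroring maximum-entropy estimation in supervised learning.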
arXiv Detail & Related papers (2020-06-17T18:13:37Z) - Planning in Markov Decision Processes with Gap-Dependent Sample
Complexity [48.98199700043158]
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process.
We prove an upper bound on the number of calls to the generative model needed for MDP-GapE to identify a near-optimal action with high probability.
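As background for the trajectory-based search above, here is a minimal UCB-style action-selection and backup step for a Monte-Carlo Tree Search node; MDP-GapE replaces this kind of heuristic with gap-dependent confidence bounds, and the names here are illustrative only.

```python
import math

# UCB1-style selection and incremental backup for one search-tree node.
class Node:
    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}
        self.total = 0

def ucb_select(node, c=1.4):
    for a in node.actions:                # try every action once first
        if node.counts[a] == 0:
            return a
    return max(node.actions, key=lambda a: node.values[a]
               + c * math.sqrt(math.log(node.total) / node.counts[a]))

def backup(node, action, ret):
    # Incremental mean update of the action-value estimate.
    node.total += 1
    node.counts[action] += 1
    node.values[action] += (ret - node.values[action]) / node.counts[action]
```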
arXiv Detail & Related papers (2020-06-10T15:05:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.