Observation Adaptation via Annealed Importance Resampling for Partially Observable Markov Decision Processes
- URL: http://arxiv.org/abs/2503.19302v1
- Date: Tue, 25 Mar 2025 03:05:00 GMT
- Title: Observation Adaptation via Annealed Importance Resampling for Partially Observable Markov Decision Processes
- Authors: Yunuo Zhang, Baiting Luo, Ayan Mukhopadhyay, Abhishek Dubey,
- Abstract summary: Partially observable Markov decision processes (POMDPs) are a general mathematical model for sequential decision-making in environments under state uncertainty.<n>Online solvers typically use bootstrap particle filters based on importance resampling for updating the belief distribution.<n>We propose an approach that constructs a sequence of bridge distributions between the state-transition and optimal distributions through iterative Monte Carlo steps.
- Score: 4.830416359005018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Partially observable Markov decision processes (POMDPs) are a general mathematical model for sequential decision-making in stochastic environments under state uncertainty. POMDPs are often solved \textit{online}, which enables the algorithm to adapt to new information in real time. Online solvers typically use bootstrap particle filters based on importance resampling for updating the belief distribution. Since directly sampling from the ideal state distribution given the latest observation and previous state is infeasible, particle filters approximate the posterior belief distribution by propagating states and adjusting weights through prediction and resampling steps. However, in practice, the importance resampling technique often leads to particle degeneracy and sample impoverishment when the state transition model poorly aligns with the posterior belief distribution, especially when the received observation is highly informative. We propose an approach that constructs a sequence of bridge distributions between the state-transition and optimal distributions through iterative Monte Carlo steps, better accommodating noisy observations in online POMDP solvers. Our algorithm demonstrates significantly superior performance compared to state-of-the-art methods when evaluated across multiple challenging POMDP domains.
Related papers
- Annealed Stein Variational Gradient Descent for Improved Uncertainty Estimation in Full-Waveform Inversion [25.714206592953545]
Variational Inference (VI) provides an approximate solution to the posterior distribution in the form of a parametric or non-parametric proposal distribution.
This study aims to improve the performance of VI within the context of Full-Waveform Inversion.
arXiv Detail & Related papers (2024-10-17T06:15:26Z) - Inferring biological processes with intrinsic noise from cross-sectional data [0.8192907805418583]
Inferring dynamical models from data continues to be a significant challenge in computational biology.
We show that probability flow inference (PFI) disentangles force from intrinsicity while retaining the algorithmic ease of ODE inference.
In practical applications, we show that PFI enables accurate parameter and force estimation in high-dimensional reaction networks, and that it allows inference of cell differentiation dynamics with molecular noise.
arXiv Detail & Related papers (2024-10-10T00:33:25Z) - Spatially-Aware Diffusion Models with Cross-Attention for Global Field Reconstruction with Sparse Observations [1.371691382573869]
We develop and enhance score-based diffusion models in field reconstruction tasks.
We introduce a condition encoding approach to construct a tractable mapping mapping between observed and unobserved regions.
We demonstrate the ability of the model to capture possible reconstructions and improve the accuracy of fused results.
arXiv Detail & Related papers (2024-08-30T19:46:23Z) - Persistent Sampling: Enhancing the Efficiency of Sequential Monte Carlo [0.0]
Sequential Monte Carlo (SMC) samplers are powerful tools for Bayesian inference but suffer from high computational costs.<n>We introduce persistent sampling (PS), which retains SMC and constructs particles from all prior iterations.
arXiv Detail & Related papers (2024-07-30T10:34:40Z) - Nonlinear Filtering with Brenier Optimal Transport Maps [4.745059103971596]
This paper is concerned with the problem of nonlinear filtering, i.e., computing the conditional distribution of the state of a dynamical system.
Conventional sequential importance resampling (SIR) particle filters suffer from fundamental limitations, in scenarios involving degenerate likelihoods or high-dimensional states.
In this paper, we explore an alternative method, which is based on estimating the Brenier optimal transport (OT) map from the current prior distribution of the state to the posterior distribution at the next time step.
arXiv Detail & Related papers (2023-10-21T01:34:30Z) - Reverse Diffusion Monte Carlo [19.35592726471155]
We propose a novel Monte Carlo sampling algorithm called reverse diffusion Monte Carlo (rdMC)
rdMC is distinct from the Markov chain Monte Carlo (MCMC) methods.
arXiv Detail & Related papers (2023-07-05T05:42:03Z) - Optimality Guarantees for Particle Belief Approximation of POMDPs [55.83001584645448]
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems.
POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid.
We propose a theory characterizing the approximation error of the particle filtering techniques that these algorithms use.
arXiv Detail & Related papers (2022-10-10T21:11:55Z) - Computational Doob's h-transforms for Online Filtering of Discretely
Observed Diffusions [65.74069050283998]
We propose a computational framework to approximate Doob's $h$-transforms.
The proposed approach can be orders of magnitude more efficient than state-of-the-art particle filters.
arXiv Detail & Related papers (2022-06-07T15:03:05Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Variational Inference for Continuous-Time Switching Dynamical Systems [29.984955043675157]
We present a model based on an Markov jump process modulating a subordinated diffusion process.
We develop a new continuous-time variational inference algorithm.
We extensively evaluate our algorithm under the model assumption and for real-world examples.
arXiv Detail & Related papers (2021-09-29T15:19:51Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Batch Stationary Distribution Estimation [98.18201132095066]
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions.
We propose a consistent estimator that is based on recovering a correction ratio function over the given data.
arXiv Detail & Related papers (2020-03-02T09:10:01Z) - Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.