Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying
Partially Observable Environment
- URL: http://arxiv.org/abs/2312.03263v3
- Date: Thu, 7 Mar 2024 22:42:20 GMT
- Title: Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying
Partially Observable Environment
- Authors: Gokul Puthumanaillam, Xiangyu Liu, Negar Mehr and Melkior Ornik
- Abstract summary: Environmental variability over time can significantly impact the system's optimal decision-making strategy.
We propose a two-pronged approach to accurately estimate and plan within the TV-POMDP.
We validate the proposed framework and algorithms using simulations and hardware experiments with robots.
- Score: 14.646280719661465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal decision-making presents a significant challenge for autonomous
systems operating in uncertain, stochastic and time-varying environments.
Environmental variability over time can significantly impact the system's
optimal decision-making strategy for mission completion. To model such
environments, our work combines the previous notion of Time-Varying Markov
Decision Processes (TVMDP) with partial observability and introduces
Time-Varying Partially Observable Markov Decision Processes (TV-POMDP). We
propose a two-pronged approach to accurately estimate and plan within the
TV-POMDP: 1) Memory Prioritized State Estimation (MPSE), which leverages
weighted memory to provide more accurate time-varying transition estimates; and
2) an MPSE-integrated planning strategy that optimizes long-term rewards while
accounting for temporal constraints. We validate the proposed framework and
algorithms using simulations and hardware, with robots exploring a partially
observable, time-varying environment. Our results demonstrate superior
performance over standard methods, highlighting the framework's effectiveness
in stochastic, uncertain, time-varying domains.
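The abstract describes MPSE at a high level: a TV-POMDP can be read as a POMDP whose transition function is indexed by time, and MPSE estimates that function from a weighted memory of past transitions, with recent observations weighted more heavily. The snippet below is a minimal illustrative sketch of that weighted-memory idea, not the paper's actual MPSE algorithm; the function name, the exponential decay scheme, and the small prior mass are assumptions made purely for illustration.

```python
import numpy as np

def estimate_transitions(history, n_states, n_actions, t_now, decay=0.9):
    # Recency-weighted counts of observed transitions (s, a) -> s_next.
    # Exponential weighting prioritizes recent samples so the estimate can
    # track drift in time-varying dynamics (illustrative choice, not MPSE itself).
    counts = np.full((n_states, n_actions, n_states), 1e-6)  # small prior mass
    for t, s, a, s_next in history:
        w = decay ** (t_now - t)  # older samples contribute less
        counts[s, a, s_next] += w
    return counts / counts.sum(axis=-1, keepdims=True)  # rows sum to 1

# Toy example: under action 0, state 0 initially transitions back to state 0,
# but later starts transitioning to state 1.
history = [(t, 0, 0, 0) for t in range(10)] + [(t, 0, 0, 1) for t in range(10, 20)]
T_hat = estimate_transitions(history, n_states=2, n_actions=1, t_now=20)
print(T_hat[0, 0])  # estimate is skewed toward the recent behavior (s_next = 1)
```

In this toy run the dynamics under action 0 drift from favoring state 0 to favoring state 1; because recent samples carry more weight, the estimated transition row tracks the new behavior faster than an unweighted frequency count would.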
Related papers
- Learning Logic Specifications for Policy Guidance in POMDPs: an
Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that the learned specifications, expressed in Answer Set Programming (ASP), yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, at lower computational cost.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - Learning-assisted Stochastic Capacity Expansion Planning: A Bayesian Optimization Approach [3.124884279860061]
Large-scale capacity expansion problems (CEPs) are central to cost-effective decarbonization of regional energy systems.
Here, we propose a learning-assisted approximate solution method to tractably solve two-stage CEPs.
We show that our approach yields estimated cost savings of up to 3.8% in comparison to time-series aggregation approaches.
arXiv Detail & Related papers (2024-01-19T01:40:58Z) - Learning From Scenarios for Stochastic Repairable Scheduling [3.9948520633731026]
We show how decision-focused learning techniques based on smoothing can be adapted to a scheduling problem.
We include an experimental evaluation to investigate in which situations decision-focused learning outperforms the state-of-the-art approach for such problems: scenario-based optimization.
arXiv Detail & Related papers (2023-12-06T13:32:17Z) - Constant-time Motion Planning with Anytime Refinement for Manipulation [17.543746580669662]
We propose an anytime refinement approach that works in combination with constant-time motion planning (CTMP) algorithms.
Operating as a constant-time algorithm, our proposed framework rapidly generates an initial solution within a user-defined time threshold;
functioning as an anytime algorithm, it then iteratively refines the solution's quality within the allocated time budget.
arXiv Detail & Related papers (2023-11-01T20:40:10Z) - Score Matching-based Pseudolikelihood Estimation of Neural Marked
Spatio-Temporal Point Process with Uncertainty Quantification [59.81904428056924]
We introduce SMASH: a Score MAtching-based pseudolikelihood estimator for learning marked spatio-temporal point processes (STPPs) with uncertainty quantification.
Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score matching.
The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification.
arXiv Detail & Related papers (2023-10-25T02:37:51Z) - Measuring the Stability of Process Outcome Predictions in Online
Settings [4.599862571197789]
This paper proposes an evaluation framework for assessing the stability of models for online predictive process monitoring.
The framework introduces four performance meta-measures: the frequency of significant performance drops, the magnitude of such drops, the recovery rate, and the volatility of performance.
The results demonstrate that these meta-measures facilitate the comparison and selection of predictive models for different risk-taking scenarios.
arXiv Detail & Related papers (2023-10-13T10:37:46Z) - Learning non-Markovian Decision-Making from State-only Sequences [57.20193609153983]
We develop a model-based imitation learning approach for state-only sequences based on a non-Markovian Decision Process (nMDP).
We demonstrate the efficacy of the proposed method in a path planning task with non-Markovian constraints.
arXiv Detail & Related papers (2023-06-27T02:26:01Z) - Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z) - ARISE: ApeRIodic SEmi-parametric Process for Efficient Markets without
Periodogram and Gaussianity Assumptions [91.3755431537592]
We present the ApeRIodic SEmi-parametric (ARISE) process for investigating efficient markets.
The ARISE process is formulated as an infinite sum of known processes and employs aperiodic spectrum estimation.
In practice, we apply the ARISE process to identify the efficiency of real-world markets.
arXiv Detail & Related papers (2021-11-08T03:36:06Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Value of structural health information in partially observable
stochastic environments [0.0]
We introduce and study the theoretical and computational foundations of the Value of Information (VoI) and the Value of Structural Health Monitoring (VoSHM).
It is shown that a POMDP policy inherently leverages the notion of VoI to guide observational actions in an optimal way at every decision step.
arXiv Detail & Related papers (2019-12-28T22:18:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.