GPU-accelerated Optimal Path Planning in Stochastic Dynamic Environments
- URL: http://arxiv.org/abs/2109.00857v1
- Date: Thu, 2 Sep 2021 12:14:34 GMT
- Title: GPU-accelerated Optimal Path Planning in Stochastic Dynamic Environments
- Authors: Rohit Chowdhury, Deepak Subramani
- Abstract summary: Planning time- and energy-optimal paths for autonomous marine vehicles is essential to reduce operational costs.
Markov Decision Processes (MDPs) provide a natural framework for sequential decision-making for robotic agents in such environments.
We introduce an efficient end-to-end GPU-accelerated algorithm that builds the MDP model and solves the MDP to compute an optimal policy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Autonomous marine vehicles play an essential role in many ocean science and
engineering applications. Planning time- and energy-optimal paths for these
vehicles to navigate in stochastic dynamic ocean environments is essential to
reduce operational costs. In some missions, they must also harvest solar, wind,
or wave energy (modeled as a stochastic scalar field) and move in optimal paths
that minimize net energy consumption. Markov Decision Processes (MDPs) provide
a natural framework for sequential decision-making for robotic agents in such
environments. However, building a realistic model and solving the modeled MDP
becomes computationally expensive in large-scale real-time applications,
warranting the need for parallel algorithms and efficient implementation. In
the present work, we introduce an efficient end-to-end GPU-accelerated
algorithm that (i) builds the MDP model (computing transition probabilities and
expected one-step rewards); and (ii) solves the MDP to compute an optimal
policy. We develop methodical and algorithmic solutions to overcome the limited
global memory of GPUs by (i) using a dynamic reduced-order representation of
the ocean flows, (ii) leveraging the sparse nature of the state transition
probability matrix, (iii) introducing a neighbouring sub-grid concept and (iv)
proving that it is sufficient to use only the stochastic scalar field's mean to
compute the expected one-step rewards for missions involving energy harvesting
from the environment; thereby saving memory and reducing the computational
effort. We demonstrate the algorithm on a simulated stochastic dynamic
environment and highlight that it builds the MDP model and computes the optimal
policy 600-1000x faster than conventional CPU implementations, making it
suitable for real-time use.
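
To make the two-phase pipeline concrete, below is a minimal CPU sketch in Python/NumPy/SciPy of (i) building a sparse transition model from samples of a stochastic flow and (ii) solving the MDP by value iteration. Every detail here (grid size, flow model, reward signs, variable names) is an illustrative assumption, not the authors' implementation; the paper's contribution is performing both phases with GPU kernels under tight global-memory budgets.

```python
# Minimal CPU sketch (illustrative only; the paper's implementation is GPU-accelerated)
# of the two phases: (i) building a sparse MDP transition model from samples of a
# stochastic flow field, and (ii) solving it by value iteration. All names, grid
# sizes, and dynamics below are simplified assumptions, not the authors' code.
import numpy as np
from scipy import sparse

n_x = n_y = 20                      # small grid for illustration
n_states = n_x * n_y
actions = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # unit heading choices
n_samples = 50                      # Monte Carlo realizations of the flow

rng = np.random.default_rng(0)
# Hypothetical stochastic flow: mean drift plus Gaussian perturbations.
mean_flow = np.array([0.3, 0.1])
flow_samples = mean_flow + 0.2 * rng.standard_normal((n_samples, 2))

def idx(i, j):
    return i * n_y + j

# Phase (i): build one sparse transition matrix per action by binning sampled
# next states. Sparsity arises because each state only reaches a small
# neighbouring sub-grid, so each row stores a handful of non-zeros.
P = []
for ax, ay in actions:
    rows, cols, vals = [], [], []
    for i in range(n_x):
        for j in range(n_y):
            nxt = {}
            for u, v in flow_samples:
                ni = int(np.clip(round(i + ax + u), 0, n_x - 1))
                nj = int(np.clip(round(j + ay + v), 0, n_y - 1))
                nxt[idx(ni, nj)] = nxt.get(idx(ni, nj), 0) + 1.0 / n_samples
            for s2, p in nxt.items():
                rows.append(idx(i, j)); cols.append(s2); vals.append(p)
    P.append(sparse.csr_matrix((vals, (rows, cols)), shape=(n_states, n_states)))

# Expected one-step reward: a unit travel cost offset by harvested energy.
# Per the paper's result, the *mean* of the stochastic scalar (energy) field
# suffices here, so no per-realization reward samples need to be stored.
mean_energy = rng.random(n_states)          # stand-in for the field's mean
R = [mean_energy - 1.0 for _ in actions]    # -1 step cost per action

# Phase (ii): value iteration. Each sweep reduces to sparse matrix-vector
# products, the operation that parallelizes naturally on a GPU.
gamma, V = 0.95, np.zeros(n_states)
for _ in range(500):
    Q = np.stack([R[a] + gamma * P[a].dot(V) for a in range(len(actions))])
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-6:
        break
    V = V_new
policy = Q.argmax(axis=0)                   # optimal action per grid cell
```

The sparsity exploited above mirrors points (ii) and (iii) of the abstract: because each state transitions only within a neighbouring sub-grid, the transition matrices fit in limited GPU global memory, and each value-iteration sweep becomes a set of independent per-state updates well suited to GPU threads.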
Related papers
- Metamizer: a versatile neural optimizer for fast and accurate physics simulations [4.717325308876749]
We introduce Metamizer, a novel neural network that iteratively solves a wide range of physical systems with high accuracy.
We demonstrate that Metamizer achieves accuracy unprecedented among deep-learning-based approaches.
Our results suggest that Metamizer could have a profound impact on future numerical solvers.
arXiv Detail & Related papers (2024-10-10T11:54:31Z)
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach, Two-Stage General Decision Rules (TS-GDR), to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-LDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks [14.835081385422653]
We develop a novel network architecture, termed Input Convex Lipschitz Recurrent Neural Networks.
This model is explicitly designed for fast and robust optimization-based tasks.
We have successfully implemented this model in various practical engineering applications.
arXiv Detail & Related papers (2024-01-15T06:26:53Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- DDPEN: Trajectory Optimisation With Sub Goal Generation Model [70.36888514074022]
In this paper, we present a novel method, Differential Dynamic Programming with Escape Network (DDPEN).
We propose to utilize a deep model that takes as input a map of the environment in the form of a costmap, together with the desired goal position.
The model produces possible future directions that lead to the goal while avoiding local minima, and it can run under real-time conditions.
arXiv Detail & Related papers (2023-01-18T11:02:06Z)
- Neural Stochastic Dual Dynamic Programming [99.80617899593526]
We introduce a trainable neural model that learns to map problem instances to a piece-wise linear value function.
$\nu$-SDDP can significantly reduce problem solving cost without sacrificing solution quality.
arXiv Detail & Related papers (2021-12-01T22:55:23Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between a limit-deterministic generalized Büchi automaton (LDGBA) and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for model-free reinforcement learning (RL) depend only on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z)
- MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning under Kinodynamic Constraints [15.608546987158613]
Kinodynamic Motion Planning (KMP) is the computation of a robot motion subject to concurrent kinematic and dynamic constraints.
We present a scalable, imitation learning-based, Model-Predictive Motion Planning Networks framework that finds near-optimal path solutions.
We evaluate our algorithms on a range of cluttered, kinodynamically constrained, and underactuated planning problems, with results indicating significant improvements in computation time, path quality, and success rate over existing methods.
arXiv Detail & Related papers (2021-01-17T23:07:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.