Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2306.16978v4
- Date: Fri, 7 Jun 2024 08:39:38 GMT
- Title: Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
- Authors: Arvi Jonnarth, Jie Zhao, Michael Felsberg
- Abstract summary: Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area.
We investigate how suitable reinforcement learning is for this challenging problem.
We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation.
- Score: 17.69984142788365
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. When the environment is unknown, the path needs to be planned online while mapping the environment, which cannot be addressed by offline planning methods that do not allow for a flexible path space. We investigate how suitable reinforcement learning is for this challenging problem, and analyze the involved components required to efficiently learn coverage paths, such as action space, input feature representation, neural network architecture, and reward function. We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation to promote complete coverage. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations.
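As a rough illustration of how a total-variation reward term might operate on a coverage map, here is a minimal sketch in Python; the paper's exact formulation may differ, and the function names and the `weight` parameter are illustrative assumptions:

```python
import numpy as np

def total_variation(coverage_map: np.ndarray) -> float:
    """Anisotropic total variation of a 2D coverage map: the sum of
    absolute differences between vertically and horizontally adjacent
    cells. Low TV means compact, hole-free coverage."""
    dv = np.abs(np.diff(coverage_map, axis=0)).sum()
    dh = np.abs(np.diff(coverage_map, axis=1)).sum()
    return float(dv + dh)

def tv_reward(prev_map: np.ndarray, new_map: np.ndarray,
              weight: float = 0.1) -> float:
    """Illustrative reward term: penalize increases in total variation
    between steps, discouraging ragged, fragmented coverage patterns."""
    return -weight * (total_variation(new_map) - total_variation(prev_map))
```

Penalizing growth in total variation encourages the agent to extend existing covered regions rather than scatter isolated patches, which is one way a TV term can promote complete coverage.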
Related papers
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation.
Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows.
We propose a new LLM-based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs.
This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
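This summary does not specify how the LLM output enters the search, so the sketch below assumes the LLM supplies a list of intermediate waypoints and routes the A* heuristic through them; the biased heuristic is inadmissible by design, trading strict optimality for adherence to the LLM's global plan:

```python
import heapq
from typing import List, Optional, Set, Tuple

Cell = Tuple[int, int]

def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(free: Set[Cell], start: Cell, goal: Cell,
          waypoints: Optional[List[Cell]] = None) -> Optional[List[Cell]]:
    """Grid A* over a set of free cells. `waypoints` is a hypothetical
    hook for LLM guidance: the heuristic is routed through the cheapest
    suggested waypoint, steering expansion toward that corridor."""
    wps = waypoints or []

    def h(c: Cell) -> int:
        if not wps:
            return manhattan(c, goal)
        return min(manhattan(c, w) + manhattan(w, goal) for w in wps)

    g, parent = {start: 0}, {}
    heap = [(h(start), start)]
    while heap:
        _, cur = heapq.heappop(heap)
        if cur == goal:
            path = [cur]
            while path[-1] in parent:
                path.append(parent[path[-1]])
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dx, cur[1] + dy)
            if nxt in free and g[cur] + 1 < g.get(nxt, float("inf")):
                g[nxt] = g[cur] + 1
                parent[nxt] = cur
                heapq.heappush(heap, (g[nxt] + h(nxt), nxt))
    return None  # no path found
```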
arXiv Detail & Related papers (2024-06-20T01:24:30Z) - Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific policies, within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - Learning to Recharge: UAV Coverage Path Planning through Deep Reinforcement Learning [5.475990395948956]
Coverage path planning (CPP) is a critical problem in robotics, where the goal is to find an efficient path that covers every point in an area of interest.
This work addresses the power-constrained CPP problem with recharge for battery-limited unmanned aerial vehicles (UAVs).
We propose a novel proximal policy optimization (PPO)-based deep reinforcement learning (DRL) approach with map-based observations.
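"Map-based observations" commonly means stacking spatial layers into an image-like tensor for a convolutional policy; below is a minimal sketch under that assumption (the specific layers, and broadcasting the battery level as a constant plane, are illustrative choices, not necessarily the paper's design):

```python
import numpy as np

def build_observation(coverage: np.ndarray, obstacles: np.ndarray,
                      position: tuple, battery: float) -> np.ndarray:
    """Stack 2D map layers into a C x H x W observation tensor.
    `coverage` and `obstacles` are float arrays of equal shape,
    `position` is a (row, col) cell index, `battery` is in [0, 1]."""
    pos_layer = np.zeros_like(coverage)
    pos_layer[position] = 1.0                     # one-hot agent position
    batt_layer = np.full_like(coverage, battery)  # scalar as constant plane
    return np.stack([coverage, obstacles,
                     pos_layer, batt_layer]).astype(np.float32)
```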
arXiv Detail & Related papers (2023-09-06T16:55:11Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
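For reference, maximizing "both entropy and return" is the standard maximum-entropy objective that soft actor-critic optimizes, with a temperature $\alpha$ weighting the policy-entropy bonus:

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \Big[\, r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\Big]
```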
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality [57.91411772725183]
In this paper, we consider the offline stochastic shortest path problem when the state space and the action space are finite.
We design simple value-based algorithms for tackling both offline policy evaluation (OPE) and offline policy learning tasks.
Our analysis of these simple algorithms yields strong instance-dependent bounds which can imply worst-case bounds that are near-minimax optimal.
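As a minimal sketch of the value-based idea for a known finite SSP model (the paper's offline algorithms would first estimate transitions and costs from logged data; the shapes and names here are illustrative):

```python
import numpy as np

def ssp_value_iteration(P: np.ndarray, c: np.ndarray, goal: int,
                        iters: int = 10_000, tol: float = 1e-8) -> np.ndarray:
    """Value iteration for a finite stochastic shortest path problem.
    P[s, a, s'] are transition probabilities, c[s, a] >= 0 are costs,
    and the goal state is treated as absorbing with zero cost."""
    n_states = P.shape[0]
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = c + P @ V          # Q[s, a] = c[s, a] + sum_s' P[s, a, s'] * V[s']
        V_new = Q.min(axis=1)  # act greedily with respect to cost-to-go
        V_new[goal] = 0.0      # stopping at the goal incurs no further cost
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V
```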
arXiv Detail & Related papers (2022-06-10T07:44:56Z) - Neural Motion Planning for Autonomous Parking [6.1805402105389895]
This paper presents a hybrid motion planning strategy that combines a deep generative network with a conventional motion planning method.
The proposed method effectively learns the representations of a given state and improves planning performance.
arXiv Detail & Related papers (2021-11-12T14:29:38Z) - Reinforcement Learning-Based Coverage Path Planning with Implicit Cellular Decomposition [5.2424255020469595]
This paper provides a systematic analysis of the coverage problem and formulates it as an optimal stopping time problem.
We show that reinforcement learning-based algorithms efficiently cover realistic unknown indoor environments.
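One hedged way to write the coverage problem in optimal-stopping form (a generic formulation, not necessarily the paper's exact one): the agent pays a per-step cost until it chooses to stop at time $\tau$, and any uncovered remainder is penalized at stopping,

```latex
\min_{\pi,\,\tau} \; \mathbb{E}\left[ \sum_{t=0}^{\tau - 1} c(s_t, a_t)
  \;+\; \lambda \left( 1 - C_\tau \right) \right]
```

where $C_\tau \in [0, 1]$ is the fraction of free space covered at time $\tau$ and $\lambda$ trades off travel cost against completeness.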
arXiv Detail & Related papers (2021-10-18T05:18:52Z) - Adaptive Informative Path Planning Using Deep Reinforcement Learning for UAV-based Active Sensing [2.6519061087638014]
We propose a new approach for informative path planning based on deep reinforcement learning (RL).
Our method combines Monte Carlo tree search with an offline-learned neural network predicting informative sensing actions.
By deploying the trained network during a mission, our method enables sample-efficient online replanning on physical platforms with limited computational resources.
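Combining Monte Carlo tree search with an offline-learned prior over actions typically takes the PUCT form popularized by AlphaZero-style planners; a generic selection-step sketch (the paper's exact MCTS variant may differ):

```python
import math
from typing import Dict

def puct_select(counts: Dict[str, int], value_sums: Dict[str, float],
                priors: Dict[str, float], c_puct: float = 1.5) -> str:
    """Pick the child action maximizing Q(a) + U(a), where U(a) boosts
    actions the learned network considers promising but the tree has
    visited rarely. All dicts are keyed by the same action labels."""
    total_visits = sum(counts.values()) + 1

    def score(a: str) -> float:
        q = value_sums[a] / counts[a] if counts[a] > 0 else 0.0
        u = c_puct * priors[a] * math.sqrt(total_visits) / (1 + counts[a])
        return q + u

    return max(priors, key=score)
```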
arXiv Detail & Related papers (2021-09-28T09:00:55Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
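A simplified beam-search rendition of that idea for the TSP appears below; true DPDP merges dynamic-programming states by (visited set, last node), whereas this sketch only keeps the most promising partial tours, and `heat[i][j]` is an assumed neural edge score in [0, 1]:

```python
import heapq

def restricted_dp_tsp(dist, heat, beam_width=128):
    """Expand partial tours city by city, keeping only the `beam_width`
    candidates whose cost, discounted by the neural heat map's edge
    scores, is lowest. Returns a closed tour starting/ending at city 0."""
    n = len(dist)
    # State: (priority, true cost, visited bitmask, last city, path).
    beam = [(0.0, 0.0, 1, 0, [0])]
    for _ in range(n - 1):
        candidates = []
        for prio, cost, mask, last, path in beam:
            for j in range(n):
                if not mask & (1 << j):
                    step = dist[last][j]
                    # Edges with high heat are penalized less.
                    candidates.append((prio + step * (1.0 - heat[last][j]),
                                       cost + step, mask | (1 << j),
                                       j, path + [j]))
        beam = heapq.nsmallest(beam_width, candidates, key=lambda s: s[0])
    best = min(beam, key=lambda s: s[1] + dist[s[3]][0])
    return best[4] + [0], best[1] + dist[best[3]][0]
```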
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - Scalable Bayesian Inverse Reinforcement Learning [93.27920030279586]
We introduce Approximate Variational Reward Imitation Learning (AVRIL).
Our method addresses the ill-posed nature of the inverse reinforcement learning problem.
Applying our method to real medical data alongside classic control simulations, we demonstrate Bayesian reward inference in environments beyond the scope of current methods.
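At a high level, variational approaches of this kind maximize an evidence lower bound over an approximate posterior on rewards; in generic form (not AVRIL's exact objective),

```latex
\log p(\mathcal{D}) \;\ge\;
  \mathbb{E}_{q_\phi(r)}\!\left[ \log p(\mathcal{D} \mid r) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(r) \,\|\, p(r) \right)
```

where $\mathcal{D}$ are the demonstrations, $p(r)$ is a reward prior, and $q_\phi(r)$ is the learned variational posterior.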
arXiv Detail & Related papers (2021-02-12T12:32:02Z) - Flexible and Efficient Long-Range Planning Through Curious Exploration [13.260508939271764]
We show that the Curious Sample Planner can efficiently discover temporally-extended plans for solving a wide range of physically realistic 3D tasks.
In contrast, standard planning and learning methods often fail to solve these tasks at all or do so only with a huge and highly variable number of training samples.
arXiv Detail & Related papers (2020-04-22T21:47:29Z)