Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
- URL: http://arxiv.org/abs/2406.04920v3
- Date: Sat, 23 Aug 2025 14:20:46 GMT
- Title: Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
- Authors: Arvi Jonnarth, Ola Johansson, Jie Zhao, Michael Felsberg,
- Abstract summary: Coverage path planning is the problem of finding a path that covers the entire free space of a confined area.<n>We investigate the suitability of continuous-space reinforcement learning for this challenging problem.<n>We show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods.
- Score: 22.077058792635313
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. While for known environments, offline methods can find provably complete paths, and in some cases optimal solutions, unknown environments need to be planned online during mapping. We investigate the suitability of continuous-space reinforcement learning (RL) for this challenging problem, and propose a computationally feasible egocentric map representation based on frontiers, as well as a novel reward term based on total variation to promote complete coverage. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment characteristics. Meanwhile, the deployment of RL models on real robot systems is difficult. Training from scratch may be infeasible due to slow convergence times, while transferring from simulation to reality, i.e. sim-to-real transfer, is a key challenge in itself. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations in simulation. Meanwhile, our method successfully transfers to a real robot. Our code implementation can be found online.
Related papers
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis [94.33978856270268]
Retrieval-augmented generation (RAG) systems have advanced large language models (LLMs) in complex deep search scenarios.<n>Existing approaches face critical limitations that lack high-quality training trajectories and suffer from distributional mismatches.<n>This paper introduces SimpleDeepSearcher, a framework that bridges the gap through strategic data engineering rather than complex training paradigms.
arXiv Detail & Related papers (2025-05-22T16:05:02Z) - A General Infrastructure and Workflow for Quadrotor Deep Reinforcement Learning and Reality Deployment [48.90852123901697]
We propose a platform that enables seamless transfer of end-to-end deep reinforcement learning (DRL) policies to quadrotors.<n>Our platform provides rich types of environments including hovering, dynamic obstacle avoidance, trajectory tracking, balloon hitting, and planning in unknown environments.
arXiv Detail & Related papers (2025-04-21T14:25:23Z) - Neural Fidelity Calibration for Informative Sim-to-Real Adaptation [10.117298045153564]
Deep reinforcement learning can seamlessly transfer agile locomotion and navigation skills from the simulator to real world.
However, bridging the sim-to-real gap with domain randomization or adversarial methods often demands expert physics knowledge to ensure policy robustness.
We propose Neural Fidelity (NFC), a novel framework that employs conditional score-based diffusion models to calibrate simulator physical coefficients and residual fidelity domains online during robot execution.
arXiv Detail & Related papers (2025-04-11T15:12:12Z) - ReGentS: Real-World Safety-Critical Driving Scenario Generation Made Stable [88.08120417169971]
Machine learning based autonomous driving systems often face challenges with safety-critical scenarios that are rare in real-world data.
This work explores generating safety-critical driving scenarios by modifying complex real-world regular scenarios through trajectory optimization.
Our approach addresses unrealistic diverging trajectories and unavoidable collision scenarios that are not useful for training robust planner.
arXiv Detail & Related papers (2024-09-12T08:26:33Z) - Learning to navigate efficiently and precisely in real environments [14.52507964172957]
Embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI-Thor.
In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z) - AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning [17.69984142788365]
Coverage path planning ( CPP) is the problem of finding a path that covers the entire free space of a confined area.
We investigate how suitable reinforcement learning is for this challenging problem.
We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation.
arXiv Detail & Related papers (2023-06-29T14:32:06Z) - Zero-shot Sim2Real Adaptation Across Environments [45.44896435487879]
We propose a Reverse Action Transformation (RAT) policy which learns to imitate simulated policies in the real-world.
RAT can then be deployed on top of a Universal Policy Network to achieve zero-shot adaptation to new environments.
arXiv Detail & Related papers (2023-02-08T11:59:07Z) - Discrete Control in Real-World Driving Environments using Deep
Reinforcement Learning [2.467408627377504]
We introduce a framework (perception, planning, and control) in a real-world driving environment that transfers the real-world environments into gaming environments.
We propose variations of existing Reinforcement Learning (RL) algorithms in a multi-agent setting to learn and execute the discrete control in real-world environments.
arXiv Detail & Related papers (2022-11-29T04:24:03Z) - Provable Sim-to-real Transfer in Continuous Domain with Partial
Observations [39.18274543757048]
Sim-to-real transfer trains RL agents in the simulated environments and then deploys them in the real world.
We show that a popular robust adversarial training algorithm is capable of learning a policy from the simulated environment that is competitive to the optimal policy in the real-world environment.
arXiv Detail & Related papers (2022-10-27T16:37:52Z) - Auto-Tuned Sim-to-Real Transfer [143.44593793640814]
Policies trained in simulation often fail when transferred to the real world.
Current approaches to tackle this problem, such as domain randomization, require prior knowledge and engineering.
We propose a method for automatically tuning simulator system parameters to match the real world.
arXiv Detail & Related papers (2021-04-15T17:59:55Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z) - Reactive Long Horizon Task Execution via Visual Skill and Precondition
Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% success rate in the real-world as compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z) - Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial
Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) represents powerful tools to solve complex robotic tasks.
RL does not work directly in the real-world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z) - RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real [74.45688231140689]
We introduce the RL-scene consistency loss for image translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
We obtain RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning.
arXiv Detail & Related papers (2020-06-16T08:58:07Z) - Sim-to-Real Transfer with Incremental Environment Complexity for
Reinforcement Learning of Depth-Based Robot Navigation [1.290382979353427]
Soft-Actor Critic (SAC) training strategy using incremental environment complexity is proposed to drastically reduce the need for additional training in the real world.
The application addressed is depth-based mapless navigation, where a mobile robot should reach a given waypoint in a cluttered environment with no prior mapping information.
arXiv Detail & Related papers (2020-04-30T10:47:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.