XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision
Trees
- URL: http://arxiv.org/abs/2104.10818v1
- Date: Thu, 22 Apr 2021 01:33:10 GMT
- Title: XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision
Trees
- Authors: Aaron M. Roth, Jing Liang, and Dinesh Manocha
- Abstract summary: We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses a deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and in navigating a Clearpath Jackal robot among moving pedestrians.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a novel sensor-based learning navigation algorithm to compute a
collision-free trajectory for a robot in dense and dynamic environments with
moving obstacles or targets. Our approach uses a deep reinforcement
learning-based expert policy that is trained using a sim2real paradigm. In
order to increase reliability and handle failure cases of the expert
policy, we combine it with a policy extraction technique that transforms the
resulting policy into a decision tree format. The resulting decision tree has
properties which we use to analyze and modify the policy and improve
performance on navigation metrics including smoothness, frequency of
oscillation, frequency of immobilization, and obstruction of target. We are
able to modify the policy to address these imperfections without retraining,
combining the learning power of deep learning with the control of
domain-specific algorithms. We highlight the benefits of our algorithm in
simulated environments and in navigating a Clearpath Jackal robot among moving
pedestrians.
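The policy-extraction idea in the abstract can be illustrated with a minimal sketch: query an expert policy to build a state-action dataset, fit a tiny decision tree (here a depth-1 stump over a single hypothetical "obstacle distance" feature), and then patch the tree by hand without retraining. The expert, feature, and threshold edit below are all invented stand-ins, not the paper's actual pipeline.

```python
# Minimal sketch of policy extraction into an editable decision tree.
# The "expert" here is a hand-coded stand-in for a trained deep RL
# navigation policy; the single feature and its values are invented.

def expert_policy(obstacle_distance):
    """Stand-in expert: turn when an obstacle is close, else go forward."""
    return "turn" if obstacle_distance < 1.0 else "forward"

# 1. Collect (state, action) pairs by querying the expert.
states = [0.2, 0.5, 0.9, 1.1, 1.5, 2.0, 3.0]
dataset = [(s, expert_policy(s)) for s in states]

# 2. Fit a depth-1 decision stump: search candidate thresholds (midpoints
#    of adjacent states) for the split that best mimics the expert.
def fit_stump(data):
    best = None
    xs = sorted(s for s, _ in data)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [a for s, a in data if s < t]
        right = [a for s, a in data if s >= t]
        l_act = max(set(left), key=left.count)    # majority action, left side
        r_act = max(set(right), key=right.count)  # majority action, right side
        errors = sum(a != l_act for a in left) + sum(a != r_act for a in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_act, r_act)
    return {"threshold": best[1], "left": best[2], "right": best[3]}

tree = fit_stump(dataset)

def tree_policy(s, tree):
    return tree["left"] if s < tree["threshold"] else tree["right"]

# 3. The tree is interpretable, so an observed failure mode (e.g. cutting
#    too close to obstacles) can be patched by editing a node directly,
#    without retraining the underlying deep policy.
tree["threshold"] = 1.2   # widen the safety margin by hand
```

The point of the sketch is the last line: because the distilled policy is a small, readable tree, a domain-specific fix is a one-line edit rather than another training run.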
Related papers
- Research on Autonomous Robots Navigation based on Reinforcement Learning
We use the Deep Q Network (DQN) and Proximal Policy Optimization (PPO) models to optimize the path planning and decision-making process.
We have verified the effectiveness and robustness of these models in various complex scenarios.
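The value-learning core shared by DQN-style navigation agents can be sketched with a tabular stand-in. Everything below (the 1-D corridor environment, rewards, hyperparameters) is invented for illustration; DQN replaces the Q-table with a neural network and adds replay and target networks.

```python
import random

# Tabular Q-learning sketch of the Bellman update that DQN-style
# navigation agents perform with a neural network instead of a table.
# The 1-D corridor environment below is invented for illustration.

GOAL = 4                      # rightmost cell is the navigation target
ACTIONS = [+1, -1]            # step right / step left
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}

def step(s, a):
    s2 = min(max(s + a, 0), 4)
    reward = 1.0 if s2 == GOAL else 0.0
    return s2, reward, s2 == GOAL

random.seed(0)
for _ in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        # Bellman target: r + gamma * max_a' Q(s', a')
        target = r + (0.0 if done else GAMMA * max(Q[(s2, x)] for x in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# Greedy policy after training: move right toward the goal from every cell.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(4)}
```

PPO, the other method these papers use, instead optimizes the policy directly with a clipped surrogate objective; the sketch above covers only the value-based branch.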
arXiv Detail & Related papers (2024-07-02T00:44:06Z)
- Deep Reinforcement Learning with Enhanced PPO for Safe Mobile Robot Navigation
This study investigates the application of deep reinforcement learning to train a mobile robot for autonomous navigation in a complex environment.
The robot utilizes LiDAR sensor data and a deep neural network to generate control signals guiding it toward a specified target while avoiding obstacles.
arXiv Detail & Related papers (2024-05-25T15:08:36Z)
- Robot path planning using deep reinforcement learning
Reinforcement learning methods offer a map-free alternative for navigation tasks.
Deep reinforcement learning agents are implemented for both the obstacle avoidance and the goal-oriented navigation task.
An analysis of the changes in the behaviour and performance of the agents caused by modifications in the reward function is conducted.
arXiv Detail & Related papers (2023-02-17T20:08:59Z)
- MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation
We present Multiple Scenario Verifiable Reinforcement Learning via Policy Extraction (MSVIPER)
MSVIPER learns an "expert" policy using any Reinforcement Learning (RL) technique that learns a state-action mapping.
We demonstrate that MSVIPER results in efficient decision trees and can accurately mimic the behavior of the expert policy.
arXiv Detail & Related papers (2022-09-19T15:12:53Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
We propose State-Conservative Policy Optimization (SCPO), a novel model-free actor-critic algorithm that learns robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- Teaching a Robot to Walk Using Reinforcement Learning
Reinforcement learning can train optimal walking policies with ease.
We teach a simulated two-dimensional bipedal robot how to walk using the OpenAI Gym BipedalWalker-v3 environment.
Augmented Random Search (ARS) resulted in a better-trained robot and produced an optimal policy that officially "solves" the BipedalWalker-v3 problem.
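The core ARS loop is simple enough to sketch in a few lines: perturb the policy parameters in random directions, evaluate rollouts at theta + delta and theta - delta, and step the parameters along the directions weighted by the reward difference. The quadratic "reward" below is an invented stand-in for a BipedalWalker rollout, and the sketch omits ARS's reward normalization and top-direction filtering.

```python
import random

# Stripped-down sketch of the Augmented Random Search (ARS) update:
# a derivative-free finite-difference step over random directions.
# rollout_reward is a toy stand-in for an environment rollout.

def rollout_reward(theta):
    """Stand-in for a simulated rollout; reward peaks at theta = 3.0."""
    return -(theta - 3.0) ** 2

random.seed(0)
theta = 0.0
STEP, NOISE, N_DIRECTIONS = 0.05, 0.1, 8

for _ in range(200):
    update = 0.0
    for _ in range(N_DIRECTIONS):
        delta = random.gauss(0.0, 1.0)
        r_plus = rollout_reward(theta + NOISE * delta)
        r_minus = rollout_reward(theta - NOISE * delta)
        # step toward directions where the + perturbation did better
        update += (r_plus - r_minus) * delta
    theta += STEP * update / N_DIRECTIONS
```

Because each update needs only rollout returns, not gradients through the dynamics, ARS suits locomotion tasks where the simulator is a black box.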
arXiv Detail & Related papers (2021-12-13T21:35:45Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Meta-Gradient Reinforcement Learning with an Objective Discovered Online
We propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network.
Because the objective is discovered online, it can adapt to changes over time.
On the Atari Learning Environment, the meta-gradient algorithm adapts over time to learn with greater efficiency.
arXiv Detail & Related papers (2020-07-16T16:17:09Z)
- Robust Reinforcement Learning with Wasserstein Constraint
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.