Reinforcement Learning for UAV control with Policy and Reward Shaping
- URL: http://arxiv.org/abs/2212.03828v1
- Date: Tue, 6 Dec 2022 14:46:13 GMT
- Title: Reinforcement Learning for UAV control with Policy and Reward Shaping
- Authors: Cristian Millán-Arias, Ruben Contreras, Francisco Cruz and Bruno
Fernandes
- Abstract summary: This study teaches an RL agent to control a drone using reward-shaping and policy-shaping techniques simultaneously.
The results show that an agent trained simultaneously with both techniques obtains a lower reward than an agent trained using only a policy-based approach.
- Score: 0.7127008801193563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, unmanned aerial vehicle (UAV) technology has
expanded rapidly, bringing to light new problems and challenges that require
solutions. Furthermore, because the technology allows processes
usually carried out by people to be automated, it is in great demand in
industrial sectors. The automation of these vehicles has been addressed in the
literature, applying different machine learning strategies. Reinforcement
learning (RL) is an automation framework that is frequently used to train
autonomous agents. RL is a machine learning paradigm wherein an agent interacts
with an environment to solve a given task. However, learning autonomously can
be time-consuming, computationally expensive, and may not be practical in
highly complex scenarios. Interactive reinforcement learning allows an external
trainer to provide advice to an agent while it is learning a task. In this
study, we set out to teach an RL agent to control a drone using reward-shaping
and policy-shaping techniques simultaneously. Two simulated scenarios were
proposed for the training; one without obstacles and one with obstacles. We
also studied the influence of each technique. The results show that an agent
trained simultaneously with both techniques obtains a lower reward than an
agent trained using only a policy-based approach. Nevertheless, the agent
achieves lower execution times and less dispersion during training.
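The two techniques named in the abstract can be illustrated with a minimal sketch. The 1-D corridor, the potential function, and the simulated trainer below are illustrative assumptions, not the paper's actual drone environment: reward shaping adds a potential-based bonus F(s, s') = γΦ(s') − Φ(s) to the environment reward, while policy shaping lets an external trainer override the agent's action choice with some probability.

```python
import random

# Illustrative constants (not from the paper): a 1-D corridor with the goal
# at the right end stands in for the drone-control task.
N_STATES, GOAL = 10, 9
GAMMA, ALPHA, EPSILON = 0.95, 0.5, 0.1
ACTIONS = (-1, +1)  # move left / move right

def potential(s):
    """Potential for reward shaping: states nearer the goal score higher."""
    return -abs(GOAL - s)

def trainer_advice(s):
    """Simulated external trainer for policy shaping: point toward the goal."""
    return +1 if s < GOAL else -1

def step(s, a):
    """Transition with a potential-based shaping bonus
    F(s, s') = gamma * phi(s') - phi(s) added to the sparse goal reward."""
    s2 = min(N_STATES - 1, max(0, s + a))
    reward = 1.0 if s2 == GOAL else 0.0
    reward += GAMMA * potential(s2) - potential(s)
    return s2, reward, s2 == GOAL

def train(episodes=200, advice_prob=0.5, seed=0):
    """Tabular Q-learning that follows trainer advice with probability advice_prob."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        for _ in range(50):
            if rng.random() < advice_prob:      # policy shaping: take advice
                a = trainer_advice(s)
            elif rng.random() < EPSILON:        # exploration
                a = rng.choice(ACTIONS)
            else:                               # greedy on the agent's own Q-values
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += ALPHA * (target - q[(s, a)])
            s = s2
            if done:
                break
    return q
```

A greedy rollout of the learned table walks straight to the goal; setting `advice_prob=0` or dropping the shaping bonus isolates each technique, mirroring the paper's study of each method's individual influence.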
Related papers
- Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode [8.017543518311196]
Reinforcement learning is not yet competitive for many cyber-physical systems.
We train the reinforcement agent in a so-called shadow mode with the assistance of an existing conventional controller.
In shadow mode, the agent relies on the controller to provide action samples and guidance towards favourable states to learn the task.
arXiv Detail & Related papers (2024-10-30T19:52:52Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
- Self-Inspection Method of Unmanned Aerial Vehicles in Power Plants Using Deep Q-Network Reinforcement Learning [0.0]
The research proposes a power plant inspection system incorporating UAV autonomous navigation and DQN reinforcement learning.
The trained model enables the UAV to navigate on its own in difficult environments, making the inspection strategy more likely to be applied in practice.
arXiv Detail & Related papers (2023-03-16T00:58:50Z)
- Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse Skills [28.39150937658635]
We propose a post-hoc technique for policy fusion using Optimal Transport theory.
This provides an improved weights initialisation of the neural network policy for learning new tasks.
Our results show that specialised knowledge can be unified into a "Renaissance agent", allowing for quicker learning of new skills.
arXiv Detail & Related papers (2022-07-03T08:15:41Z)
- Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II [0.5911087507716211]
In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors.
Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
arXiv Detail & Related papers (2022-05-11T21:53:11Z)
- Coach-assisted Multi-Agent Reinforcement Learning Framework for Unexpected Crashed Agents [120.91291581594773]
We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes.
We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training.
To the best of our knowledge, this work is the first to study unexpected crashes in multi-agent systems.
arXiv Detail & Related papers (2022-03-16T08:22:45Z)
- Automating Privilege Escalation with Deep Reinforcement Learning [71.87228372303453]
In this work, we exemplify the potential threat of malicious actors using deep reinforcement learning to train automated agents.
We present an agent that uses a state-of-the-art reinforcement learning algorithm to perform local privilege escalation.
Our agent is usable for generating realistic attack sensor data for training and evaluating intrusion detection systems.
arXiv Detail & Related papers (2021-10-04T12:20:46Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks [70.56451186797436]
We study how to use meta-reinforcement learning to solve the bulk of the problem in simulation.
We demonstrate our approach by training an agent to successfully perform challenging real-world insertion tasks.
arXiv Detail & Related papers (2020-04-29T18:00:22Z)
- Deep Adversarial Reinforcement Learning for Object Disentangling [36.66974848126079]
We present a novel adversarial reinforcement learning (ARL) framework for disentangling waste objects.
The ARL framework utilizes an adversary, which is trained to steer the original agent, the protagonist, to challenging states.
We show that our method can generalize from training to test scenarios by training an end-to-end system for robot control to solve a challenging object disentangling task.
arXiv Detail & Related papers (2020-03-08T13:20:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.