Decentralized Reinforcement Learning for Multi-Target Search and
Detection by a Team of Drones
- URL: http://arxiv.org/abs/2103.09520v1
- Date: Wed, 17 Mar 2021 09:04:47 GMT
- Title: Decentralized Reinforcement Learning for Multi-Target Search and
Detection by a Team of Drones
- Authors: Roi Yehoshua, Juan Heredia-Juesas, Yushu Wu, Christopher Amato, Jose
Martinez-Lorenzo
- Abstract summary: Targets search and detection encompasses a variety of decision problems such as coverage, surveillance, search, observing and pursuit-evasion.
We develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) for the purpose of locating a set of static targets in an unknown area.
- Score: 12.055303570215335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Targets search and detection encompasses a variety of decision problems such
as coverage, surveillance, search, observing and pursuit-evasion along with
others. In this paper we develop a multi-agent deep reinforcement learning
(MADRL) method to coordinate a group of aerial vehicles (drones) for the
purpose of locating a set of static targets in an unknown area. To that end, we
have designed a realistic drone simulator that replicates the dynamics and
perturbations of a real experiment, including statistical inferences taken from
experimental data for its modeling. Our reinforcement learning method, which
utilized this simulator for training, was able to find near-optimal policies
for the drones. In contrast to other state-of-the-art MADRL methods, our method
is fully decentralized during both learning and execution, can handle
high-dimensional and continuous observation spaces, and does not require tuning
of additional hyperparameters.
Related papers
- Reinforcement Learning with Action Sequence for Data-Efficient Robot Learning [62.3886343725955]
We introduce a novel RL algorithm that learns a critic network that outputs Q-values over a sequence of actions.
By explicitly training the value functions to learn the consequence of executing a series of current and future actions, our algorithm allows for learning useful value functions from noisy trajectories.
arXiv Detail & Related papers (2024-11-19T01:23:52Z) - Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Federated Learning for Misbehaviour Detection with Variational Autoencoders and Gaussian Mixture Models [0.2999888908665658]
Federated Learning (FL) has become an attractive approach to collaboratively train Machine Learning (ML) models.
This work proposes a novel unsupervised FL approach for the identification of potential misbehavior in vehicular environments.
We leverage the computing capabilities of public cloud services for model aggregation purposes.
arXiv Detail & Related papers (2024-05-16T08:49:50Z) - Enhancing Robotic Navigation: An Evaluation of Single and
Multi-Objective Reinforcement Learning Strategies [0.9208007322096532]
This study presents a comparative analysis between single-objective and multi-objective reinforcement learning methods for training a robot to navigate effectively to an end goal.
By modifying the reward function to return a vector of rewards, each pertaining to a distinct objective, the robot learns a policy that effectively balances the different goals.
arXiv Detail & Related papers (2023-12-13T08:00:26Z) - Distributed multi-agent target search and tracking with Gaussian process
and reinforcement learning [26.499110405106812]
We propose a multi-agent reinforcement learning technique with target map building based on distributed process.
We evaluate the performance and transferability of the trained policy in simulation and demonstrate the method on a swarm of micro unmanned aerial vehicles.
arXiv Detail & Related papers (2023-08-29T01:53:14Z) - Predictive Experience Replay for Continual Visual Control and
Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z) - Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z) - Aerial View Goal Localization with Reinforcement Learning [6.165163123577484]
We present a framework that emulates a search-and-rescue (SAR)-like setup without requiring access to actual UAVs.
In this framework, an agent operates on top of an aerial image (proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues.
We propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals)
arXiv Detail & Related papers (2022-09-08T10:27:53Z) - Space Non-cooperative Object Active Tracking with Deep Reinforcement
Learning [1.212848031108815]
We propose an end-to-end active visual tracking method based on DQN algorithm, named as DRLAVT.
It can guide the chasing spacecraft approach to arbitrary space non-cooperative target merely relied on color or RGBD images.
It significantly outperforms position-based visual servoing baseline algorithm that adopts state-of-the-art 2D monocular tracker, SiamRPN.
arXiv Detail & Related papers (2021-12-18T06:12:24Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Model-based Reinforcement Learning for Decentralized Multiagent
Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.