A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle
Avoidance
- URL: http://arxiv.org/abs/2103.06403v1
- Date: Thu, 11 Mar 2021 01:15:26 GMT
- Title: A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle
Avoidance
- Authors: Jeremy Roghair, Kyungtae Ko, Amir Ehsan Niaraki Asli and Ali Jannesari
- Abstract summary: We present two techniques for improving exploration for UAV obstacle avoidance.
The first is a convergence-based approach that uses convergence error to iterate through unexplored actions and a temporal threshold to balance exploration and exploitation.
The second is a guidance-based approach which uses a Gaussian mixture distribution to compare previously seen states to a predicted next state in order to select the next action.
- Score: 1.2693545159861856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Integration of reinforcement learning with unmanned aerial vehicles (UAVs) to
achieve autonomous flight has been an active research area in recent years. An
important part focuses on obstacle detection and avoidance for UAVs navigating
through an environment. Exploration in an unseen environment can be tackled
with Deep Q-Network (DQN). However, value exploration with uniform sampling of
actions may lead to redundant states, where often the environments inherently
bear sparse rewards. To resolve this, we present two techniques for improving
exploration for UAV obstacle avoidance. The first is a convergence-based
approach that uses convergence error to iterate through unexplored actions and
a temporal threshold to balance exploration and exploitation. The second is a
guidance-based approach using a Domain Network which uses a Gaussian mixture
distribution to compare previously seen states to a predicted next state in
order to select the next action. These approaches were evaluated in multiple
3-D simulation environments of varying complexity. The proposed approach
demonstrates a two-fold improvement in average rewards compared to the state
of the art.
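The contrast the abstract draws between uniform action sampling and mixture-guided selection can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy mixture parameters, and in particular the rule of preferring the action whose predicted next state is least likely under the mixture of previously seen states are all assumptions made for illustration. The abstract only says that the mixture is used to compare seen states with a predicted next state.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Baseline exploration: with probability epsilon, sample an action uniformly."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def gmm_log_density(x, weights, means, variances):
    """Log density of point x under a diagonal-covariance Gaussian mixture."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def guided_action(predicted_next_states, weights, means, variances):
    """Assumed rule: pick the action whose predicted next state looks least familiar."""
    scores = [
        gmm_log_density(s, weights, means, variances)
        for s in predicted_next_states
    ]
    return min(range(len(scores)), key=lambda a: scores[a])


# Hypothetical mixture summarising previously seen 2-D state features.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [1.0, 1.0]]
variances = [[0.1, 0.1], [0.1, 0.1]]

# Hypothetical predicted next state for each of three candidate actions.
predicted = [[0.0, 0.1], [1.0, 0.9], [3.0, 3.0]]

chosen = guided_action(predicted, weights, means, variances)
```

In this toy setup the third candidate lies far from both mixture components, so the guided rule selects it, whereas `epsilon_greedy` would pick among all three actions with equal probability whenever it explores.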
Related papers
- Shrinking POMCP: A Framework for Real-Time UAV Search and Rescue [10.399964979693996]
We present a comprehensive approach to optimize UAV-based search and rescue operations in neighborhood areas.
The path planning problem is formulated as a partially observable Markov decision process (POMDP).
We propose a novel "Shrinking POMCP" approach to address time constraints.
arXiv Detail & Related papers (2024-11-20T01:41:29Z)
- On-policy Actor-Critic Reinforcement Learning for Multi-UAV Exploration [0.7373617024876724]
Unmanned aerial vehicles (UAVs) have become increasingly popular in various fields, including precision agriculture, search and rescue, and remote sensing.
This study aims to address this challenge by utilizing on-policy Reinforcement Learning (RL) with Proximal Policy Optimization (PPO) to explore the two dimensional area of interest with multiple UAVs.
The proposed solution includes actor-critic networks using deep convolutional neural networks (CNN) and long short-term memory (LSTM) for identifying the UAVs and areas that have already been covered.
arXiv Detail & Related papers (2024-09-17T10:36:46Z)
- RaCIL: Ray Tracing based Multi-UAV Obstacle Avoidance through Composite Imitation Learning [1.934627691560021]
We address the challenge of obstacle avoidance for Unmanned Aerial Vehicles (UAVs) through an innovative imitation learning approach.
Our research underscores the significant role of ray-tracing in enhancing obstacle detection and avoidance capabilities.
Our approach paves the way for advanced autonomous UAV operations in crowded or dynamic environments.
arXiv Detail & Related papers (2024-06-24T17:43:24Z)
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z)
- Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely-used scene representations for visual perception in Autonomous Vehicles (AVs).
Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV.
Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- UN-AVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring [2.578242050187029]
UN-AVOIDS is an unsupervised and nonparametric approach for both visualization (a human process) and detection (an algorithmic process) of outliers.
It transforms data into a new space, which is introduced in this paper as neighborhood cumulative density function (NCDF).
In terms of AUC, UN-AVOIDS was almost an overall winner.
arXiv Detail & Related papers (2021-11-19T02:31:06Z)
- Trajectory Design for UAV-Based Internet-of-Things Data Collection: A Deep Reinforcement Learning Approach [93.67588414950656]
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a 3D environment.
We present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm.
Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning based baseline methods.
arXiv Detail & Related papers (2021-07-23T03:33:29Z)
- Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction [71.97877759413272]
Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions.
Recent methods have achieved strong performances using Multi-Choice Learning objectives like winner-takes-all (WTA) or best-of-many.
Our work addresses two key challenges in trajectory prediction: learning outputs, and making better predictions by imposing constraints using driving knowledge.
arXiv Detail & Related papers (2021-04-16T17:58:56Z)
- Autonomous and cooperative design of the monitor positions for a team of UAVs to maximize the quantity and quality of detected objects [0.5801044612920815]
This paper tackles the problem of positioning a swarm of UAVs inside a completely unknown terrain.
YOLOv3 and a system to identify duplicate objects of interest were employed to assign a single score to each UAVs' configuration.
A novel navigation algorithm, capable of optimizing the previously defined score, is proposed.
arXiv Detail & Related papers (2020-07-02T16:52:57Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.