A reinforcement learning path planning approach for range-only
underwater target localization with autonomous vehicles
- URL: http://arxiv.org/abs/2301.06863v1
- Date: Tue, 17 Jan 2023 13:16:16 GMT
- Title: A reinforcement learning path planning approach for range-only
underwater target localization with autonomous vehicles
- Authors: Ivan Masmitja, Mario Martin, Kakani Katija, Spartacus Gomariz, Joan
Navarro
- Abstract summary: In ROSB target localization methods, the trajectory of the tracking vehicle near the localized target plays an important role in obtaining the best accuracy of the predicted target position.
Here, we investigate a Reinforcement Learning (RL) approach to find the optimal path that an autonomous vehicle should follow in order to increase and optimize the overall accuracy of the predicted target localization.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Underwater target localization using range-only and single-beacon (ROSB)
techniques with autonomous vehicles has been used recently to improve the
limitations of more complex methods, such as long baseline and ultra-short
baseline systems. Nonetheless, in ROSB target localization methods, the
trajectory of the tracking vehicle near the localized target plays an important
role in obtaining the best accuracy of the predicted target position. Here, we
investigate a Reinforcement Learning (RL) approach to find the optimal path
that an autonomous vehicle should follow in order to increase and optimize the
overall accuracy of the predicted target localization, while reducing time and
power consumption. To accomplish this objective, different experimental tests
have been designed using state-of-the-art deep RL algorithms. Our study also
compares the results obtained with the analytical Fisher information matrix
approach used in previous studies. The results revealed that the policy learned
by the RL agent outperforms trajectories based on these analytical solutions,
e.g., the median predicted error at the beginning of the target's localization
is 17% lower. These findings suggest that using deep RL for localizing acoustic
targets could be successfully applied to in-water applications that include
tracking of acoustically tagged marine animals by autonomous underwater
vehicles. This is envisioned as a first necessary step to validate the use of
RL to tackle such problems, which could later be applied in more complex
scenarios.
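The analytical baseline the abstract compares against can be illustrated with a minimal numpy sketch. For range-only measurements with Gaussian noise, each range from a vehicle position contributes a rank-one term to the Fisher information matrix, and a larger FIM determinant means a tighter Cramér-Rao lower bound on the target estimate. The function names and the two candidate trajectories below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def range_fim(vehicle_positions, target, sigma=1.0):
    """Fisher information matrix for 2-D range-only measurements.

    Each measurement taken from vehicle position p contributes
    (1/sigma^2) * u u^T, where u is the unit vector from the target
    toward p and sigma is the range-noise standard deviation.
    """
    fim = np.zeros((2, 2))
    for p in vehicle_positions:
        d = np.asarray(p, dtype=float) - target
        u = d / np.linalg.norm(d)
        fim += np.outer(u, u) / sigma**2
    return fim

# Two hypothetical candidate trajectories around a target at the origin:
target = np.array([0.0, 0.0])
# (a) a circle around the target -> bearings spread evenly
circle = [100.0 * np.array([np.cos(a), np.sin(a)])
          for a in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
# (b) a short transect far from the target -> bearings nearly identical
transect = [np.array([x, 200.0]) for x in np.linspace(-50, 50, 8)]

# Larger det(FIM) => smaller CRLB volume => better expected accuracy.
print(np.linalg.det(range_fim(circle, target)))    # ~16.0
print(np.linalg.det(range_fim(transect, target)))  # much smaller
```

Maximizing a criterion like det(FIM) over future waypoints is the kind of analytical path design the RL policy is benchmarked against.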
Related papers
- Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making [6.243971093896272]
We use a continuous proximal policy optimization-based DRL algorithm as the baseline model and add a multi-head attention framework in an open-source AV simulation environment.
We show that the weights in the first head encode the positions of the neighboring vehicles while the second head focuses on the leader vehicle exclusively.
arXiv Detail & Related papers (2024-03-18T02:59:13Z) - Data-efficient Deep Reinforcement Learning for Vehicle Trajectory
Control [6.144517901919656]
Reinforcement learning (RL) promises to achieve control performance superior to classical approaches.
Standard RL approaches like soft-actor critic (SAC) require extensive amounts of training data to be collected.
We apply recently developed data-efficient deep RL methods to vehicle trajectory control.
arXiv Detail & Related papers (2023-11-30T09:38:59Z) - Learning to Terminate in Object Navigation [16.164536630623644]
This paper tackles the critical challenge of object navigation in autonomous navigation systems.
We propose a novel approach, namely the Depth-Inference Termination Agent (DITA)
We train our judge model in parallel with reinforcement learning and supervise the former efficiently via the reward signal.
arXiv Detail & Related papers (2023-09-28T04:32:08Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal
Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation [72.24964965882783]
Trajectory length plays a pivotal role in the training process of reinforcement learning algorithms.
We introduce Ada-NAV, a novel adaptive trajectory length scheme to enhance the training sample efficiency of RL algorithms.
We demonstrate through simulated and real-world robotic experiments that Ada-NAV outperforms conventional methods.
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - Simplifying Model-based RL: Learning Representations, Latent-space
Models, and Policies with One Objective [142.36200080384145]
We propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z) - Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider value-based and policy-gradient Deep Reinforcement Learning (DRL) approaches.
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z) - Adversarial Intrinsic Motivation for Reinforcement Learning [60.322878138199364]
We investigate whether the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution can be utilized effectively for reinforcement learning tasks.
Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function.
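The idea behind AIM can be sketched in a simplified discrete setting. For a point-mass goal distribution on a line of states with the metric |s - s'|, the Wasserstein-1 distance reduces to the expected distance to the goal, and f(s) = -|s - goal| is an optimal 1-Lipschitz potential for the Kantorovich dual; the potential difference then serves as a supplemental reward. This is a toy sketch under those assumptions, not the paper's learned-network estimator:

```python
import numpy as np

def w1_to_goal(visitation, goal):
    """Wasserstein-1 distance between a state-visitation distribution
    over states 0..N-1 and a point mass at `goal`, with ground metric
    |s - s'|. For a point-mass target this is the expected distance."""
    states = np.arange(len(visitation))
    return float(np.sum(visitation * np.abs(states - goal)))

def aim_style_reward(s, s_next, goal):
    """Supplemental reward from the dual potential f(s) = -|s - goal|
    (1-Lipschitz): transitions that move toward the goal are rewarded,
    which drives the visitation distribution toward the target one."""
    f = lambda x: -abs(x - goal)
    return f(s_next) - f(s)

visitation = np.array([0.4, 0.3, 0.2, 0.1, 0.0])  # policy lingers near state 0
print(w1_to_goal(visitation, goal=4))   # 0.4*4 + 0.3*3 + 0.2*2 + 0.1*1 = 3.0
print(aim_style_reward(1, 2, goal=4))   # step toward the goal: +1
```

In AIM proper the potential is a neural network trained on the dual objective with a Lipschitz constraint, rather than this closed-form toy potential.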
arXiv Detail & Related papers (2021-05-27T17:51:34Z) - Reinforcement Learning for Low-Thrust Trajectory Design of
Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z) - Reinforcement Learning for UAV Autonomous Navigation, Mapping and Target
Detection [36.79380276028116]
We study a joint detection, mapping and navigation problem for a single unmanned aerial vehicle (UAV) equipped with a low complexity radar and flying in an unknown environment.
The goal is to optimize its trajectory so as to maximize the mapping accuracy while avoiding areas where measurements might not be sufficiently informative for target detection.
arXiv Detail & Related papers (2020-05-05T20:39:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.