Toward 6-DOF Autonomous Underwater Vehicle Energy-Aware Position Control based on Deep Reinforcement Learning: Preliminary Results
- URL: http://arxiv.org/abs/2502.17742v1
- Date: Tue, 25 Feb 2025 00:37:57 GMT
- Title: Toward 6-DOF Autonomous Underwater Vehicle Energy-Aware Position Control based on Deep Reinforcement Learning: Preliminary Results
- Authors: Gustavo Boré, Vicente Sufán, Sebastián Rodríguez-Martínez, Giancarlo Troni
- Abstract summary: This paper proposes a novel DRL-based approach for controlling holonomic 6-DOF AUVs using the Truncated Quantile Critics (TQC) algorithm. It does not require manual tuning and directly feeds commands to the thrusters without prior knowledge of their configuration.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of autonomous underwater vehicles (AUVs) for surveying, mapping, and inspecting unexplored underwater areas plays a crucial role, where maneuverability and power efficiency are key factors for extending the use of these platforms, making six degrees of freedom (6-DOF) holonomic platforms essential tools. Although Proportional-Integral-Derivative (PID) and Model Predictive Control (MPC) controllers are widely used in these applications, they often require accurate system knowledge, struggle with repeatability when facing payload or configuration changes, and can be time-consuming to fine-tune. While more advanced methods based on Deep Reinforcement Learning (DRL) have been proposed, they are typically limited to operating in fewer degrees of freedom. This paper proposes a novel DRL-based approach for controlling holonomic 6-DOF AUVs using the Truncated Quantile Critics (TQC) algorithm, which does not require manual tuning and directly feeds commands to the thrusters without prior knowledge of their configuration. Furthermore, it incorporates power consumption directly into the reward function. Simulation results show that the TQC High-Performance method achieves better performance than a fine-tuned PID controller when reaching a goal point, while the TQC Energy-Aware method demonstrates slightly lower performance but consumes 30% less power on average.
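As a rough illustration of how power draw can enter the reward, here is a minimal sketch; the quadratic thruster-power model and all weights are assumptions, not the paper's exact formulation:

```python
import numpy as np

def energy_aware_reward(pos_error, att_error, thruster_cmds,
                        w_pos=1.0, w_att=0.5, w_power=0.1):
    """Hypothetical energy-aware reward: penalize pose error and power.

    pos_error: 3-vector distance to the goal point [m]
    att_error: 3-vector attitude error (e.g., Euler angles) [rad]
    thruster_cmds: per-thruster commands in [-1, 1]

    Power is approximated as quadratic in the command, a common
    simplification; the paper's actual power model may differ.
    """
    pose_cost = w_pos * np.linalg.norm(pos_error) + w_att * np.linalg.norm(att_error)
    power_cost = w_power * np.sum(np.square(thruster_cmds))
    return -(pose_cost + power_cost)
```

Shrinking w_power toward zero would recover a pure High-Performance objective, while growing it trades tracking accuracy for energy savings, mirroring the two reported variants.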
Related papers
- ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning [50.53705050673944]
We propose ULTHO, an ultra-lightweight yet powerful framework for fast HPO in deep RL within single runs.
Specifically, we formulate the HPO process as a multi-armed bandit with clustered arms (MABC) and link it directly to long-term return optimization.
We test ULTHO on benchmarks including ALE, Procgen, MiniGrid, and PyBullet.
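A toy sketch of the clustered-arm bandit idea behind MABC; the UCB scoring and bookkeeping here are illustrative assumptions, not ULTHO's actual statistics:

```python
import math
from collections import defaultdict

class ClusteredBandit:
    """Toy multi-armed bandit with clustered arms (MABC-style sketch).

    Each cluster groups hyperparameter candidates; a UCB score picks the
    best (cluster, arm) pair based on observed long-term returns.
    """
    def __init__(self, clusters):
        self.clusters = clusters          # {cluster_name: [arm, ...]}
        self.counts = defaultdict(int)    # pulls per (cluster, arm)
        self.values = defaultdict(float)  # mean return per (cluster, arm)
        self.total = 0

    def select(self):
        self.total += 1
        def ucb(key):
            n = self.counts[key]
            if n == 0:
                return float("inf")       # try unseen arms first
            return self.values[key] + math.sqrt(2 * math.log(self.total) / n)
        best = None
        for cluster, arms in self.clusters.items():
            for arm in arms:
                if best is None or ucb((cluster, arm)) > ucb(best):
                    best = (cluster, arm)
        return best

    def update(self, key, episode_return):
        self.counts[key] += 1
        n = self.counts[key]
        self.values[key] += (episode_return - self.values[key]) / n
```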
arXiv Detail & Related papers (2025-03-08T07:03:43Z) - RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms [9.517327026260181]
We propose RLPP, a residual RL framework that enhances a Pure Pursuit controller with an RL-based residual. RLPP improves lap times of the baseline controllers by up to 6.37%, closing the gap to state-of-the-art methods by more than 52%. RLPP is made available as an open-source tool, encouraging further exploration and advancement in autonomous racing research.
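The residual pattern itself is compact; a minimal sketch, assuming a standard pure-pursuit geometry and a hypothetical residual_policy (not RLPP's implementation):

```python
import numpy as np

def pure_pursuit_steer(lookahead_point, wheelbase=0.33):
    """Classic pure pursuit: steer toward a lookahead point (x, y) given
    in the vehicle frame; wheelbase suits a typical 1:10 scaled car."""
    x, y = lookahead_point
    ld_sq = x**2 + y**2                       # squared lookahead distance
    return np.arctan2(2.0 * wheelbase * y, ld_sq)

def rlpp_action(lookahead_point, residual_policy, obs, max_residual=0.1):
    """Residual RL: the learned policy adds a small, clipped correction
    on top of the baseline controller's command."""
    base = pure_pursuit_steer(lookahead_point)
    residual = np.clip(residual_policy(obs), -max_residual, max_residual)
    return base + residual
```

Clipping the residual keeps the baseline's zero-shot safety: even an untrained policy can only perturb the command within a small band.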
arXiv Detail & Related papers (2025-01-28T21:48:18Z) - DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning [61.10299147201369]
This paper introduces a novel autonomous RL approach, called DigiRL, for training in-the-wild device control agents.
We build a scalable and parallelizable Android learning environment equipped with a VLM-based evaluator.
We demonstrate the effectiveness of DigiRL using the Android-in-the-Wild dataset, where our 1.3B VLM trained with RL achieves a 49.5% absolute improvement.
arXiv Detail & Related papers (2024-06-14T17:49:55Z) - A comparison of RL-based and PID controllers for 6-DOF swimming robots: hybrid underwater object tracking [8.362739554991073]
We present an exploration and assessment of employing a centralized deep Q-network (DQN) controller as a substitute for PID controllers.
Our primary focus centers on illustrating this transition with the specific case of underwater object tracking.
Our experiments, conducted within a Unity-based simulator, validate the effectiveness of a centralized RL agent over separated PID controllers.
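For context, the single-axis PID being replaced is only a few lines, while the centralized DQN maps the full state to actions for all axes at once; a minimal PID sketch with placeholder gains:

```python
class PID:
    """Single-axis PID; the cited paper replaces several of these with
    one centralized DQN mapping the full state to discrete actions."""
    def __init__(self, kp=1.0, ki=0.0, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv
```

The centralized agent sees cross-axis couplings that independent per-axis PID loops cannot, which is the intuition behind the reported gains in tracking.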
arXiv Detail & Related papers (2024-01-29T23:14:15Z) - Modelling, Positioning, and Deep Reinforcement Learning Path Tracking Control of Scaled Robotic Vehicles: Design and Experimental Validation [3.807917169053206]
Scaled robotic cars are commonly equipped with a hierarchical control architecture that includes tasks dedicated to vehicle state estimation and control.
This paper covers both aspects by proposing (i) a federated extended Kalman filter (FEKF) and (ii) a novel deep reinforcement learning (DRL) path tracking controller trained via an expert demonstrator.
The experimentally validated model is used for (i) supporting the design of the FEKF and (ii) serving as a digital twin for training the proposed DRL-based path tracking algorithm.
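The fusion step at the heart of a federated Kalman filter can be sketched with information-form averaging; this is a generic illustration, not the paper's FEKF design:

```python
import numpy as np

def fuse_local_estimates(estimates, covariances):
    """Information-filter fusion of local filter outputs, the core step
    of a federated Kalman filter (master-filter details omitted).

    estimates: list of state vectors x_i
    covariances: list of covariance matrices P_i
    Fused: P = (sum_i P_i^-1)^-1,  x = P @ sum_i (P_i^-1 @ x_i)
    """
    infos = [np.linalg.inv(P) for P in covariances]
    P_fused = np.linalg.inv(sum(infos))
    x_fused = P_fused @ sum(I @ x for I, x in zip(infos, estimates))
    return x_fused, P_fused
```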
arXiv Detail & Related papers (2024-01-10T14:40:53Z) - Sim-to-Real Transfer of Adaptive Control Parameters for AUV Stabilization under Current Disturbance [1.099532646524593]
This paper presents a novel approach, merging the Maximum Entropy Deep Reinforcement Learning framework with a classic model-based control architecture, to formulate an adaptive controller.
Within this framework, we introduce a Sim-to-Real transfer strategy comprising the following components: a bio-inspired experience replay mechanism, an enhanced domain randomisation technique, and an evaluation protocol executed on a physical platform.
Our experimental assessments demonstrate that this method effectively learns proficient policies from suboptimal simulated models of the AUV, resulting in control performance 3 times higher when transferred to a real-world vehicle.
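Domain randomization of the kind an enhanced scheme builds on can be sketched as per-episode perturbation of the simulator; the parameter ranges and the sim interface below are assumptions:

```python
import random

def randomize_auv_dynamics(sim):
    """Per-episode domain randomization (ranges are illustrative): the
    policy trains against a family of dynamics rather than one imperfect
    model, which is what makes transfer to the real vehicle plausible.
    `sim` is a hypothetical simulator object with nominal parameters."""
    sim.mass = sim.nominal_mass * random.uniform(0.9, 1.1)
    sim.linear_drag = sim.nominal_drag * random.uniform(0.7, 1.3)
    sim.current_velocity = [random.uniform(-0.3, 0.3) for _ in range(3)]
```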
arXiv Detail & Related papers (2023-10-17T08:46:56Z) - CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
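One plausible reading of confidence-controlled exploration is a rollout-length schedule driven by policy entropy; the mapping and constants below are assumptions, not CCE's exact rule:

```python
import numpy as np

def trajectory_length(policy_entropy, min_len=32, max_len=512, h_ref=1.0):
    """Illustrative confidence-controlled schedule: roll out longer while
    the policy is uncertain (high entropy), shorter once it is confident.
    h_ref is an assumed reference entropy for normalization."""
    confidence = np.clip(1.0 - policy_entropy / h_ref, 0.0, 1.0)
    return int(min_len + (1.0 - confidence) * (max_len - min_len))
```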
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - Real-Time Model-Free Deep Reinforcement Learning for Force Control of a Series Elastic Actuator [56.11574814802912]
State-of-the-art robotic applications utilize series elastic actuators (SEAs) with closed-loop force control to achieve complex tasks such as walking, lifting, and manipulation.
Model-free PID control methods are more prone to instability due to nonlinearities in the SEA.
Deep reinforcement learning has proved to be an effective model-free method for continuous control tasks.
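What makes model-free force control of an SEA tractable is that output force is observable through spring deflection; a minimal sketch, with the spring constant and policy interface assumed:

```python
def sea_force(theta_motor, theta_load, k_spring=350.0):
    """Series elastic actuator: output force (torque) is proportional to
    the deflection of the elastic element between motor and load."""
    return k_spring * (theta_motor - theta_load)

def rl_force_control(policy, f_desired, theta_motor, theta_load):
    """Model-free loop: a learned policy maps the force error (and the
    measurement) directly to a motor command, with no explicit SEA model
    and hence no PID gains to destabilize under nonlinearities."""
    f_measured = sea_force(theta_motor, theta_load)
    return policy([f_desired - f_measured, f_measured])
```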
arXiv Detail & Related papers (2023-04-11T00:51:47Z) - Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning [6.5158195776494]
We tackle the controller tuning problem using a novel derivative-free reinforcement learning framework.
We conduct numerical experiments on two concrete examples from autonomous driving, namely, adaptive cruise control with PID controller and trajectory tracking with MPC controller.
Experimental results show that the proposed method outperforms popular baselines and highlight its strong potential for controller tuning.
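To illustrate the black-box setting (though the paper uses an RL formulation rather than random search), a minimal derivative-free tuning sketch over PID gains:

```python
import random

def random_search_tune(evaluate, n_iters=100):
    """Derivative-free tuning sketch: sample gains, keep the best by a
    scalar performance metric from `evaluate` (e.g., negative tracking
    error of a closed-loop rollout). The gain range is a placeholder."""
    best_gains, best_score = None, float("-inf")
    for _ in range(n_iters):
        gains = {k: random.uniform(0.0, 10.0) for k in ("kp", "ki", "kd")}
        score = evaluate(gains)           # one closed-loop experiment
        if score > best_score:
            best_gains, best_score = gains, score
    return best_gains
```

No gradients of the closed-loop performance are needed, which is the point: the plant can be a simulator, an MPC stack, or the real vehicle.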
arXiv Detail & Related papers (2022-09-11T13:01:14Z) - Unified Automatic Control of Vehicular Systems with Reinforcement Learning [64.63619662693068]
This article contributes a streamlined methodology for vehicular microsimulation.
It discovers high performance control strategies with minimal manual design.
The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering.
arXiv Detail & Related papers (2022-07-30T16:23:45Z) - Gradient Statistics Aware Power Control for Over-the-Air Federated Learning [59.40860710441232]
Federated learning (FL) is a promising technique that enables many edge devices to train a machine learning model collaboratively in wireless networks.
This paper studies the power control problem for over-the-air FL by taking gradient statistics into account.
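A naive sketch of gradient-statistics-aware pre-scaling for over-the-air aggregation; the paper derives an optimal power-control policy analytically, so treat this normalization as illustrative only:

```python
import numpy as np

def transmit_scaling(local_grad, grad_mean, grad_var, p_max=1.0):
    """Illustrative over-the-air FL pre-scaling: normalize the local
    gradient by its first- and second-order statistics, then scale so
    the transmitted signal respects the device power budget p_max."""
    normalized = (local_grad - grad_mean) / np.sqrt(grad_var + 1e-8)
    scale = np.sqrt(p_max) / max(1.0, float(np.linalg.norm(normalized)))
    return scale * normalized
```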
arXiv Detail & Related papers (2020-03-04T14:06:51Z)