Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions
- URL: http://arxiv.org/abs/2503.00527v1
- Date: Sat, 01 Mar 2025 15:01:50 GMT
- Title: Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions
- Authors: Guanwen Xie, Jingzehua Xu, Yimian Ding, Zhi Zhang, Shuai Zhang, Yi Li,
- Abstract summary: Large language model (LLM)-enhanced reinforcement learning (RL)-based adaptive S-surface controller for AUVs.<n>Using multi-modal and structured explicit task feedback, LLMs enable joint adjustments, balance multiple objectives, and enhance task-oriented performance and adaptability.<n>In the proposed controller, the RL policy focuses on upper-level tasks, outputting task-oriented high-level commands that the S-surface controller then converts into control signals.
- Score: 9.713618537140587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adaptivity and maneuvering capabilities of Autonomous Underwater Vehicles (AUVs) have drawn significant attention in oceanic research, due to the unpredictable disturbances and strong coupling among the AUV's degrees of freedom. In this paper, we developed large language model (LLM)-enhanced reinforcement learning (RL)-based adaptive S-surface controller for AUVs. Specifically, LLMs are introduced for the joint optimization of controller parameters and reward functions in RL training. Using multi-modal and structured explicit task feedback, LLMs enable joint adjustments, balance multiple objectives, and enhance task-oriented performance and adaptability. In the proposed controller, the RL policy focuses on upper-level tasks, outputting task-oriented high-level commands that the S-surface controller then converts into control signals, ensuring cancellation of nonlinear effects and unpredictable external disturbances in extreme sea conditions. Under extreme sea conditions involving complex terrain, waves, and currents, the proposed controller demonstrates superior performance and adaptability in high-level tasks such as underwater target tracking and data collection, outperforming traditional PID and SMC controllers.
Related papers
- Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach [20.36806314683902]
We study an integrated sensing and communications (ISAC) system for low-altitude economy (LAE)<n>The expected communication sum-rate over a given flight period is maximized by jointly optimizing the beamforming at the GBS and UAVs' trajectories.<n>We propose a novel LAE-oriented ISAC scheme, referred to as Deep LAE-ISAC (DeepLSC), by leveraging the deep reinforcement learning (DRL) technique.
arXiv Detail & Related papers (2024-12-05T11:12:46Z) - Function Approximation for Reinforcement Learning Controller for Energy from Spread Waves [69.9104427437916]
Multi-generator Wave Energy Converters (WEC) must handle multiple simultaneous waves coming from different directions called spread waves.
These complex devices need controllers with multiple objectives of energy capture efficiency, reduction of structural stress to limit maintenance, and proactive protection against high waves.
In this paper, we explore different function approximations for the policy and critic networks in modeling the sequential nature of the system dynamics.
arXiv Detail & Related papers (2024-04-17T02:04:10Z) - Sim-to-Real Transfer of Adaptive Control Parameters for AUV
Stabilization under Current Disturbance [1.099532646524593]
This paper presents a novel approach, merging the Maximum Entropy Deep Reinforcement Learning framework with a classic model-based control architecture, to formulate an adaptive controller.
Within this framework, we introduce a Sim-to-Real transfer strategy comprising the following components: a bio-inspired experience replay mechanism, an enhanced domain randomisation technique, and an evaluation protocol executed on a physical platform.
Our experimental assessments demonstrate that this method effectively learns proficient policies from suboptimal simulated models of the AUV, resulting in control performance 3 times higher when transferred to a real-world vehicle.
arXiv Detail & Related papers (2023-10-17T08:46:56Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in
Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs)
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Distributed Neurodynamics-Based Backstepping Optimal Control for Robust
Constrained Consensus of Underactuated Underwater Vehicles Fleet [16.17376845767656]
This paper develops a novel consensus based optimal coordination protocol and a robust controller.
The optimal formation tracking of UUVs fleet can be achieved, and the constraints are fulfilled.
The stability of the overall UUVs formation system is established to ensure that all the states of the UUVs are uniformly ultimately bounded in the presence of unknown disturbances.
arXiv Detail & Related papers (2023-08-18T06:04:12Z) - Enhancing AUV Autonomy With Model Predictive Path Integral Control [9.800697959791544]
We investigate the feasibility of Model Predictive Path Integral Control (MPPI) for the control of an AUV.
We utilise a non-linear model of the AUV to propagate the samples of the MPPI, which allow us to compute the control action in real time.
arXiv Detail & Related papers (2023-08-10T12:55:57Z) - Direct Preference Optimization: Your Language Model is Secretly a Reward Model [119.65409513119963]
We introduce a new parameterization of the reward model in RLHF that enables extraction of the corresponding optimal policy in closed form.
The resulting algorithm, which we call Direct Preference Optimization (DPO), is stable, performant, and computationally lightweight.
Our experiments show that DPO can fine-tune LMs to align with human preferences as well as or better than existing methods.
arXiv Detail & Related papers (2023-05-29T17:57:46Z) - Real-Time Model-Free Deep Reinforcement Learning for Force Control of a
Series Elastic Actuator [56.11574814802912]
State-of-the art robotic applications utilize series elastic actuators (SEAs) with closed-loop force control to achieve complex tasks such as walking, lifting, and manipulation.
Model-free PID control methods are more prone to instability due to nonlinearities in the SEA.
Deep reinforcement learning has proved to be an effective model-free method for continuous control tasks.
arXiv Detail & Related papers (2023-04-11T00:51:47Z) - Skip Training for Multi-Agent Reinforcement Learning Controller for
Industrial Wave Energy Converters [94.84709449845352]
Recent Wave Energy Converters (WEC) are equipped with multiple legs and generators to maximize energy generation.
Traditional controllers have shown limitations to capture complex wave patterns and the controllers must efficiently maximize the energy capture.
This paper introduces a Multi-Agent Reinforcement Learning controller (MARL), which outperforms the traditionally used spring damper controller.
arXiv Detail & Related papers (2022-09-13T00:20:31Z) - Data-driven controllers and the need for perception systems in
underwater manipulation [4.060731229044571]
The modeling of UVMSs is a complicated and costly process due to the highly nonlinear dynamics.
This is aggravated in tasks where the manipulation of objects is necessary.
We introduce a control strategy for UVMSs working with unknown payloads.
arXiv Detail & Related papers (2021-09-21T17:25:10Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.