Continuous Control with Deep Reinforcement Learning for Autonomous
Vessels
- URL: http://arxiv.org/abs/2106.14130v1
- Date: Sun, 27 Jun 2021 03:12:32 GMT
- Title: Continuous Control with Deep Reinforcement Learning for Autonomous
Vessels
- Authors: Nader Zare and Bruno Brandoli and Mahtab Sarvmaili and Amilcar Soares
and Stan Matwin
- Abstract summary: We present a new strategy called state-action rotation to improve an agent's performance in unseen situations.
Experimental results show that the state-action rotation on top of the CVN consistently improves the rate of arrival to a destination.
- Score: 8.491129580099757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Maritime autonomous transportation has played a crucial role in the
globalization of the world economy. Deep Reinforcement Learning (DRL) has been
applied to automatic path planning to simulate vessel collision avoidance
situations in open seas. End-to-end approaches that learn complex mappings
directly from the input generalize poorly when reaching targets in different
environments. In this work, we present a new strategy called state-action
rotation to improve an agent's performance in unseen situations by rotating
the obtained experience (state-action-state) and preserving the rotated
copies in the replay buffer. We designed our model based on Deep
Deterministic Policy Gradient, a local-view maker, and a planner. Our agent
uses two deep Convolutional Neural Networks to estimate the policy and
action-value functions. The proposed model was exhaustively trained and
tested in maritime scenarios with real maps from cities such as Montreal and
Halifax. Experimental results show that state-action rotation on top of the
CVN consistently improves the rate of arrival to a destination (RATD) by up
to 11.96% with respect to the Vessel Navigator with Planner and Local View
(VNPLV), and that it achieves superior performance in unseen maps, with
gains of up to 30.82%. Our approach exhibits advantages in terms of
robustness when tested in a new environment, supporting the idea that
generalization can be achieved through state-action rotation.
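To make the state-action rotation strategy concrete, the sketch below shows one way the replay-buffer augmentation could be implemented. It is a minimal illustration rather than the authors' code: it assumes the state is a 2-D local-view grid, the action is a continuous 2-D direction vector, and the reward and termination flag are rotation-invariant; none of these details are fixed by the abstract.

```python
import numpy as np
from collections import deque


def rotate_transition(state, action, next_state, k):
    """Rotate one (state, action, next_state) triple by k * 90 degrees.

    Illustrative assumptions: states are 2-D local-view grids and the
    action is a continuous 2-D direction vector (dx, dy).
    """
    # np.rot90 rotates counter-clockwise; copy so buffer entries do not
    # alias the caller's arrays.
    rot_state = np.rot90(state, k).copy()
    rot_next_state = np.rot90(next_state, k).copy()

    # Rotate the action vector by the same angle.
    theta = k * np.pi / 2.0
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    rot_action = rot @ np.asarray(action, dtype=float)
    return rot_state, rot_action, rot_next_state


class RotationReplayBuffer:
    """Replay buffer that stores each experience plus its three rotations."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Reward and done flag are treated as rotation-invariant here.
        for k in range(4):  # 0, 90, 180, 270 degrees
            s, a, s2 = rotate_transition(state, action, next_state, k)
            self.buffer.append((s, a, reward, s2, done))

    def sample(self, batch_size, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        idx = rng.choice(len(self.buffer), size=batch_size, replace=False)
        return [self.buffer[i] for i in idx]
```

Under these assumptions, every experienced transition contributes four buffer entries, which is what lets a DDPG agent see each local-view geometry in all four orientations without extra interaction with the environment.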
Related papers
- Hierarchical Reinforcement Learning for Safe Mapless Navigation with Congestion Estimation [7.339743259039457]
This paper introduces a safe mapless navigation framework utilizing hierarchical reinforcement learning (HRL) to enhance navigation through congested areas.
The findings demonstrate that our HRL-based navigation framework excels in both static and dynamic scenarios.
We implement the HRL-based navigation framework on a TurtleBot3 robot for physical validation experiments.
arXiv Detail & Related papers (2025-03-15T08:03:50Z)
- FLP-XR: Future Location Prediction on Extreme Scale Maritime Data in Real-time [0.8937169040399775]
This paper introduces FLP-XR, a model that leverages maritime mobility data to construct a robust framework that offers precise predictions.
We demonstrate the efficiency of our approach through an extensive experimental study using three real-world AIS datasets.
arXiv Detail & Related papers (2025-03-10T13:31:42Z)
- Vision-Based Deep Reinforcement Learning of UAV Autonomous Navigation Using Privileged Information [6.371251946803415]
DPRL is an end-to-end policy designed to address the challenge of high-speed autonomous UAV navigation under partially observable environmental conditions.
We leverage an asymmetric Actor-Critic architecture to provide the agent with privileged information during training.
We conduct extensive simulations across various scenarios, benchmarking our DPRL algorithm against the state-of-the-art navigation algorithms.
arXiv Detail & Related papers (2024-12-09T09:05:52Z)
- Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping [2.9109581496560044]
This paper examines the robustness of benchmark deep reinforcement learning (RL) algorithms, implemented for inland waterway transport (IWT) within an autonomous shipping simulator.
We show that a model-free approach can achieve an adequate policy in the simulator, successfully navigating port environments never encountered during training.
arXiv Detail & Related papers (2024-11-07T17:55:07Z)
- Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
Motion planners (MPs) are crucial for safe navigation in complex urban environments.
nuPlan, a recently released MP benchmark, augments real-world driving logs with closed-loop simulation logic.
We present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
arXiv Detail & Related papers (2024-06-15T18:53:45Z)
- Sequential Modeling of Complex Marine Navigation: Case Study on a Passenger Vessel (Student Abstract) [5.253408036933116]
This paper uses a machine learning approach to explore ways to reduce vessel fuel consumption.
We leverage a real-world dataset spanning two years of operation of a ferry on the west coast of Canada.
Our focus centers on the creation of a time series forecasting model.
It serves as an evaluative tool to assess the proficiency of the ferry's operation under the captain's guidance.
arXiv Detail & Related papers (2024-03-20T18:29:55Z)
- CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
arXiv Detail & Related papers (2023-06-09T18:45:15Z)
- Robust Path Following on Rivers Using Bootstrapped Reinforcement Learning [0.0]
This paper develops a Deep Reinforcement Learning (DRL)-agent for navigation and control of autonomous surface vessels (ASV) on inland waterways.
A state-of-the-art bootstrapped Q-learning algorithm in combination with a versatile training environment generator leads to a robust and accurate rudder controller.
arXiv Detail & Related papers (2023-03-24T07:21:27Z)
- Vessel-following model for inland waterways based on deep reinforcement learning [0.0]
This study investigates the feasibility of RL-based vehicle-following under complex vehicle dynamics and strong environmental disturbances.
We developed an inland waterways vessel-following model based on realistic vessel dynamics.
Our model demonstrated safe and comfortable driving in all scenarios, proving excellent generalization abilities.
arXiv Detail & Related papers (2022-07-07T12:19:03Z)
- Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values.
Our method is demonstrated to achieve new state-of-the-art performance among search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
- Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics [0.7633618497843278]
We propose a deep reinforcement learning approach for solving a mapless navigation problem in warehouse scenarios.
The automatic guided vehicle is equipped with LiDAR and frontal RGB sensors and learns to reach underneath the target dolly.
NavACL-Q greatly facilitates the whole learning process, and a pre-trained feature extractor markedly boosts the training speed.
arXiv Detail & Related papers (2022-02-23T13:50:01Z)
- A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z)
- Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms.
Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z)
- Occupancy Anticipation for Efficient Exploration and Navigation [97.17517060585875]
We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions.
By exploiting context in both the egocentric views and top-down maps our model successfully anticipates a broader map of the environment.
Our approach is the winning entry in the 2020 Habitat PointNav Challenge.
arXiv Detail & Related papers (2020-08-21T03:16:51Z)
- Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling [65.99956848461915]
Vision-and-Language Navigation (VLN) is a task where agents must decide how to move through a 3D environment to reach a goal.
One of the problems of the VLN task is data scarcity since it is difficult to collect enough navigation paths with human-annotated instructions for interactive environments.
We propose an adversarial-driven counterfactual reasoning model that can consider effective conditions instead of low-quality augmented data.
arXiv Detail & Related papers (2019-11-17T18:02:51Z)