Deep Reinforcement Learning for Haptic Shared Control in Unknown Tasks
- URL: http://arxiv.org/abs/2101.06227v1
- Date: Fri, 15 Jan 2021 17:27:38 GMT
- Title: Deep Reinforcement Learning for Haptic Shared Control in Unknown Tasks
- Authors: Franklin Cardeñoso Fernandez and Wouter Caarls
- Abstract summary: Haptic shared control (HSC) is an alternative to direct teleoperation in teleoperated systems.
The application of virtual guiding forces decreases the user's control effort and improves execution time in various tasks.
The challenge lies in developing controllers to provide the optimal guiding forces for the tasks that are being performed.
This work addresses this challenge by designing a controller based on the deep deterministic policy gradient (DDPG) algorithm to provide the assistance, and a convolutional neural network (CNN) to perform the task detection.
- Score: 1.0635248457021496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have shown a growing interest in using haptic shared control
(HSC) in teleoperated systems. In HSC, the application of virtual guiding
forces decreases the user's control effort and improves execution time in
various tasks, making it a good alternative to direct
teleoperation. Despite its good performance, HSC raises a new question:
how to design the guiding forces. The challenge therefore lies in
developing controllers that provide the optimal guiding forces for the tasks
being performed. This work addresses this challenge with TAHSC (Task
Agnostic Haptic Shared Controller), which combines a controller based on the
deep deterministic policy gradient (DDPG) algorithm to provide the assistance
with a convolutional neural network (CNN) to perform the task detection. The
agent learns to minimize the time it takes the human to execute the desired
task, while simultaneously minimizing their resistance to the provided
feedback. This resistance tells the learning algorithm which direction the
human is trying to follow in the task at hand, in this case a
pick-and-place task. The results demonstrate the successful application of
the proposed approach, which learned a custom policy for each user who was
asked to test the system. It exhibits stable convergence and helps the user
complete the task in the least time possible.
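The learning objective described in the abstract (finish the task quickly while the human resists the guidance as little as possible) can be illustrated with a small per-step reward. This is only a sketch under stated assumptions, not the paper's reward: the function name `shaped_reward`, the two weights, and the choice to measure resistance as the part of the human force that opposes the guiding force are all hypothetical.

```python
import numpy as np


def shaped_reward(human_force, guiding_force, dt,
                  time_weight=1.0, resistance_weight=0.1):
    """Hypothetical per-step reward: penalize elapsed time and human resistance.

    All names and weights are illustrative assumptions, not values from the paper.
    """
    human_force = np.asarray(human_force, dtype=float)
    guiding_force = np.asarray(guiding_force, dtype=float)

    norm = np.linalg.norm(guiding_force)
    if norm == 0.0:
        # No guidance applied, so there is nothing for the human to resist.
        resistance = 0.0
    else:
        # Projection of the human force onto the guidance direction.
        along = float(np.dot(human_force, guiding_force)) / norm
        # Only force that opposes the guidance counts as resistance.
        resistance = max(0.0, -along)

    # Each step costs time; resisting the guidance costs extra.
    return -time_weight * dt - resistance_weight * resistance
```

With these assumptions, a step in which the human pushes along the guidance costs only the time penalty, while a step in which the human pushes against it is penalized further. An agent such as DDPG that maximizes the sum of these rewards is therefore steered toward guidance the human does not fight.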
Related papers
- Sample-Efficient Reinforcement Learning with Temporal Logic Objectives: Leveraging the Task Specification to Guide Exploration [13.053013407015628]
This paper addresses the problem of learning optimal control policies for systems with uncertain dynamics.
We propose an accelerated RL algorithm that can learn control policies significantly faster than competitive approaches.
arXiv Detail & Related papers (2024-10-16T00:53:41Z)
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution in combination with value decomposition yields simple critic-only algorithms that achieve surprisingly strong performance on continuous control tasks.
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- Scaling Learning based Policy Optimization for Temporal Logic Tasks by Controller Network Dropout [4.421486904657393]
We introduce a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear environment.
We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives.
We introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling.
arXiv Detail & Related papers (2024-03-23T12:53:51Z)
- Verified Compositional Neuro-Symbolic Control for Stochastic Systems with Temporal Logic Tasks [11.614036749291216]
Several methods have been proposed recently to learn neural network (NN) controllers for autonomous agents.
A key challenge within these approaches is that they often lack safety guarantees or the provided guarantees are impractical.
This paper aims to address this challenge by checking if there exists a temporal composition of the trained NN controllers.
arXiv Detail & Related papers (2023-11-17T20:51:24Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- CLUTR: Curriculum Learning via Unsupervised Task Representation Learning [130.79246770546413]
CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular UED method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
arXiv Detail & Related papers (2022-10-19T01:45:29Z)
- Computation Offloading and Resource Allocation in F-RANs: A Federated Deep Reinforcement Learning Approach [67.06539298956854]
The fog radio access network (F-RAN) is a promising technology in which user mobile devices (MDs) can offload computation tasks to nearby fog access points (F-APs).
arXiv Detail & Related papers (2022-06-13T02:19:20Z)
- Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies [64.2210390071609]
We present a novel Heavy-Tailed Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
We show consistent performance improvement across all tasks in terms of high average cumulative reward.
arXiv Detail & Related papers (2022-06-12T04:09:39Z)
- Accelerated Reinforcement Learning for Temporal Logic Control Objectives [10.216293366496688]
This paper addresses the problem of learning control policies for mobile robots modeled as unknown Markov Decision Processes (MDPs).
We propose a novel accelerated model-based reinforcement learning (RL) algorithm for control objectives that is capable of learning control policies significantly faster than related approaches.
arXiv Detail & Related papers (2022-05-09T17:09:51Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)