Accelerating the Convergence of Human-in-the-Loop Reinforcement Learning
with Counterfactual Explanations
- URL: http://arxiv.org/abs/2108.01358v1
- Date: Tue, 3 Aug 2021 08:27:28 GMT
- Title: Accelerating the Convergence of Human-in-the-Loop Reinforcement Learning
with Counterfactual Explanations
- Authors: Jakob Karalus, Felix Lindner
- Abstract summary: Human-in-the-loop Reinforcement Learning (HRL) addresses this issue by combining human feedback and reinforcement learning techniques.
We extend the existing TAMER Framework with the possibility to enhance human feedback with two different types of counterfactual explanations.
- Score: 1.8275108630751844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The capability to interactively learn from human feedback would enable robots
in new social settings. For example, novice users could train service robots in
new tasks naturally and interactively. Human-in-the-loop Reinforcement Learning
(HRL) addresses this issue by combining human feedback and reinforcement
learning (RL) techniques. State-of-the-art interactive learning techniques
suffer from slow convergence, thus leading to a frustrating experience for the
human. This work approaches this problem by extending the existing TAMER
Framework with the possibility to enhance human feedback with two different
types of counterfactual explanations. We demonstrate our extensions' success in
improving the convergence, especially in the crucial early phases of the
training.
Related papers
- Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework [13.949126295663328]
We bridge the gap between machine learning and human-computer interaction efforts by developing a shared understanding of human feedback in interactive learning scenarios.
We introduce a taxonomy of feedback types for reward-based learning from human feedback based on nine key dimensions.
We identify seven quality metrics of human feedback influencing both the human ability to express feedback and the agent's ability to learn from the feedback.
arXiv Detail & Related papers (2024-11-18T17:40:42Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z) - Human Decision Makings on Curriculum Reinforcement Learning with
Difficulty Adjustment [52.07473934146584]
We guide the curriculum reinforcement learning results towards a preferred performance level that is neither too hard nor too easy via learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
It shows reinforcement learning performance can successfully adjust in sync with the human desired difficulty level.
arXiv Detail & Related papers (2022-08-04T23:53:51Z) - Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z) - Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z) - Towards Interactive Reinforcement Learning with Intrinsic Feedback [1.7117805951258132]
Reinforcement learning (RL) and brain-computer interfaces (BCI) have experienced significant growth over the past decade.
With rising interest in human-in-the-loop (HITL), incorporating human input with RL algorithms has given rise to the sub-field of interactive RL.
We denote this new and emerging medium of feedback as intrinsic feedback.
arXiv Detail & Related papers (2021-12-02T19:29:26Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via
Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z) - Deep Reinforcement Learning with Interactive Feedback in a Human-Robot
Environment [1.2998475032187096]
We propose a deep reinforcement learning approach with interactive feedback to learn a domestic task in a human-robot scenario.
We compare three different learning methods using a simulated robotic arm for the task of organizing different objects.
The obtained results show that a learner agent, using either agent-IDeepRL or human-IDeepRL, completes the given task earlier and has fewer mistakes compared to the autonomous DeepRL approach.
arXiv Detail & Related papers (2020-07-07T11:55:27Z) - Accelerating Reinforcement Learning Agent with EEG-based Implicit Human
Feedback [10.138798960466222]
Reinforcement Learning (RL) agents with human feedback can dramatically improve various aspects of learning.
Previous methods require human observer to give inputs explicitly, burdening the human in the loop of RL agent's learning process.
We investigate capturing human's intrinsic reactions as implicit (and natural) feedback through EEG in the form of error-related potentials (ErrP)
arXiv Detail & Related papers (2020-06-30T03:13:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.