Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
- URL: http://arxiv.org/abs/2507.13171v1
- Date: Thu, 17 Jul 2025 14:35:12 GMT
- Title: Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
- Authors: Suzie Kim, Hye-Bin Shin, Seong-Whan Lee
- Abstract summary: We propose a novel reinforcement learning from implicit human feedback (RLIHF) framework that utilizes non-invasive electroencephalography (EEG) signals. We evaluate our approach in a simulation environment built on the MuJoCo physics engine, using a Kinova Gen2 robotic arm. Results show that agents trained with decoded EEG feedback achieve performance comparable to those trained with dense, manually designed rewards.
- Score: 26.585985828583304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional reinforcement learning (RL) approaches often struggle to learn effective policies under sparse reward conditions, necessitating the manual design of complex, task-specific reward functions. To address this limitation, reinforcement learning from human feedback (RLHF) has emerged as a promising strategy that complements hand-crafted rewards with human-derived evaluation signals. However, most existing RLHF methods depend on explicit feedback mechanisms such as button presses or preference labels, which disrupt the natural interaction process and impose a substantial cognitive load on the user. We propose a novel reinforcement learning from implicit human feedback (RLIHF) framework that utilizes non-invasive electroencephalography (EEG) signals, specifically error-related potentials (ErrPs), to provide continuous, implicit feedback without requiring explicit user intervention. The proposed method adopts a pre-trained decoder to transform raw EEG signals into probabilistic reward components, enabling effective policy learning even in the presence of sparse external rewards. We evaluate our approach in a simulation environment built on the MuJoCo physics engine, using a Kinova Gen2 robotic arm to perform a complex pick-and-place task that requires avoiding obstacles while manipulating target objects. The results show that agents trained with decoded EEG feedback achieve performance comparable to those trained with dense, manually designed rewards. These findings validate the potential of using implicit neural feedback for scalable and human-aligned reinforcement learning in interactive robotics.
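The abstract describes a pre-trained decoder that turns raw EEG into probabilistic reward components which supplement a sparse task reward. A minimal Python sketch of that reward-shaping step is below; the decoder interface, the scaling of the implicit term, and the weighting factor are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class ErrPDecoder:
    """Stand-in for a pre-trained EEG decoder that estimates
    P(error | EEG epoch), i.e. how likely the observer perceived
    the robot's last action as erroneous."""
    def predict_error_prob(self, eeg_epoch: np.ndarray) -> float:
        # A real decoder would be a trained classifier over ErrP features;
        # here we just return a placeholder probability.
        return float(np.clip(np.random.rand(), 0.0, 1.0))

def shaped_reward(sparse_reward: float, eeg_epoch: np.ndarray,
                  decoder: ErrPDecoder, weight: float = 0.5) -> float:
    """Combine the sparse task reward with an implicit EEG-derived term:
    high error probability lowers the reward, low probability raises it."""
    p_error = decoder.predict_error_prob(eeg_epoch)
    implicit_term = 1.0 - 2.0 * p_error   # maps p in [0, 1] to [1, -1]
    return sparse_reward + weight * implicit_term
```

In this form the agent still receives the environment's sparse reward; the decoded ErrP probability only adds a dense, human-derived component on top of it.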
Related papers
- Mapping Neural Signals to Agent Performance, A Step Towards Reinforcement Learning from Neural Feedback [2.9060647847644985]
We introduce NEURO-LOOP, an implicit feedback framework that utilizes the intrinsic human reward system to drive human-agent interaction. This work demonstrates the feasibility of a critical first step in the NEURO-LOOP framework: mapping brain signals to agent performance. We conclude that a relationship between fNIRS data and agent performance exists using classical machine learning techniques.
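The summarized first step, relating fNIRS signals to agent performance with classical machine learning, could look roughly like the sketch below. The feature dimensionality, labels, and classifier choice are hypothetical placeholders, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))       # per-trial fNIRS features (hypothetical)
y = rng.integers(0, 2, size=200)     # 1 = good agent performance, 0 = poor

# A classical classifier is enough to test whether the features carry
# information about agent performance.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```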
arXiv Detail & Related papers (2025-06-14T21:38:31Z)
- RILe: Reinforced Imitation Learning [60.63173816209543]
RILe (Reinforced Imitation Learning) is a framework that combines the strengths of imitation learning and inverse reinforcement learning to learn a dense reward function efficiently. Our framework produces high-performing policies in high-dimensional tasks where direct imitation fails to replicate complex behaviors.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
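A minimal sketch of the idea stated above, treating the expert's decision to intervene as the reward signal itself, is given below; the transition container and the -1/0 reward convention are illustrative assumptions rather than the paper's exact formulation.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Transition:
    obs: Any
    action: Any
    next_obs: Any
    reward: float
    done: bool

def label_with_intervention(obs, action, next_obs,
                            intervened: bool, done: bool) -> Transition:
    """Assign reward -1 when the human intervened and 0 otherwise, so an
    off-policy RL algorithm learns to avoid states that trigger interventions."""
    reward = -1.0 if intervened else 0.0
    return Transition(obs, action, next_obs, reward, done)
```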
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
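The buffer-reuse step mentioned above can be sketched as follows; the buffer structure and seeding routine are simplified assumptions meant only to illustrate bootstrapping a replay buffer with prior-task data.

```python
import random
from collections import deque

class ReplayBuffer:
    """Simplified FIFO replay buffer."""
    def __init__(self, capacity: int = 100_000):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size: int):
        return random.sample(list(self.storage),
                             min(batch_size, len(self.storage)))

def bootstrap_buffer(buffer: ReplayBuffer, prior_transitions):
    """Seed the buffer with transitions reused from previous tasks so
    online training starts from non-empty experience."""
    for t in prior_transitions:
        buffer.add(t)
```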
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems [79.07468367923619]
We propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC).
We design an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards.
We show that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
arXiv Detail & Related papers (2022-09-19T16:49:32Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
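Human-in-the-loop methods of this kind typically learn a reward model from pairwise preferences over behavior segments with a Bradley-Terry style loss; the sketch below illustrates that component, with the network size and segment format as illustrative assumptions rather than this paper's exact setup.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Predicts a scalar reward for each (observation, action) pair."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def preference_loss(model, seg_a, seg_b, prefer_b: torch.Tensor):
    """seg_* = (obs, act) tensors of shape (batch, T, dim) for two segments;
    prefer_b is 1.0 where the human preferred segment B, else 0.0."""
    return_a = model(*seg_a).sum(dim=-1)   # summed predicted reward per segment
    return_b = model(*seg_b).sum(dim=-1)
    logits = return_b - return_a           # larger => B judged better
    return nn.functional.binary_cross_entropy_with_logits(logits, prefer_b)
```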
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Accelerating Reinforcement Learning Agent with EEG-based Implicit Human Feedback [10.138798960466222]
Reinforcement Learning (RL) agents with human feedback can dramatically improve various aspects of learning.
Previous methods require a human observer to give inputs explicitly, burdening the human in the loop of the RL agent's learning process.
We investigate capturing the human's intrinsic reactions as implicit (and natural) feedback through EEG in the form of error-related potentials (ErrP).
arXiv Detail & Related papers (2020-06-30T03:13:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.