Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2309.03839v1
- Date: Thu, 7 Sep 2023 16:52:27 GMT
- Title: Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning
- Authors: Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine
- Abstract summary: Adaptive interfaces can help users perform sequential decision-making tasks.
Recent advances in human-in-the-loop machine learning enable such systems to improve by interacting with users.
We propose a reinforcement learning algorithm to train an interface to map raw command signals to actions.
- Score: 82.91837418721182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptive interfaces can help users perform sequential decision-making tasks
like robotic teleoperation given noisy, high-dimensional command signals (e.g.,
from a brain-computer interface). Recent advances in human-in-the-loop machine
learning enable such systems to improve by interacting with users, but tend to
be limited by the amount of data that they can collect from individual users in
practice. In this paper, we propose a reinforcement learning algorithm to
address this by training an interface to map raw command signals to actions
using a combination of offline pre-training and online fine-tuning. To address
the challenges posed by noisy command signals and sparse rewards, we develop a
novel method for representing and inferring the user's long-term intent for a
given trajectory. We primarily evaluate our method's ability to assist users
who can only communicate through noisy, high-dimensional input channels through
a user study in which 12 participants performed a simulated navigation task by
using their eye gaze to modulate a 128-dimensional command signal from their
webcam. The results show that our method enables successful goal navigation
more often than a baseline directional interface, by learning to denoise user
command signals and provide shared autonomy assistance. We further evaluate on
a simulated Sawyer pushing task with eye gaze control, and the Lunar Lander
game with simulated user commands, and find that our method improves over
baseline interfaces in these domains as well. Extensive ablation experiments
with simulated user commands empirically motivate each component of our method.
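The two-stage pipeline described in the abstract (offline pre-training on logged command data, then online fine-tuning from sparse rewards) can be sketched as follows. This is an illustrative toy, not the authors' implementation: the dimensions are shrunk from the paper's 128-dimensional signal, the paper's intent-inference method is replaced by synthetic relabeled actions, and the interface is a simple linear softmax policy; all names and hyperparameters here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for the paper's 128-dimensional webcam command signal.
N_DIM, N_ACTIONS = 16, 4

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pretrain(signals, actions, lr=0.5, epochs=300):
    """Offline phase: supervised learning on logged (signal, action) pairs,
    where actions stand in for labels produced by intent inference."""
    W = np.zeros((N_DIM, N_ACTIONS))
    for _ in range(epochs):
        probs = softmax(signals @ W)
        onehot = np.eye(N_ACTIONS)[actions]
        # Full-batch cross-entropy gradient step.
        W += lr * signals.T @ (onehot - probs) / len(signals)
    return W

def finetune_step(W, signal, reward, lr=0.1):
    """Online phase: a REINFORCE-style update from a sparse scalar reward."""
    probs = softmax(signal @ W)
    a = rng.choice(N_ACTIONS, p=probs)
    grad = np.outer(signal, np.eye(N_ACTIONS)[a] - probs)
    W += lr * reward * grad  # no-op when reward is 0 (sparse feedback)
    return W, a

# Synthetic offline dataset: noisy signals with linearly decodable intent.
true_W = rng.normal(size=(N_DIM, N_ACTIONS))
X = rng.normal(size=(500, N_DIM))
y = (X @ true_W).argmax(axis=1)  # relabeled "intended" actions
W = pretrain(X, y)
train_acc = ((X @ W).argmax(axis=1) == y).mean()
```

After pre-training, `finetune_step` would be called once per online interaction; the key design point, mirrored from the abstract, is that the same policy parameters are first fit offline and then refined from sparse online rewards rather than learned from scratch per user.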
Related papers
- I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data [4.487146086221174]
We present a novel human-centered learning algorithm designed for automated object recognition within mobile eye-tracking settings.
Our approach seamlessly integrates an object detector with a spatial relation-aware inductive message-passing network (I-MPN), harnessing node profile information and capturing object correlations.
arXiv Detail & Related papers (2024-06-10T13:08:31Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning must be near-optimal, and enables the algorithm to learn behaviors that improve over those of a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- SimCURL: Simple Contrastive User Representation Learning from Command Sequences [22.92215383896495]
We propose SimCURL, a contrastive self-supervised deep learning framework that learns user representations from unlabeled command sequences.
We train and evaluate our method on a real-world command sequence dataset of more than half a billion commands.
arXiv Detail & Related papers (2022-07-29T16:06:03Z)
- First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization [112.40598205054994]
We formalize this idea as a completely unsupervised objective for optimizing interfaces.
We conduct an observational study on 540K examples of users operating various keyboard and eye gaze interfaces for typing, controlling simulated robots, and playing video games.
The results show that our mutual information scores are predictive of the ground-truth task completion metrics in a variety of domains.
arXiv Detail & Related papers (2022-05-24T21:57:18Z)
- X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback [83.95599156217945]
We focus on assistive typing applications in which a user cannot operate a keyboard, but can supply other inputs.
Standard methods train a model on a fixed dataset of user inputs, then deploy a static interface that does not learn from its mistakes.
We investigate a simple idea that would enable such interfaces to improve over time, with minimal additional effort from the user.
arXiv Detail & Related papers (2022-03-04T00:07:20Z)
- ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning [91.58711082348293]
Reinforcement learning from online user feedback on the system's performance presents a natural solution to this problem.
However, this approach tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse.
We propose a hierarchical solution that learns efficiently from sparse user feedback.
arXiv Detail & Related papers (2022-02-05T02:01:19Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternative interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.