First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual
Information Maximization
- URL: http://arxiv.org/abs/2205.12381v1
- Date: Tue, 24 May 2022 21:57:18 GMT
- Title: First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual
Information Maximization
- Authors: Siddharth Reddy, Sergey Levine, Anca D. Dragan
- Abstract summary: We formalize this idea as a completely unsupervised objective for optimizing interfaces.
We conduct an observational study on 540K examples of users operating various keyboard and eye gaze interfaces for typing, controlling simulated robots, and playing video games.
The results show that our mutual information scores are predictive of the ground-truth task completion metrics in a variety of domains.
- Score: 112.40598205054994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How can we train an assistive human-machine interface (e.g., an
electromyography-based limb prosthesis) to translate a user's raw command
signals into the actions of a robot or computer when there is no prior mapping,
we cannot ask the user for supervision in the form of action labels or reward
feedback, and we do not have prior knowledge of the tasks the user is trying to
accomplish? The key idea in this paper is that, regardless of the task, when an
interface is more intuitive, the user's commands are less noisy. We formalize
this idea as a completely unsupervised objective for optimizing interfaces: the
mutual information between the user's command signals and the induced state
transitions in the environment. To evaluate whether this mutual information
score can distinguish between effective and ineffective interfaces, we conduct
an observational study on 540K examples of users operating various keyboard and
eye gaze interfaces for typing, controlling simulated robots, and playing video
games. The results show that our mutual information scores are predictive of
the ground-truth task completion metrics in a variety of domains, with an
average Spearman's rank correlation of 0.43. In addition to offline evaluation
of existing interfaces, we use our unsupervised objective to learn an interface
from scratch: we randomly initialize the interface, have the user attempt to
perform their desired tasks using the interface, measure the mutual information
score, and update the interface to maximize mutual information through
reinforcement learning. We evaluate our method through a user study with 12
participants who perform a 2D cursor control task using a perturbed mouse, and
an experiment with one user playing the Lunar Lander game using hand gestures.
The results show that we can learn an interface from scratch, without any user
supervision or prior knowledge of tasks, in under 30 minutes.
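
To make the objective concrete, below is a minimal sketch (not the authors' implementation) of the core idea: estimate the mutual information I(command; induced state transition) from logged interaction data and check whether it rank-correlates with task completion across interfaces. The discretization into integer bins, the synthetic logs, the task-completion numbers, and the helper name `interface_mi_score` are all assumptions for illustration; the paper's actual featurization and estimator may differ.

```python
# Minimal sketch (not the paper's implementation): a plug-in estimate of the
# mutual information I(command; state transition) from logged interaction data,
# and a Spearman check of whether higher MI tracks better task completion.
# Assumes commands and transitions have already been discretized into integer
# bins; the logs and task-completion rates below are synthetic placeholders.
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import mutual_info_score


def interface_mi_score(commands: np.ndarray, transitions: np.ndarray) -> float:
    """Plug-in MI (in nats) between discretized user commands and the
    state transitions they induce in the environment."""
    return mutual_info_score(commands, transitions)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5_000

    # Hypothetical logs for three interfaces: an "intuitive" one where commands
    # strongly determine transitions, a noisy one, and a random one.
    cmds = rng.integers(0, 8, size=n)
    interfaces = {
        "intuitive": (cmds, cmds),  # transition fully determined by command
        "noisy": (cmds, np.where(rng.random(n) < 0.5,
                                 cmds, rng.integers(0, 8, size=n))),
        "random": (cmds, rng.integers(0, 8, size=n)),
    }

    mi_scores = [interface_mi_score(c, t) for c, t in interfaces.values()]
    # Hypothetical ground-truth task-completion rates for the same interfaces.
    task_completion = [0.95, 0.60, 0.10]

    rho, _ = spearmanr(mi_scores, task_completion)
    for name, mi in zip(interfaces, mi_scores):
        print(f"{name:>9}: MI = {mi:.3f} nats")
    print(f"Spearman rank correlation with task completion: {rho:.2f}")
```

In the paper's online setting, this score would be recomputed on freshly collected interaction data and used as the reward signal for updating the interface with reinforcement learning; the sketch above only covers the offline scoring step.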
Related papers
- Identifying User Goals from UI Trajectories [19.492331502146886]
This paper introduces the task of goal identification from observed UI trajectories.
We propose a novel evaluation metric to assess whether two task descriptions are paraphrased within a specific UI environment.
Using our metric and these datasets, we conducted several experiments comparing the performance of humans and state-of-the-art models.
arXiv Detail & Related papers (2024-06-20T13:46:10Z)
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
The experimental results demonstrate that MPI improves on the previous state of the art by 10% to 64% on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning [82.91837418721182]
Adaptive interfaces can help users perform sequential decision-making tasks.
Recent advances in human-in-the-loop machine learning enable such systems to improve by interacting with users.
We propose a reinforcement learning algorithm to train an interface to map raw command signals to actions.
arXiv Detail & Related papers (2023-09-07T16:52:27Z)
- MUG: Interactive Multimodal Grounding on User Interfaces [12.035123646959669]
We present MUG, a novel interactive task for multimodal grounding where a user and an agent work collaboratively on an interface screen.
Prior works modeled multimodal UI grounding in one round: the user gives a command and the agent responds to it. MUG allows multiple rounds of interaction, so that upon seeing the agent's responses, the user can give further commands for the agent to refine or even correct its actions.
arXiv Detail & Related papers (2022-09-29T21:08:18Z)
- X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback [83.95599156217945]
We focus on assistive typing applications in which a user cannot operate a keyboard, but can supply other inputs.
Standard methods train a model on a fixed dataset of user inputs, then deploy a static interface that does not learn from its mistakes.
We investigate a simple idea that would enable such interfaces to improve over time, with minimal additional effort from the user.
arXiv Detail & Related papers (2022-03-04T00:07:20Z)
- ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning [91.58711082348293]
Reinforcement learning from online user feedback on the system's performance presents a natural solution to this problem.
This approach tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse.
We propose a hierarchical solution that learns efficiently from sparse user feedback.
arXiv Detail & Related papers (2022-02-05T02:01:19Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)