Learning Visualization Policies of Augmented Reality for Human-Robot
Collaboration
- URL: http://arxiv.org/abs/2211.07028v1
- Date: Sun, 13 Nov 2022 22:03:20 GMT
- Title: Learning Visualization Policies of Augmented Reality for Human-Robot
Collaboration
- Authors: Kishan Chandan, Jack Albertson, Shiqi Zhang
- Abstract summary: In human-robot collaboration domains, augmented reality (AR) technologies have enabled people to visualize the state of robots.
Current AR-based visualization policies are designed manually, which requires substantial human effort and domain knowledge.
We develop a framework, called VARIL, that enables AR agents to learn visualization policies from demonstrations.
- Score: 5.400491728405083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In human-robot collaboration domains, augmented reality (AR) technologies
have enabled people to visualize the state of robots. Current AR-based
visualization policies are designed manually, which requires substantial human
effort and domain knowledge. When too little information is visualized, human
users find the AR interface not useful; when too much information is
visualized, they find it difficult to process the visualized information. In
this paper, we develop a framework, called VARIL, that enables AR agents to
learn visualization policies (what to visualize, when, and how) from
demonstrations. We created a Unity-based platform for simulating warehouse
environments where human-robot teammates collaborate on delivery tasks. We have
collected a dataset that includes demonstrations of visualizing robots' current
and planned behaviors. Results from experiments with real human participants
show that, compared with competitive baselines from the literature, our learned
visualization strategies significantly increase the efficiency of human-robot
teams, while reducing the distraction level of human users. VARIL has been
demonstrated in a mock warehouse built in our lab.
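The abstract does not include an implementation. As a rough, hedged illustration of the core idea of learning what and when to visualize from demonstrations, the Python sketch below trains a small behavior-cloning classifier that maps a hand-crafted AR-agent state vector to a discrete visualization action. The state features, action labels, and synthetic training data are illustrative assumptions, not VARIL's actual design.

```python
# Minimal behavior-cloning sketch for an AR visualization policy.
# Assumptions (not from the paper): the AR agent observes a small state
# vector (e.g., robot distance, task progress, human gaze flag) and must
# pick one discrete visualization action. Demonstration data is synthetic.
import torch
import torch.nn as nn

ACTIONS = ["show_nothing", "show_current_goal", "show_planned_path"]  # hypothetical labels

class VisualizationPolicy(nn.Module):
    def __init__(self, state_dim: int = 4, num_actions: int = len(ACTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 32), nn.ReLU(),
            nn.Linear(32, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # logits over visualization actions

def train(policy, states, actions, epochs=50, lr=1e-3):
    """Fit the policy to demonstrated (state, visualization-action) pairs."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(policy(states), actions)
        loss.backward()
        opt.step()
    return policy

if __name__ == "__main__":
    # Synthetic stand-in for demonstration logs.
    demo_states = torch.rand(256, 4)
    demo_actions = torch.randint(0, len(ACTIONS), (256,))
    policy = train(VisualizationPolicy(), demo_states, demo_actions)
    with torch.no_grad():
        choice = policy(torch.rand(1, 4)).argmax(dim=1).item()
    print("Chosen visualization:", ACTIONS[choice])
```

In practice, the demonstration logs would pair recorded robot and human state with the visualization choices made by human demonstrators; here random tensors stand in for that data.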
Related papers
- ARCap: Collecting High-quality Human Demonstrations for Robot Learning with Augmented Reality Feedback [21.9704438641606]
We propose ARCap, a portable data collection system that provides visual feedback through augmented reality (AR) and haptic warnings to guide users in collecting high-quality demonstrations.
With data collected from ARCap, robots can perform challenging tasks, such as manipulation in cluttered environments and long-horizon cross-embodiment manipulation.
arXiv Detail & Related papers (2024-10-11T02:30:46Z)
- VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections [10.49712834719005]
We propose a low-cost visual teleoperation system for bimanual manipulation tasks, called VITAL.
Our approach leverages affordable hardware and visual processing techniques to collect demonstrations.
We enhance the generalizability and robustness of the learned policies by utilizing both real and simulated environments.
arXiv Detail & Related papers (2024-07-30T23:29:47Z)
- Improving Visual Perception of a Social Robot for Controlled and In-the-wild Human-robot Interaction [10.260966795508569]
It is unclear how the objective interaction performance and subjective user experience will be influenced when a social robot adopts a deep-learning-based visual perception model.
We employ state-of-the-art human perception and tracking models to improve the visual perception function of the Pepper robot.
arXiv Detail & Related papers (2024-03-04T06:47:06Z)
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention [56.755993500556734]
We introduce gaze information as a proxy for human attention to guide Vision-Language Models (VLMs).
We propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
arXiv Detail & Related papers (2023-12-22T17:34:01Z)
- Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with their environment in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z)
- RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot [56.130215236125224]
A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.
Recent research in one-shot imitation learning has shown promise in transferring trained policies to new tasks based on demonstrations.
This paper aims to unlock the potential for an agent to generalize to hundreds of real-world skills with multi-modal perception.
arXiv Detail & Related papers (2023-07-02T15:33:31Z)
- Affordances from Human Videos as a Versatile Representation for Robotics [31.248842798600606]
We train a visual affordance model that estimates where and how in the scene a human is likely to interact.
The structure of these behavioral affordances directly enables the robot to perform many complex tasks.
We show the efficacy of our approach, which we call VRB, across 4 real world environments, over 10 different tasks, and 2 robotic platforms operating in the wild.
arXiv Detail & Related papers (2023-04-17T17:59:34Z)
- An Augmented Reality Platform for Introducing Reinforcement Learning to K-12 Students with Robots [10.835598738100359]
We propose an Augmented Reality (AR) system that reveals the hidden state of the learning process to human users.
This paper describes our system's design and implementation and concludes with a discussion on two directions for future work.
arXiv Detail & Related papers (2021-10-10T03:51:39Z)
- Cognitive architecture aided by working-memory for self-supervised multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and have proven to be suitable tools for addressing such a task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
arXiv Detail & Related papers (2021-03-16T13:50:24Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)