I can attend a meeting too! Towards a human-like telepresence avatar
robot to attend meeting on your behalf
- URL: http://arxiv.org/abs/2006.15647v1
- Date: Sun, 28 Jun 2020 16:43:04 GMT
- Title: I can attend a meeting too! Towards a human-like telepresence avatar
robot to attend meeting on your behalf
- Authors: Hrishav Bakul Barua, Chayan Sarkar, Achanna Anil Kumar, Arpan Pal,
Balamuralidhar P
- Abstract summary: We focus on a telepresence robot that can be used to attend a meeting remotely with a group of people.
To provide a better meeting experience, the robot should localize the speaker and bring the speaker to the center of the viewing angle.
This article presents a study and implementation of an attention shifting scheme in a telepresence meeting scenario.
- Score: 8.512048419752047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Telepresence robots are used in various forms and use cases that help
to avoid physical human presence at the scene of action. In this work, we focus
on a telepresence robot that can be used to attend a meeting remotely with a
group of people. Unlike a one-to-one meeting, participants in a group meeting
can be located in different parts of the room, especially in an informal
setup. As a result, not all of them may be within the viewing angle of the robot,
a.k.a. the remote participant. In such a case, to provide a better meeting
experience, the robot should localize the speaker and bring the speaker to the
center of the viewing angle. Though sound source localization can easily be
done using a microphone array, bringing the speaker or set of speakers into the
viewing angle is not a trivial task. First of all, the robot should react only
to a human voice, not to random noise. Secondly, if there are multiple
speakers, whom should the robot face, or should it rotate continuously with
every new speaker? Lastly, most robotic platforms are resource-constrained, and
to achieve a real-time response, i.e., to avoid network delay, all the
algorithms should be implemented within the robot itself. This article presents
a study and implementation of an attention shifting scheme in a telepresence
meeting scenario that best suits the needs and expectations of the collocated
and remote attendees. We define a policy to decide when the robot should rotate
and by how much, based on real-time speaker localization. Through a user
satisfaction study, we show the efficacy and usability of our system in the
meeting scenario. Moreover, our system can easily be adapted to other scenarios
where multiple people are present.
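
The abstract raises three practical questions: reacting only to human voice, deciding whom to face when speakers change, and running everything on-board in real time. The sketch below is a minimal, illustrative take on such an attention-shifting policy, not the authors' implementation; the `AttentionShifter` class, its thresholds (`MIN_VOICE_CONFIDENCE`, `DEADBAND_DEG`, `DWELL_SEC`), and the assumption that the microphone array supplies a direction-of-arrival (DOA) angle plus a voice-activity-detection (VAD) confidence score are all assumptions made here for illustration.

```python
import time

# Illustrative thresholds (assumptions, not values from the paper)
DEADBAND_DEG = 15.0         # ignore speakers already near the view center
DWELL_SEC = 1.5             # a new speaker must persist this long before rotating
MIN_VOICE_CONFIDENCE = 0.6  # VAD threshold: react to human voice, not noise


class AttentionShifter:
    """Decides when and by how much the robot should rotate, based on
    real-time speaker localization (DOA angles from a microphone array)
    gated by voice-activity detection."""

    def __init__(self):
        self._candidate_angle = None   # DOA of the speaker we may turn to
        self._candidate_since = None   # when that speaker was first heard

    def update(self, doa_deg, voice_confidence, now=None):
        """Return a rotation command in degrees (0.0 means hold still).

        doa_deg          -- speaker direction relative to the robot's heading
        voice_confidence -- VAD score in [0, 1]; low values are treated as noise
        """
        now = time.monotonic() if now is None else now

        # 1) React only to human voice, not to random noise.
        if voice_confidence < MIN_VOICE_CONFIDENCE:
            self._candidate_angle = None
            return 0.0

        # 2) Speaker is already (roughly) centered: no rotation needed.
        if abs(doa_deg) <= DEADBAND_DEG:
            self._candidate_angle = None
            return 0.0

        # 3) Require an off-center speaker to persist before shifting attention,
        #    so the robot does not oscillate between briefly interjecting speakers.
        if (self._candidate_angle is None
                or abs(doa_deg - self._candidate_angle) > DEADBAND_DEG):
            self._candidate_angle = doa_deg
            self._candidate_since = now
            return 0.0

        if now - self._candidate_since >= DWELL_SEC:
            self._candidate_angle = None
            return doa_deg  # rotate by the estimated DOA to center the speaker

        return 0.0


# Example: feed per-frame DOA + VAD estimates; act on non-zero commands.
shifter = AttentionShifter()
command = shifter.update(doa_deg=40.0, voice_confidence=0.9)
if command != 0.0:
    pass  # send a rotate-by-`command`-degrees command to the robot base
```

The deadband and dwell time are the two knobs that keep such a policy from oscillating when several people speak in quick succession; the paper defines its own policy for when and how much to rotate, which may differ from this sketch.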
Related papers
- Imitation of human motion achieves natural head movements for humanoid robots in an active-speaker detection task [2.8220015774219567]
Head movements are crucial for social human-human interaction.
In this work, we employed a generative AI pipeline to produce human-like head movements for a Nao humanoid robot.
The results show that the Nao robot successfully imitates human head movements in a natural manner while actively tracking the speakers during the conversation.
arXiv Detail & Related papers (2024-07-16T17:08:40Z)
- Unifying 3D Representation and Control of Diverse Robots with a Single Camera [48.279199537720714]
We introduce Neural Jacobian Fields, an architecture that autonomously learns to model and control robots from vision alone.
Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot.
arXiv Detail & Related papers (2024-07-11T17:55:49Z)
- Ain't Misbehavin' -- Using LLMs to Generate Expressive Robot Behavior in Conversations with the Tabletop Robot Haru [9.2526849536751]
We introduce a fully-automated conversation system that leverages large language models (LLMs) to generate robot responses with expressive behaviors.
We conduct a pilot study where volunteers chat with a social robot using our proposed system, and we analyze their feedback, conducting a rigorous error analysis of chat transcripts.
Most negative feedback was due to automatic speech recognition (ASR) errors which had limited impact on conversations.
arXiv Detail & Related papers (2024-02-18T12:35:52Z)
- The Un-Kidnappable Robot: Acoustic Localization of Sneaking People [25.494191141691616]
We collect a robotic dataset of high-quality 4-channel audio paired with 360 degree RGB data of people moving in different indoor settings.
We train models that predict if there is a moving person nearby and their location using only audio.
We implement our method on a robot, allowing it to track a single person moving quietly with only passive audio sensing.
arXiv Detail & Related papers (2023-10-05T17:59:55Z)
- ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space [9.806227900768926]
This paper introduces a novel deep-learning approach for human-to-robot motion retargeting.
Our method does not require paired human-to-robot data, which facilitates its translation to new robots.
Our model outperforms existing work on human-to-robot similarity in both efficiency and precision.
arXiv Detail & Related papers (2023-09-11T08:55:04Z)
- WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model [92.90127398282209]
This paper investigates the potential of integrating the most recent Large Language Models (LLMs) and existing visual grounding and robotic grasping system.
We introduce the WALL-E (Embodied Robotic WAiter load lifting with Large Language model) as an example of this integration.
We deploy this LLM-empowered system on the physical robot to provide a more user-friendly interface for the instruction-guided grasping task.
arXiv Detail & Related papers (2023-08-30T11:35:21Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- Robots with Different Embodiments Can Express and Influence Carefulness in Object Manipulation [104.5440430194206]
This work investigates the perception of object manipulations performed with a communicative intent by two robots.
We designed the robots' movements to communicate carefulness or not during the transportation of objects.
arXiv Detail & Related papers (2022-08-03T13:26:52Z)
- Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
- Self-supervised reinforcement learning for speaker localisation with the iCub humanoid robot [58.2026611111328]
Looking at a person's face is one of the mechanisms that humans rely on when it comes to filtering speech in noisy environments.
Having a robot that can look toward a speaker could benefit ASR performance in challenging environments.
We propose a self-supervised reinforcement learning-based framework inspired by the early development of humans.
arXiv Detail & Related papers (2020-11-12T18:02:15Z)