The Un-Kidnappable Robot: Acoustic Localization of Sneaking People
- URL: http://arxiv.org/abs/2310.03743v2
- Date: Thu, 9 May 2024 17:59:58 GMT
- Title: The Un-Kidnappable Robot: Acoustic Localization of Sneaking People
- Authors: Mengyu Yang, Patrick Grady, Samarth Brahmbhatt, Arun Balajee Vasudevan, Charles C. Kemp, James Hays
- Abstract summary: We collect a robotic dataset of high-quality 4-channel audio paired with 360 degree RGB data of people moving in different indoor settings.
We train models that predict if there is a moving person nearby and their location using only audio.
We implement our method on a robot, allowing it to track a single person moving quietly with only passive audio sensing.
- Score: 25.494191141691616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How easy is it to sneak up on a robot? We examine whether we can detect people using only the incidental sounds they produce as they move, even when they try to be quiet. We collect a robotic dataset of high-quality 4-channel audio paired with 360 degree RGB data of people moving in different indoor settings. We train models that predict if there is a moving person nearby and their location using only audio. We implement our method on a robot, allowing it to track a single person moving quietly with only passive audio sensing. For demonstration videos, see our project page: https://sites.google.com/view/unkidnappable-robot
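As a rough illustration of the setup described in the abstract, the sketch below shows one way a model could map 4-channel audio to (a) the probability that a moving person is nearby and (b) a coarse direction estimate. This is a hedged example, not the authors' architecture: the log-spectrogram input, the layer sizes, the 36-bin azimuth output, and the class name AudioPersonLocalizer are all assumptions.

```python
# Minimal sketch of an audio-only person detector/localizer (illustrative only,
# not the paper's actual model). Input: 4-channel waveforms; outputs: a
# presence logit and logits over coarse azimuth bins.
import torch
import torch.nn as nn

N_MICS = 4           # 4-channel microphone array, as in the dataset description
N_AZIMUTH_BINS = 36  # hypothetical 10-degree bins; the paper's output may differ
N_FFT, HOP = 512, 256

class AudioPersonLocalizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(N_MICS, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.presence_head = nn.Linear(64, 1)                # is someone moving nearby?
        self.direction_head = nn.Linear(64, N_AZIMUTH_BINS)  # roughly where?

    def forward(self, waveform):
        # waveform: (batch, N_MICS, samples)
        b, c, t = waveform.shape
        spec = torch.stft(
            waveform.reshape(b * c, t), n_fft=N_FFT, hop_length=HOP,
            window=torch.hann_window(N_FFT, device=waveform.device),
            return_complex=True,
        ).abs()
        spec = torch.log1p(spec).reshape(b, c, spec.shape[-2], spec.shape[-1])
        feats = self.backbone(spec).flatten(1)
        return self.presence_head(feats).squeeze(-1), self.direction_head(feats)

# One second of 4-channel audio at 16 kHz (random placeholder data).
model = AudioPersonLocalizer()
presence_logit, direction_logits = model(torch.randn(1, N_MICS, 16000))
```

Since the dataset pairs audio with 360 degree RGB, the presence and direction labels would presumably be derived from the vision stream during training, while inference uses audio alone.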
Related papers
- ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation [26.460679530665487]
We propose Audio Noise Awareness using Visuals of Indoor environments for NAVIgation (ANAVI) for quieter robot path planning.
We generate data on how loud an 'impulse' sounds at different listener locations in simulated homes, and train our Acoustic Noise Predictor (ANP).
Unifying ANP with action acoustics, we demonstrate experiments with wheeled (Hello Robot Stretch) and legged (Unitree Go2) robots so that these robots adhere to the noise constraints of the environment.
arXiv Detail & Related papers (2024-10-24T17:19:53Z)
- Imitation of human motion achieves natural head movements for humanoid robots in an active-speaker detection task [2.8220015774219567]
Head movements are crucial for social human-human interaction.
In this work, we employed a generative AI pipeline to produce human-like head movements for a Nao humanoid robot.
The results show that the Nao robot successfully imitates human head movements in a natural manner while actively tracking the speakers during the conversation.
arXiv Detail & Related papers (2024-07-16T17:08:40Z)
- Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation [65.46610405509338]
We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation.
Our framework, Track2Act, predicts tracks of how points in an image should move in future time-steps based on a goal.
We show that this approach of combining scalably learned track prediction with a residual policy enables diverse generalizable robot manipulation.
arXiv Detail & Related papers (2024-05-02T17:56:55Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- Learning Video-Conditioned Policies for Unseen Manipulation Tasks [83.2240629060453]
Video-conditioned policy learning maps human demonstrations of previously unseen tasks to robot manipulation skills.
We learn our policy to generate appropriate actions given current scene observations and a video of the target task.
We validate our approach on a set of challenging multi-task robot manipulation environments and outperform the state of the art.
arXiv Detail & Related papers (2023-05-10T16:25:42Z)
- Open-World Object Manipulation using Pre-trained Vision-Language Models [72.87306011500084]
For robots to follow instructions from people, they must be able to connect the rich semantic information in human vocabulary to their sensory observations and actions.
We develop a simple approach, MOO, which leverages a pre-trained vision-language model to extract object-identifying information from the instruction.
In a variety of experiments on a real mobile manipulator, we find that MOO generalizes zero-shot to a wide range of novel object categories and environments.
arXiv Detail & Related papers (2023-03-02T01:55:10Z)
- Human-to-Robot Imitation in the Wild [50.49660984318492]
We propose an efficient one-shot robot learning algorithm, centered around learning from a third-person perspective.
We show one-shot generalization and success in real-world settings, including 20 different manipulation tasks in the wild.
arXiv Detail & Related papers (2022-07-19T17:59:59Z)
- Robot Sound Interpretation: Learning Visual-Audio Representations for Voice-Controlled Robots [0.0]
We learn a representation that associates images and sound commands with minimal supervision.
Using this representation, we generate an intrinsic reward function to learn robotic tasks with reinforcement learning.
We show empirically that our method outperforms previous work across various sound types and robotic tasks.
arXiv Detail & Related papers (2021-09-07T02:26:54Z)
- Know Thyself: Transferable Visuomotor Control Through Robot-Awareness [22.405839096833937]
Training visuomotor robot controllers from scratch on a new robot typically requires generating large amounts of robot-specific data.
We propose a "robot-aware" solution paradigm that exploits readily available robot "self-knowledge"
Our experiments on tabletop manipulation tasks in simulation and on real robots demonstrate that these plug-in improvements dramatically boost the transferability of visuomotor controllers.
arXiv Detail & Related papers (2021-07-19T17:56:04Z)
- Self-supervised reinforcement learning for speaker localisation with the iCub humanoid robot [58.2026611111328]
Looking at a person's face is one of the mechanisms that humans rely on when it comes to filtering speech in noisy environments.
Having a robot that can look toward a speaker could benefit ASR performance in challenging environments.
We propose a self-supervised reinforcement learning-based framework inspired by the early development of humans.
arXiv Detail & Related papers (2020-11-12T18:02:15Z)
- I can attend a meeting too! Towards a human-like telepresence avatar robot to attend meeting on your behalf [8.512048419752047]
We focus on a telepresence robot that can be used to attend a meeting remotely with a group of people.
To provide a better meeting experience, the robot should localize the speaker and bring the speaker to the center of the viewing angle (a classical direction-of-arrival baseline is sketched after this list).
This article presents a study and implementation of an attention shifting scheme in a telepresence meeting scenario.
arXiv Detail & Related papers (2020-06-28T16:43:04Z)
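Several of the entries above (the iCub speaker-localisation work and the telepresence avatar) depend on estimating the direction of a sound source from a microphone array. As a self-contained illustration of the classical baseline behind that idea (not code from any of the listed papers; the sample rate, microphone spacing, and function names are assumptions), a GCC-PHAT time-difference-of-arrival estimate for a single microphone pair looks like this:

```python
# Classical baseline sketch: estimate the time difference of arrival (TDOA)
# between two microphones with GCC-PHAT, then convert it to an azimuth angle.
# The sample rate and microphone spacing below are illustrative assumptions.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
MIC_DISTANCE = 0.1      # assumed 10 cm between the two microphones
SAMPLE_RATE = 16000

def gcc_phat_tdoa(sig_a, sig_b, fs=SAMPLE_RATE):
    """Estimate how many seconds sig_b lags behind sig_a (positive = b arrives later)."""
    n = len(sig_a) + len(sig_b)
    cross = np.fft.rfft(sig_b, n=n) * np.conj(np.fft.rfft(sig_a, n=n))
    cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
    corr = np.fft.irfft(cross, n=n)
    max_lag = int(fs * MIC_DISTANCE / SPEED_OF_SOUND)
    corr = np.concatenate((corr[-max_lag:], corr[:max_lag + 1]))
    return (np.argmax(np.abs(corr)) - max_lag) / fs

def tdoa_to_azimuth(tdoa):
    """Map a TDOA to a bearing in degrees (0 = broadside to the mic pair)."""
    sin_theta = np.clip(tdoa * SPEED_OF_SOUND / MIC_DISTANCE, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Example: a delayed copy of white noise should yield a non-zero bearing.
rng = np.random.default_rng(0)
a = rng.standard_normal(SAMPLE_RATE)
b = np.roll(a, 3)                             # simulate a 3-sample delay at mic B
print(tdoa_to_azimuth(gcc_phat_tdoa(a, b)))
```

Learning-based methods such as the one in the main paper instead train directly on multi-channel recordings, which matters when the target sounds are as faint as someone deliberately moving quietly.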
This list is automatically generated from the titles and abstracts of the papers in this site.