Measuring Acoustics with Collaborative Multiple Agents
- URL: http://arxiv.org/abs/2310.05368v1
- Date: Mon, 9 Oct 2023 02:58:27 GMT
- Title: Measuring Acoustics with Collaborative Multiple Agents
- Authors: Yinfeng Yu, Changan Chen, Lele Cao, Fangkai Yang, Fuchun Sun
- Abstract summary: Two robots are trained to measure an environment's acoustics, rewarded for both wide exploration and accurate prediction.
We show that the robots learn to collaborate, moving through the environment while minimizing the prediction error.
- Score: 25.879534979760034
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As humans, we hear sound every second of our lives. The sound we hear is often
affected by the acoustics of the environment surrounding us. For example, a
spacious hall leads to more reverberation. Room Impulse Responses (RIR) are
commonly used to characterize environment acoustics as a function of the scene
geometry, materials, and source/receiver locations. Traditionally, RIRs are
measured by setting up a loudspeaker and microphone in the environment for all
source/receiver locations, which is time-consuming and inefficient. We propose
to let two robots measure the environment's acoustics by actively moving and
emitting/receiving sweep signals. We also devise a collaborative multi-agent
policy where these two robots are trained to explore the environment's
acoustics while being rewarded for wide exploration and accurate prediction. We
show that the robots learn to collaborate and move to explore environment
acoustics while minimizing the prediction error. To the best of our knowledge,
this is the first problem formulation and solution for the task of
collaborative environment acoustics measurement with multiple agents.
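The abstract says the robots emit and receive sweep signals but does not pin down the sweep type. A standard way to recover an RIR from such a measurement is the exponential sine sweep method: play a known sweep, record it at the receiver, and convolve the recording with the sweep's amplitude-compensated time reversal. A minimal sketch, where the sweep band, sample rate, and toy two-tap room are illustrative assumptions rather than values from the paper:

```python
import numpy as np
from scipy.signal import fftconvolve

def exponential_sweep(f0, f1, duration, fs):
    """Exponential (logarithmic) sine sweep from f0 to f1 Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f1 / f0)
    return np.sin(2 * np.pi * f0 * duration / rate
                  * (np.exp(t / duration * rate) - 1.0))

def inverse_filter(sweep, f0, f1, duration, fs):
    """Time-reversed sweep with an amplitude envelope that compensates
    the sweep's pink (falling) spectrum."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f1 / f0)
    return sweep[::-1] * np.exp(-t / duration * rate)

def estimate_rir(recording, sweep, f0, f1, duration, fs):
    """Deconvolve the recorded sweep to recover the room impulse response."""
    inv = inverse_filter(sweep, f0, f1, duration, fs)
    rir = fftconvolve(recording, inv, mode="full")
    rir /= np.max(np.abs(rir))        # normalize to the direct-sound peak
    return rir[len(sweep) - 1:]       # keep the causal part after alignment

fs = 16000
sweep = exponential_sweep(20.0, 8000.0, duration=2.0, fs=fs)
# In the robots' case, `recording` is what the receiving agent captures;
# here we simulate a toy room with a direct path plus one echo.
toy_rir = np.zeros(fs // 4); toy_rir[0] = 1.0; toy_rir[2000] = 0.5
recording = fftconvolve(sweep, toy_rir)
rir_est = estimate_rir(recording, sweep, 20.0, 8000.0, 2.0, fs)
```

The agents would repeat this at many source/receiver placements; the collaborative policy's job is to choose those placements efficiently.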
Related papers
- ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation [26.460679530665487]
We propose ANAVI, Audio Noise Awareness using Visuals of Indoor environments for NAVIgation, for quieter robot path planning.
We generate data on how loud an 'impulse' sounds at different listener locations in simulated homes, and use it to train our Acoustic Noise Predictor (ANP).
Unifying the ANP with the acoustics of robot actions, we demonstrate experiments with wheeled (Hello Robot Stretch) and legged (Unitree Go2) robots that adhere to the noise constraints of the environment.
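For intuition on what such a predictor must learn, the free-field baseline below (a textbook inverse-distance law, not ANAVI's model) predicts impulse loudness from distance alone; a learned ANP exists precisely to capture what this ignores, such as walls, occlusion, and materials:

```python
import numpy as np

def free_field_noise_db(source_db_at_1m: float, distance_m: np.ndarray) -> np.ndarray:
    """Free-field baseline: level falls ~6 dB per doubling of distance.
    Ignores geometry and materials, which a learned predictor must model."""
    return source_db_at_1m - 20.0 * np.log10(np.maximum(distance_m, 1e-3))

# e.g. an 80 dB impulse (measured at 1 m) heard from 2 m, 4 m, and 8 m away
print(free_field_noise_db(80.0, np.array([2.0, 4.0, 8.0])))  # ~[74, 68, 62] dB
```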
arXiv Detail & Related papers (2024-10-24T17:19:53Z)
- Blind Spatial Impulse Response Generation from Separate Room- and Scene-Specific Information [0.42970700836450487]
Knowledge of the users' real acoustic environment is crucial for rendering virtual sounds that blend seamlessly into it.
We show how both room- and position-specific parameters are considered in the final output.
arXiv Detail & Related papers (2024-09-23T12:41:31Z)
- ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling [57.1025908604556]
An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment.
We propose active acoustic sampling, a new task for efficiently building an environment acoustic model of an unmapped environment.
We introduce ActiveRIR, a reinforcement learning policy that leverages information from audio-visual sensor streams to guide agent navigation and determine optimal acoustic data sampling positions.
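This policy and the two-robot policy above share the same reward shape: pay for covering new ground, charge for acoustic prediction error. A hypothetical sketch of that shape (function names, weights, and the grid-cell coverage measure are illustrative; neither paper publishes this exact form):

```python
import numpy as np

def sampling_reward(visited_cells: set, cell: tuple,
                    rir_pred: np.ndarray, rir_true: np.ndarray,
                    w_explore: float = 1.0, w_acc: float = 1.0) -> float:
    """Reward a step that reaches an unvisited map cell (exploration term)
    and penalize the current RIR prediction error (accuracy term)."""
    novelty = 1.0 if cell not in visited_cells else 0.0
    visited_cells.add(cell)
    pred_error = float(np.mean((rir_pred - rir_true) ** 2))
    return w_explore * novelty - w_acc * pred_error
```

Tuning the two weights trades off how aggressively the agents roam against how carefully they refine the acoustic model.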
arXiv Detail & Related papers (2024-04-24T21:30:01Z)
- Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields [61.07542274267568]
This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene.
Driven by the unique properties of RIR, we design a temporal correlation module and multi-scale energy decay criterion.
Experimental results show that NACF outperforms existing field-based methods by a notable margin.
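As a rough picture of what parameterizing an audio scene as a field means, the toy module below maps a source/receiver position pair to an RIR segment. It is a simplifying sketch only: NACF additionally conditions on acoustic context and uses the temporal correlation module and energy decay criterion mentioned above.

```python
import torch
import torch.nn as nn

class AcousticFieldMLP(nn.Module):
    """Toy implicit acoustic field: (source xyz, receiver xyz) -> RIR segment.
    Illustrative stand-in, not the NACF architecture."""
    def __init__(self, rir_len: int = 256, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, rir_len),
        )

    def forward(self, src_xyz: torch.Tensor, rcv_xyz: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([src_xyz, rcv_xyz], dim=-1))

field = AcousticFieldMLP()
rir = field(torch.rand(1, 3), torch.rand(1, 3))  # query one source/receiver pair
```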
arXiv Detail & Related papers (2023-09-27T19:50:50Z)
- Self-Supervised Visual Acoustic Matching [63.492168778869726]
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a target acoustic environment.
We propose a self-supervised approach to visual acoustic matching where training samples include only the target scene image and audio.
Our approach jointly learns to disentangle room acoustics and re-synthesize audio into the target environment, via a conditional GAN framework and a novel metric.
arXiv Detail & Related papers (2023-07-27T17:59:59Z)
- End-to-End Binaural Speech Synthesis [71.1869877389535]
We present an end-to-end speech synthesis system that combines a low-bitrate audio system with a powerful decoder.
We demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.
arXiv Detail & Related papers (2022-07-08T05:18:36Z)
- Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z)
- A Deep Reinforcement Learning Approach for Audio-based Navigation and Audio Source Localization in Multi-speaker Environments [1.0527821704930371]
In this work we apply deep reinforcement learning to the problems of navigating a three-dimensional environment and inferring the locations of human speaker audio sources within it.
We create two virtual environments using the Unity game engine, one presenting an audio-based navigation problem and one presenting an audio source localization problem.
We also create an autonomous agent based on the PPO online reinforcement learning algorithm and attempt to train it to solve these environments.
arXiv Detail & Related papers (2021-10-25T10:18:34Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
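A classical point of reference for direction-informed spatial filtering is delay-and-sum beamforming, sketched below. The paper's neural filter is learned end-to-end and far more capable; treat this only as the linear baseline it generalizes, with the microphone geometry and test signal made up for illustration:

```python
import numpy as np

def delay_and_sum(mics: np.ndarray, positions: np.ndarray,
                  direction: np.ndarray, fs: int, c: float = 343.0) -> np.ndarray:
    """Align each channel toward the target direction of arrival and average.
    mics: (n_ch, n) signals; positions: (n_ch, 3) mic coordinates in meters."""
    direction = direction / np.linalg.norm(direction)
    delays = positions @ direction / c          # per-mic alignment delay, seconds
    delays -= delays.min()                      # keep all delays non-negative
    n_ch, n = mics.shape
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for ch in range(n_ch):
        # Apply a fractional delay in the frequency domain, then accumulate.
        spectrum = np.fft.rfft(mics[ch]) * np.exp(-2j * np.pi * freqs * delays[ch])
        out += np.fft.irfft(spectrum, n)
    return out / n_ch

fs = 16000
t = np.arange(fs) / fs
pos = np.array([[0.0, 0.0, 0.0], [0.05, 0.0, 0.0]])  # 5 cm two-mic array
sig = np.sin(2 * np.pi * 440.0 * t)
mics = np.stack([sig, sig])                           # toy broadside arrival
enhanced = delay_and_sum(mics, pos, np.array([0.0, 1.0, 0.0]), fs)
```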
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.