SoundCam: A Dataset for Finding Humans Using Room Acoustics
- URL: http://arxiv.org/abs/2311.03517v2
- Date: Mon, 15 Jan 2024 08:15:52 GMT
- Title: SoundCam: A Dataset for Finding Humans Using Room Acoustics
- Authors: Mason Wang, Samuel Clarke, Jui-Hsien Wang, Ruohan Gao, Jiajun Wu
- Abstract summary: We present SoundCam, the largest dataset of unique RIRs from in-the-wild rooms publicly released to date.
It includes 5,000 10-channel real-world measurements of room impulse responses and 2,000 10-channel recordings of music in three different rooms.
We show that these measurements can be used for interesting tasks, such as detecting and identifying humans, and tracking their positions.
- Score: 22.279282163908462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A room's acoustic properties are a product of the room's geometry, the
objects within the room, and their specific positions. A room's acoustic
properties can be characterized by its impulse response (RIR) between a source
and listener location, or roughly inferred from recordings of natural signals
present in the room. Variations in the positions of objects in a room can
effect measurable changes in the room's acoustic properties, as characterized
by the RIR. Existing datasets of RIRs either do not systematically vary
positions of objects in an environment, or they consist of only simulated RIRs.
We present SoundCam, the largest dataset of unique RIRs from in-the-wild rooms
publicly released to date. It includes 5,000 10-channel real-world measurements
of room impulse responses and 2,000 10-channel recordings of music in three
different rooms, including a controlled acoustic lab, an in-the-wild living
room, and a conference room, with different humans in positions throughout each
room. We show that these measurements can be used for interesting tasks, such
as detecting and identifying humans, and tracking their positions.
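To make the tracking task above concrete, one simple baseline is to compare a newly measured RIR against reference RIRs recorded with a person standing at known positions and report the nearest match. The sketch below is only an illustrative nearest-neighbor example under assumed file names and array shapes (NumPy arrays with 10 channels, plus an empty-room RIR); it is not a baseline evaluated in the paper.

```python
import numpy as np

# Hypothetical arrays, not the SoundCam data layout:
#   ref_rirs       (P, 10, T)  RIRs measured with a person at P known positions
#   ref_positions  (P, 2)      (x, y) coordinates of those positions
#   empty_rir      (10, T)     RIR of the room with no person present
#   query_rir      (10, T)     RIR with a person at an unknown position
ref_rirs = np.load("reference_rirs.npy")
ref_positions = np.load("reference_xy.npy")
empty_rir = np.load("empty_room_rir.npy")
query_rir = np.load("query_rir.npy")

def rir_feature(rir, baseline):
    """Per-channel energy of the deviation from the empty-room RIR.

    A person mainly perturbs the RIR relative to the empty room, so the
    difference signal is a simple, position-sensitive feature.
    """
    diff = rir - baseline
    return np.log1p(np.sum(diff ** 2, axis=-1))  # one value per channel

query_feat = rir_feature(query_rir, empty_rir)
ref_feats = np.stack([rir_feature(r, empty_rir) for r in ref_rirs])

# Nearest neighbor in feature space gives a crude position estimate.
nearest = np.argmin(np.linalg.norm(ref_feats - query_feat, axis=1))
print("estimated position (x, y):", ref_positions[nearest])
```

A detection variant of the same idea simply thresholds the difference energy against the empty-room measurement instead of searching for the nearest reference position.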
Related papers
- Blind Spatial Impulse Response Generation from Separate Room- and Scene-Specific Information [0.42970700836450487]
Knowledge of the user's real acoustic environment is crucial for rendering virtual sounds that seamlessly blend into the environment.
We show how both room- and position-specific parameters are considered in the final output.
arXiv Detail & Related papers (2024-09-23T12:41:31Z)
- SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field [39.19609821736598]
SPEAR is a continuous receiver-to-receiver acoustic neural warping field for spatial acoustic effects prediction.
We show SPEAR's superiority on synthetic, photo-realistic, and real-world datasets.
arXiv Detail & Related papers (2024-06-16T16:40:26Z)
- Hearing Anything Anywhere [26.415266601469767]
We introduce DiffRIR, a differentiable RIR rendering framework with interpretable parametric models of salient acoustic features of the scene.
This allows us to synthesize novel auditory experiences through the space with any source audio.
We show that our model outperforms state-of-the-art baselines on rendering monaural and binaural RIRs and music at unseen locations.
arXiv Detail & Related papers (2024-06-11T17:56:14Z)
- ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling [57.1025908604556]
An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment.
We propose active acoustic sampling, a new task for efficiently building an environment acoustic model of an unmapped environment.
We introduce ActiveRIR, a reinforcement learning policy that leverages information from audio-visual sensor streams to guide agent navigation and determine optimal acoustic data sampling positions.
arXiv Detail & Related papers (2024-04-24T21:30:01Z)
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [65.79402756995084]
Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities.
RAF is the first dataset to provide densely captured room acoustic data.
arXiv Detail & Related papers (2024-03-27T17:59:56Z)
- Measuring Acoustics with Collaborative Multiple Agents [25.879534979760034]
Two robots are trained to explore the environment's acoustics while being rewarded for wide exploration and accurate prediction.
We show that the robots learn to collaborate and move to explore environment acoustics while minimizing the prediction error.
arXiv Detail & Related papers (2023-10-09T02:58:27Z)
- Self-Supervised Visual Acoustic Matching [63.492168778869726]
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a target acoustic environment.
We propose a self-supervised approach to visual acoustic matching where training samples include only the target scene image and audio.
Our approach jointly learns to disentangle room acoustics and re-synthesize audio into the target environment, via a conditional GAN framework and a novel metric.
arXiv Detail & Related papers (2023-07-27T17:59:59Z)
- Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes [69.03289331433874]
We present an end-to-end audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.
We propose a novel neural-network-based sound propagation method to generate acoustic effects for 3D models of real environments.
arXiv Detail & Related papers (2023-02-02T04:09:23Z)
- SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning [127.1119359047849]
We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments.
It generates highly realistic acoustics for arbitrary sounds captured from arbitrary microphone locations.
SoundSpaces 2.0 is publicly available to facilitate wider research for perceptual systems that can both see and hear.
arXiv Detail & Related papers (2022-06-16T17:17:44Z)
- Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z)
- Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset [0.0]
This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms.
The paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones.
arXiv Detail & Related papers (2021-02-12T11:34:18Z)
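Several of the related works above (e.g., Hearing Anything Anywhere, Few-Shot Audio-Visual Learning of Environment Acoustics) rest on the same relationship that motivates SoundCam: the signal heard at a listener position is the dry source signal convolved with the RIR between source and listener. A minimal sketch of applying one channel of a measured RIR to a mono recording, assuming SciPy and hypothetical file names:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

# Hypothetical inputs: a dry (anechoic) mono recording and one channel of a
# measured room impulse response, both at the same sample rate.
sr_dry, dry = wavfile.read("dry_speech.wav")
sr_rir, rir = wavfile.read("room_rir_ch0.wav")
assert sr_dry == sr_rir, "resample one signal so the sample rates match"

dry = dry.astype(np.float64)
rir = rir.astype(np.float64)

# y = h * x: convolving the dry signal with the RIR yields the signal as it
# would sound at the listener position in that room.
wet = fftconvolve(dry, rir, mode="full")
wet /= np.max(np.abs(wet)) + 1e-12  # normalize to avoid clipping

wavfile.write("wet_speech.wav", sr_dry, (wet * 32767).astype(np.int16))
```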