MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D
Scenes
- URL: http://arxiv.org/abs/2205.09248v1
- Date: Wed, 18 May 2022 23:50:34 GMT
- Title: MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D
Scenes
- Authors: Anton Ratnarajah, Zhenyu Tang, Rohith Chandrashekar Aralikatti, Dinesh
Manocha
- Abstract summary: We propose a mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.
Our method can handle input triangular meshes with arbitrary topologies (2K - 3M triangles).
We show that the acoustic metrics of the IRs predicted from our MESH2IR match the ground truth with less than 10% error.
- Score: 56.946057850725545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a mesh-based neural network (MESH2IR) to generate acoustic impulse
responses (IRs) for indoor 3D scenes represented using a mesh. The IRs are used
to create a high-quality sound experience in interactive applications and audio
processing. Our method can handle input triangular meshes with arbitrary
topologies (2K - 3M triangles). We present a novel training technique to train
MESH2IR using energy decay relief and highlight its benefits. We also show that
training MESH2IR on IRs preprocessed using our proposed technique significantly
improves the accuracy of IR generation. We reduce the non-linearity in the mesh
space by transforming 3D scene meshes to latent space using a graph convolution
network. Our MESH2IR is more than 200 times faster than a geometric acoustic
algorithm on a CPU and can generate more than 10,000 IRs per second on an
NVIDIA GeForce RTX 2080 Ti GPU for a given furnished indoor 3D scene. The
acoustic metrics are used to characterize the acoustic environment. We show
that the acoustic metrics of the IRs predicted from our MESH2IR match the
ground truth with less than 10% error. We also highlight the benefits of
MESH2IR on audio and speech processing applications such as speech
dereverberation and speech separation. To the best of our knowledge, ours is
the first neural-network-based approach to predict IRs from a given 3D scene
mesh in real-time.
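The abstract evaluates generated IRs against ground truth using standard room-acoustic metrics and trains on energy decay relief. As a minimal, self-contained sketch (not the paper's code), the snippet below computes the Schroeder backward-integrated energy decay curve, estimates RT60 from a T20 fit, and applies an IR to dry audio by convolution; the synthetic exponentially decaying IR and the -5/-25 dB fit region are illustrative assumptions.

```python
import numpy as np

def schroeder_edc(ir: np.ndarray) -> np.ndarray:
    """Schroeder backward-integrated energy decay curve in dB (0 dB at t=0)."""
    energy = ir.astype(np.float64) ** 2
    edc = np.cumsum(energy[::-1])[::-1]        # backward integration of IR energy
    edc /= edc[0]                              # normalize to total energy
    return 10.0 * np.log10(np.maximum(edc, 1e-12))

def rt60_from_ir(ir: np.ndarray, fs: int) -> float:
    """Estimate RT60 from the -5 dB to -25 dB decay slope (T20), extrapolated to -60 dB."""
    edc_db = schroeder_edc(ir)
    i5 = np.argmax(edc_db <= -5.0)             # EDC is monotone, so argmax finds first crossing
    i25 = np.argmax(edc_db <= -25.0)
    t = np.arange(len(ir)) / fs
    slope, _ = np.polyfit(t[i5:i25], edc_db[i5:i25], 1)  # dB per second
    return -60.0 / slope

# Synthetic exponentially decaying noise as a stand-in for a generated IR (RT60 ~ 0.5 s)
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
ir = rng.standard_normal(fs) * np.exp(-t * (3.0 * np.log(10) / 0.5))

print(f"Estimated RT60: {rt60_from_ir(ir, fs):.2f} s")

# Applying an IR: reverberant audio = dry audio convolved with the IR
dry = rng.standard_normal(2 * fs)
wet = np.convolve(dry, ir)
```

The same convolution step is how generated IRs are typically used to synthesize reverberant training data for downstream tasks such as dereverberation and separation.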
Related papers
- A Refined 3D Gaussian Representation for High-Quality Dynamic Scene Reconstruction [2.022451212187598]
In recent years, Neural Radiance Fields (NeRF) has revolutionized three-dimensional (3D) reconstruction with its implicit representation.
3D Gaussian Splatting (3D-GS) has departed from the implicit representation of neural networks and instead directly represents scenes as point clouds with Gaussian-shaped distributions.
This paper proposes a refined 3D Gaussian representation for high-quality dynamic scene reconstruction.
Experimental results demonstrate that our method surpasses existing approaches in rendering quality and speed, while significantly reducing the memory usage associated with 3D-GS.
arXiv Detail & Related papers (2024-05-28T07:12:22Z)
- EM-GANSim: Real-time and Accurate EM Simulation Using Conditional GANs for 3D Indoor Scenes [55.2480439325792]
We present a novel machine-learning (ML) approach (EM-GANSim) for real-time electromagnetic (EM) propagation.
In practice, it can compute the signal strength in a few milliseconds at any location in a 3D indoor environment.
arXiv Detail & Related papers (2024-05-27T17:19:02Z)
- N-BVH: Neural ray queries with bounding volume hierarchies [51.430495562430565]
In 3D computer graphics, the bulk of a scene's memory usage is due to polygons and textures.
We devise N-BVH, a neural compression architecture designed to answer arbitrary ray queries in 3D.
Our method provides faithful approximations of visibility, depth, and appearance attributes.
arXiv Detail & Related papers (2024-05-25T13:54:34Z)
- Utilizing Machine Learning and 3D Neuroimaging to Predict Hearing Loss: A Comparative Analysis of Dimensionality Reduction and Regression Techniques [0.0]
We have explored machine learning approaches for predicting hearing loss thresholds from 3D images of the brain's gray matter.
In the first phase, we used a 3D CNN model to reduce the high-dimensional input into a latent space.
In the second phase, we used this model to extract rich features from the input.
arXiv Detail & Related papers (2024-04-30T18:39:41Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes [69.03289331433874]
We present an end-to-end audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.
We propose a novel neural-network-based sound propagation method to generate acoustic effects for 3D models of real environments.
arXiv Detail & Related papers (2023-02-02T04:09:23Z)
- FAST-RIR: Fast neural diffuse room impulse response generator [81.96114823691343]
We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Our FAST-RIR is capable of generating RIRs for a given input reverberation time with an average error of 0.02s.
We show that our proposed FAST-RIR with batch size 1 is 400 times faster than a state-of-the-art diffuse acoustic simulator (DAS) on a CPU.
arXiv Detail & Related papers (2021-10-07T05:21:01Z)
- DONeRF: Towards Real-Time Rendering of Neural Radiance Fields using Depth Oracle Networks [6.2444658061424665]
DONeRF is a dual network design with a depth oracle network as a first step and a locally sampled shading network for ray accumulation.
We are the first to render raymarching-based neural representations at interactive frame rates (15 frames per second at 800x800) on a single GPU.
arXiv Detail & Related papers (2021-03-04T18:55:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.