Binaural Rendering of Ambisonic Signals by Neural Networks
- URL: http://arxiv.org/abs/2211.02301v1
- Date: Fri, 4 Nov 2022 07:57:37 GMT
- Title: Binaural Rendering of Ambisonic Signals by Neural Networks
- Authors: Yin Zhu, Qiuqiang Kong, Junjie Shi, Shilei Liu, Xuzhou Ye, Ju-chiang Wang, Junping Zhang
- Abstract summary: Experimental results show that neural networks outperform the conventional method in objective metrics and achieve comparable subjective metrics.
Our proposed system achieves an SDR of 7.32 and MOSs of 3.83, 3.58, 3.87, 3.58 in quality, timbre, localization, and immersion dimensions.
- Score: 28.056334728309423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binaural rendering of ambisonic signals is of broad interest to virtual
reality and immersive media. Conventional methods often require manually
measured Head-Related Transfer Functions (HRTFs). To address this issue, we
collect a paired ambisonic-binaural dataset and propose a deep learning
framework in an end-to-end manner. Experimental results show that neural
networks outperform the conventional method in objective metrics and achieve
comparable subjective metrics. To validate the proposed framework, we
experimentally explore different settings of the input features, model
structures, output features, and loss functions. Our proposed system achieves
an SDR of 7.32 and MOSs of 3.83, 3.58, 3.87, 3.58 in quality, timbre,
localization, and immersion dimensions.
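The abstract reports its objective result as an SDR (signal-to-distortion ratio). As a rough illustration of what such a figure measures, the sketch below assumes the common definition SDR = 10 log10(||s||^2 / ||s - s_hat||^2) between a reference binaural signal s and a rendered estimate s_hat. The listing does not state the paper's exact SDR formulation, so the function name `sdr_db`, the `eps` term, and the toy signals are all hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Assumed definition (not taken from the paper):
        10 * log10(sum(s^2) / sum((s - s_hat)^2))
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite when either power is exactly zero.
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


# Toy example: one second of a 440 Hz tone at 16 kHz, and an estimate
# whose amplitude is 10% too low.
reference = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))
```

In this toy case the error carries 1% of the reference energy, so the result is about 20 dB. For a two-channel binaural signal, one option would be to compute the value per ear and average; whether the paper does so is not stated here.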
Related papers
- Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios [6.643082745560234]
We propose a novel method for interpreting neural networks, focusing on a convolutional neural network-based receiver model.
The method identifies which unit or units of the model contain the most (or least) information about the channel parameter(s) of interest.
Experiments on link-level simulations demonstrate the method's effectiveness in identifying units that contribute most (and least) to signal-to-noise ratio processing.
arXiv Detail & Related papers (2024-09-25T09:26:19Z)
- Towards a Robust Framework for NeRF Evaluation [11.348562090906576]
We propose a new test framework which isolates the neural rendering network from the Neural Radiance Field (NeRF) pipeline.
We then perform a parametric evaluation by training and evaluating the NeRF on an explicit radiance field representation.
Our approach offers the potential to create a comparative objective evaluation framework for NeRF methods.
arXiv Detail & Related papers (2023-05-29T13:30:26Z)
- On Neural Architectures for Deep Learning-based Source Separation of Co-Channel OFDM Signals [104.11663769306566]
We study the single-channel source separation problem involving orthogonal frequency-division multiplexing (OFDM) signals.
We propose critical domain-informed modifications to the network parameterization, based on insights from OFDM structures.
arXiv Detail & Related papers (2023-03-11T16:29:13Z)
- Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z)
- Domain Adaptation: the Key Enabler of Neural Network Equalizers in Coherent Optical Systems [1.4549914190846531]
We introduce the domain adaptation and randomization approach for calibrating neural network-based equalizers for real transmissions.
The approach reduces the training process by up to 99%.
arXiv Detail & Related papers (2022-02-25T13:46:33Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.
We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z)
- Score-informed Networks for Music Performance Assessment [64.12728872707446]
Deep neural network-based methods that incorporate score information into music performance assessment (MPA) models have not yet been investigated.
We introduce three different models capable of score-informed performance assessment.
arXiv Detail & Related papers (2020-08-01T07:46:24Z)
- Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in noisy real-world environments.
We implement this algorithm in a real-time robotic system with a microphone array.
The experimental results show a mean azimuth error of 13 degrees, which surpasses the accuracy of the other biologically plausible neuromorphic approach for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z)
- Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks [10.089520556398574]
We present a new single sound source DOA estimation and tracking system based on the SRP-PHAT algorithm and a three-dimensional Convolutional Neural Network.
It uses SRP-PHAT power maps as input features of a fully convolutional causal architecture that uses 3D convolutional layers to accurately perform the tracking of a sound source.
arXiv Detail & Related papers (2020-06-16T09:07:33Z)