Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs
- URL: http://arxiv.org/abs/2203.16940v1
- Date: Thu, 31 Mar 2022 10:52:19 GMT
- Title: Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs
- Authors: David Diaz-Guerra, Antonio Miguel, Jose R. Beltran
- Abstract summary: We present a new model for Direction of Arrival (DOA) estimation of sound sources based on an Icosahedral Convolutional Neural Network (CNN).
This icosahedral CNN is equivariant to the 60 rotational symmetries of the icosahedron, which represent a good approximation of the continuous space of spherical rotations.
We prove that using models that fit the equivariances of the problem allows us to outperform other state-of-the-art models with a lower computational cost and more robustness.
- Score: 10.089520556398574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a new model for Direction of Arrival (DOA)
estimation of sound sources based on an Icosahedral Convolutional Neural
Network (CNN) applied over SRP-PHAT power maps computed from the signals
received by a microphone array. This icosahedral CNN is equivariant to the 60
rotational symmetries of the icosahedron, which represent a good approximation
of the continuous space of spherical rotations, and can be implemented using
standard 2D convolutional layers, having a lower computational cost than most
of the spherical CNNs. In addition, instead of using fully connected layers
after the icosahedral convolutions, we propose a new soft-argmax function that
can be seen as a differentiable version of the argmax function and allows us to
solve the DOA estimation as a regression problem interpreting the output of the
convolutional layers as a probability distribution. We prove that using models
that fit the equivariances of the problem allows us to outperform other
state-of-the-art models with a lower computational cost and more robustness,
obtaining root mean square localization errors lower than 10° even in
scenarios with a reverberation time $T_{60}$ of 1.5 s.
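The soft-argmax idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `soft_argmax_direction`, the sharpness parameter `beta`, and the six-point toy grid are assumptions made for illustration, and the paper's actual formulation over the icosahedral grid may differ. The sketch treats the per-point scores as a probability distribution via a softmax and returns the expected direction, which is differentiable with respect to the scores.

```python
import numpy as np

def soft_argmax_direction(scores, directions, beta=1.0):
    """Differentiable stand-in for argmax over a grid of candidate directions.

    scores:     (N,) real-valued outputs, one per grid point (hypothetical input)
    directions: (N, 3) unit vectors of the grid points
    beta:       assumed sharpness factor; larger values approach the hard argmax
    """
    scores = np.asarray(scores, dtype=float)
    directions = np.asarray(directions, dtype=float)
    # Softmax turns the scores into a probability distribution over the grid.
    z = beta * (scores - scores.max())  # subtract the max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    # Expected direction under that distribution, projected back onto the sphere.
    mean = probs @ directions
    # Guard against a zero-length mean (e.g. uniform scores on a symmetric grid).
    return mean / max(np.linalg.norm(mean), 1e-12)

# Toy grid: the six axis-aligned unit vectors (not an icosahedral grid).
grid = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=float,
)
scores = np.array([0.1, 0.0, 5.0, 0.0, 0.2, 0.0])
doa = soft_argmax_direction(scores, grid, beta=10.0)
print(doa)  # close to the +y axis, where the score peaks
```

Because the output is a weighted average rather than a hard selection, it can fall between grid points, which is what makes a regression loss on the estimated direction possible.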
Related papers
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
- A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from multi-channel data of the true array manifold matrix.
We train a CNN in the low-SNR regime to predict DoAs across all SNRs.
Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
- Lightning-Fast Gravitational Wave Parameter Inference through Neural Amortization [6.810835072367285]
Latest advances in neural simulation-based inference can speed up the inference time by up to three orders of magnitude.
We find that our model correctly estimates credible intervals for the parameters of simulated gravitational waves.
arXiv Detail & Related papers (2020-10-24T16:48:24Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Baryon acoustic oscillations reconstruction using convolutional neural networks [1.9262162668141078]
We propose a new scheme to reconstruct the baryon acoustic oscillations (BAO) signal, which contains key cosmological information, based on deep convolutional neural networks (CNN).
We find that the network trained in one cosmology is able to reconstruct BAO peaks in the others, i.e. recovering information lost to non-linearity independent of cosmology.
arXiv Detail & Related papers (2020-02-24T13:18:31Z)
- Gravitational-wave parameter estimation with autoregressive neural network flows [0.0]
We introduce the use of autoregressive normalizing flows for rapid likelihood-free inference of binary black hole system parameters from gravitational-wave data with deep neural networks.
A normalizing flow is an invertible mapping on a sample space that can be used to induce a transformation from a simple probability distribution to a more complex one.
We build a more powerful latent variable model by incorporating autoregressive flows within the variational autoencoder framework.
arXiv Detail & Related papers (2020-02-18T15:44:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.