HRTF Interpolation using a Spherical Neural Process Meta-Learner
- URL: http://arxiv.org/abs/2310.13430v1
- Date: Fri, 20 Oct 2023 11:41:54 GMT
- Title: HRTF Interpolation using a Spherical Neural Process Meta-Learner
- Authors: Etienne Thuillier and Craig Jin and Vesa Välimäki
- Abstract summary: We introduce a Convolutional Neural Process meta-learner specialized in HRTF error correction.
A generic population-mean HRTF forms the initial estimates prior to corrections.
The trained model achieves up to 3 dB relative error reduction compared to state-of-the-art methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several individualization methods have recently been proposed to estimate a
subject's Head-Related Transfer Function (HRTF) using convenient input
modalities such as anthropometric measurements or pinnae photographs. There
exists a need for adaptively correcting the estimation error committed by such
methods using a few data point samples from the subject's HRTF, acquired using
acoustic measurements or perceptual feedback. To this end, we introduce a
Convolutional Conditional Neural Process meta-learner specialized in HRTF error
interpolation. In particular, the model includes a Spherical Convolutional
Neural Network component to accommodate the spherical geometry of HRTF data. It
also exploits potential symmetries between the HRTF's left and right channels
about the median axis. In this work, we evaluate the proposed model's
performance purely on time-aligned spectrum interpolation grounds under a
simplified setup where a generic population-mean HRTF forms the initial
estimates prior to corrections instead of individualized ones. The trained
model achieves up to 3 dB relative error reduction compared to state-of-the-art
interpolation methods despite being trained using only 85 subjects. This
improvement translates up to nearly a halving of the data point count required
to achieve comparable accuracy, in particular from 50 to 28 points to reach an
average of -20 dB relative error per interpolated feature. Moreover, we show
that the trained model provides well-calibrated uncertainty estimates.
Accordingly, such estimates can inform the sequential decision problem of
acquiring as few correcting HRTF data points as needed to meet a desired level
of HRTF individualization accuracy.
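The correction setup described in the abstract can be illustrated with a deliberately simplified sketch: the subject's HRTF magnitude estimate starts from a population mean, and the residual error observed at a few measured directions is interpolated across the sphere. A plain Gaussian kernel smoother over great-circle distance stands in for the paper's Spherical ConvCNP here; all function names, the kernel choice, and its width are illustrative assumptions, not the authors' model:

```python
import numpy as np

def sph_to_cart(az, el):
    """Azimuth/elevation (radians) to unit direction vectors, shape (n, 3)."""
    return np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)], axis=-1)

def great_circle_dist(a, b):
    """Angular distance (radians) between unit vectors a (n,3) and b (m,3)."""
    cos = np.clip(a @ b.T, -1.0, 1.0)
    return np.arccos(cos)

def correct_hrtf(mean_hrtf, query_dirs, ctx_dirs, ctx_vals, width=0.5):
    """Correct a population-mean HRTF magnitude (dB) at query directions
    by kernel-smoothing the residual (measured minus mean) observed at a
    few measured context directions."""
    residual = ctx_vals - mean_hrtf(ctx_dirs)      # error at context points
    d = great_circle_dist(query_dirs, ctx_dirs)    # (n_query, n_ctx)
    w = np.exp(-(d / width) ** 2)                  # Gaussian kernel weights
    w /= w.sum(axis=1, keepdims=True)              # normalize per query point
    return mean_hrtf(query_dirs) + w @ residual    # mean + interpolated error
```

As in the paper's setup, a smoother residual field needs fewer context points for a given accuracy, which is why better interpolation directly reduces the number of correcting measurements required.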
Related papers
- HRTF Estimation using a Score-based Prior [20.62078965099636]
We present a head-related transfer function estimation method based on a score-based diffusion model.
The HRTF is estimated in reverberant environments using natural excitation signals, e.g. human speech.
We show that the diffusion prior can account for the large variability of high-frequency content in HRTFs.
arXiv Detail & Related papers (2024-10-02T14:00:41Z) - SPDE priors for uncertainty quantification of end-to-end neural data
assimilation schemes [4.213142548113385]
Recent advances in the deep learning community make it possible to address this problem with neural architectures that embed a variational data assimilation framework.
In this work, we draw from SPDE-based Processes to estimate prior models able to handle non-stationary covariances in both space and time.
Our neural variational scheme is modified to embed an augmented state formulation, estimating both the state and the SPDE parametrization.
arXiv Detail & Related papers (2024-02-02T19:18:12Z) - DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z) - Mitigating Dataset Bias by Using Per-sample Gradient [9.290757451344673]
We propose PGD (Per-sample Gradient-based Debiasing), that comprises three steps: training a model on uniform batch sampling, setting the importance of each sample in proportion to the norm of the sample gradient, and training the model using importance-batch sampling.
Compared with existing baselines for various synthetic and real-world datasets, the proposed method showed state-of-the-art accuracy for the classification task.
arXiv Detail & Related papers (2022-05-31T11:41:02Z) - Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose
Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z) - Optimal Model Averaging: Towards Personalized Collaborative Learning [0.0]
In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node.
One such approach is weighted averaging between a locally trained model and the global model.
We find that there is always some positive amount of model averaging that reduces the expected squared error compared to the local model.
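That claim can be checked numerically with a toy one-parameter example (the bias and noise figures below are made up for illustration): an unbiased but noisy local estimate is averaged with a biased but noise-free global one, and a sweep over the averaging weight shows that some positive weight beats the purely local model.

```python
import numpy as np

# Estimator: est = (1 - a) * local + a * global, with a in [0, 1].
# a = 0 is the purely local model; a = 1 is the purely global model.
rng = np.random.default_rng(1)
theta = 2.0                                          # true parameter for this node
global_model = 1.5                                   # biased global estimate (bias 0.5)
local = theta + rng.normal(0.0, 1.0, size=100_000)   # unbiased local draws, variance 1

weights = np.linspace(0.0, 1.0, 101)
mse = [np.mean(((1 - a) * local + a * global_model - theta) ** 2)
       for a in weights]
best = weights[int(np.argmin(mse))]                  # best weight is strictly positive
```

Analytically, the MSE here is (1 - a)^2 * 1 + (0.5 * a)^2, which is minimized at a = 0.8 rather than at a = 0, matching the paper's point that some positive amount of averaging always helps.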
arXiv Detail & Related papers (2021-10-25T13:33:20Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train inference models on inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Locally Aware Piecewise Transformation Fields for 3D Human Mesh
Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z) - Fast Uncertainty Quantification for Deep Object Pose Estimation [91.09217713805337]
Deep learning-based object pose estimators are often unreliable and overconfident.
In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation.
arXiv Detail & Related papers (2020-11-16T06:51:55Z) - On Minimum Word Error Rate Training of the Hybrid Autoregressive
Transducer [40.63693071222628]
We study the minimum word error rate (MWER) training of the Hybrid Autoregressive Transducer (HAT).
From experiments with around 30,000 hours of training data, we show that MWER training can improve the accuracy of HAT models.
arXiv Detail & Related papers (2020-10-23T21:16:30Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.