Related papers: Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening

URL: http://arxiv.org/abs/2509.02571v1
Date: Wed, 20 Aug 2025 09:29:14 GMT
Title: Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening
Authors: Diego Di Carlo, Koyama Shoichi, Nugraha Aditya Arie, Fontaine Mathieu, Bando Yoshiaki, Yoshii Kazuyoshi
Abstract summary: This paper investigates continuous representations of steering vectors over frequency and position of microphone and source for augmented listening. We propose a physics-aware composite kernel that models the directional incoming waves and the subsequent scattering effect.
Score: 0.7778724782015985
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper investigates continuous representations of steering vectors over frequency and position of microphone and source for augmented listening (e.g., spatial filtering and binaural rendering) with precise control of the sound field perceived by the user. Steering vectors have typically been used for representing the spatial characteristics of the sound field as a function of the listening position. The basic algebraic representation of steering vectors, which assumes an idealized environment, cannot deal with the scattering effect of the sound field. One may thus collect a discrete set of real steering vectors measured in dedicated facilities and super-resolve (i.e., upsample) them. Recently, physics-aware deep learning methods have been effectively used for this purpose. Such deterministic super-resolution, however, suffers from overfitting due to the non-uniform uncertainty over the measurement space. To solve this problem, we integrate an expressive representation based on the neural field (NF) into the principled probabilistic framework based on the Gaussian process (GP). Specifically, we propose a physics-aware composite kernel that models the directional incoming waves and the subsequent scattering effect. Our comprehensive comparative experiment showed the effectiveness of the proposed method under data insufficiency conditions. In downstream tasks such as speech enhancement and binaural rendering using the simulated data of the SPEAR challenge, the oracle performances were attained with less than ten times fewer measurements.
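
As a rough illustration of the idea in the abstract (GP regression with a kernel built as a sum of two parts, and uncertainty that varies over the measurement space), the following is a minimal NumPy sketch. It is not the paper's method: the periodic and RBF kernels, their weights and length scales, the function names, and the toy steering response `toy_steering` are all assumptions chosen for illustration, and no neural field is involved.

```python
import numpy as np


def periodic_kernel(a, b, length_scale=1.0):
    """Smooth 2*pi-periodic kernel over azimuth (assumed stand-in for a directional part)."""
    d = a[:, None] - b[None, :]
    return np.exp(-2.0 * np.sin(d / 2.0) ** 2 / length_scale**2)


def rbf_kernel(a, b, length_scale=0.5):
    """Local RBF kernel (assumed stand-in for a residual, scattering-like part)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def composite_kernel(a, b):
    """Sum of the two kernels; the 0.1 weight is an arbitrary illustrative choice."""
    return periodic_kernel(a, b) + 0.1 * rbf_kernel(a, b)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-4):
    """Standard GP posterior mean and pointwise variance via a Cholesky factor.

    A real-valued kernel is applied to complex targets, which treats the real
    and imaginary parts as independent GPs sharing the same kernel.
    """
    K = composite_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = composite_kernel(x_test, x_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    prior_var = np.diag(composite_kernel(x_test, x_test))
    var = np.maximum(prior_var - np.sum(v**2, axis=0), 0.0)
    return mean, var


def toy_steering(theta, kr=2.0):
    """Hypothetical response: free-field phase term plus a small smooth perturbation."""
    return np.exp(-1j * kr * np.cos(theta)) + 0.2 * np.exp(3j * theta)


# Measurements cover only half of the circle, so uncertainty is non-uniform.
theta_train = np.linspace(0.0, np.pi, 12)
theta_test = np.array([np.pi / 2, 3 * np.pi / 2])  # inside vs. outside the measured arc
mean, var = gp_posterior(theta_train, toy_steering(theta_train), theta_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

In this toy setup the posterior variance at the unmeasured angle (3π/2) comes out larger than at the angle inside the measured arc (π/2), which is the kind of position-dependent uncertainty a deterministic super-resolution model does not report.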

Related papers

Gradient-Enhanced Partitioned Gaussian Processes for Real-Time Quadrotor Dynamics Modeling [3.0132217482597277]
We present a quadrotor dynamics Gaussian Process (GP) with information that achieves real-time inference via state-space partitioning and approximation. We generate a training dataset that captures aerodynamic effects, such as rotor-rotor interactions and apparent wind direction. This framework provides an efficient foundation for real-time aerodynamic prediction and control algorithms in complex and unsteady environments.
arXiv Detail & Related papers (2026-02-13T00:00:51Z)
Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance [14.16697537117357]
We present a novel strategy utilizing a low-complexity tracking algorithm in the form of a particle filter instead. We show how the autoregressive interplay between both algorithms drastically improves tracking accuracy and leads to strong enhancement performance.
arXiv Detail & Related papers (2025-07-03T16:54:56Z)
Collaborative Edge AI Inference over Cloud-RAN [37.3710464868215]
A cloud radio access network (Cloud-RAN) based collaborative edge AI inference architecture is proposed. Specifically, geographically distributed devices capture real-time noise-corrupted sensory data samples and extract the noisy local feature vectors. We allow each remote radio head (RRH) to receive local feature vectors from all devices over the same resource blocks simultaneously by leveraging an over-the-air computation (AirComp) technique. These aggregated feature vectors are quantized and transmitted to a central processor for further aggregation and downstream inference tasks.
arXiv Detail & Related papers (2024-04-09T04:26:16Z)
ELUQuant: Event-Level Uncertainty Quantification in Deep Inelastic Scattering [0.0]
We introduce a physics-informed Bayesian Neural Network (BNN) with flow approximated posteriors for detailed uncertainty quantification (UQ) at the physics event-level. Applied to Deep Inelastic Scattering (DIS) events, our model effectively extracts the kinematic variables $x$, $Q^2$, and $y$. This detailed description of the underlying uncertainty proves invaluable for decision-making, especially in tasks like event filtering.
arXiv Detail & Related papers (2023-10-04T15:50:05Z)
Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields [61.07542274267568]
This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene. Driven by the unique properties of RIR, we design a temporal correlation module and multi-scale energy decay criterion. Experimental results show that NACF outperforms existing field-based methods by a notable margin.
arXiv Detail & Related papers (2023-09-27T19:50:50Z)
Score-based Diffusion Models in Function Space [137.70916238028306]
Diffusion models have recently emerged as a powerful framework for generative modeling. This work introduces a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space. We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks [76.830358429947]
Impulse response estimation in high noise and in-the-wild settings is a challenging problem. We propose a novel framework for parameterizing and estimating impulse responses based on recent advances in neural representation learning.
arXiv Detail & Related papers (2022-02-07T18:57:23Z)
Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations. We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models. We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise. Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
Using deep learning to understand and mitigate the qubit noise environment [0.0]
We propose to address the challenge of extracting accurate noise spectra from time-dynamics measurements on qubits. We demonstrate a neural network based methodology that allows for extraction of the noise spectrum associated with any qubit surrounded by an arbitrary bath. Our results can be applied to a wide range of qubit platforms and provide a framework for improving qubit performance.
arXiv Detail & Related papers (2020-05-03T17:13:14Z)
Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals. Two main challenges are the complex acoustic environment and the real-time processing requirement. We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from a multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.