Instabilities in Convnets for Raw Audio
- URL: http://arxiv.org/abs/2309.05855v4
- Date: Fri, 26 Apr 2024 08:25:12 GMT
- Title: Instabilities in Convnets for Raw Audio
- Authors: Daniel Haider, Vincent Lostanlen, Martin Ehler, Peter Balazs
- Abstract summary: We present a theory of large deviations for the energy response of FIR filterbanks with random Gaussian weights.
We find that deviations worsen for large filters and locally periodic input signals.
Numerical simulations align with our theory and suggest that the condition number of a convolutional layer follows a logarithmic scaling law.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: What makes waveform-based deep learning so hard? Despite numerous attempts at training convolutional neural networks (convnets) for filterbank design, they often fail to outperform hand-crafted baselines. These baselines are linear time-invariant systems: as such, they can be approximated by convnets with wide receptive fields. Yet, in practice, gradient-based optimization leads to suboptimal approximations. In our article, we approach this phenomenon from the perspective of initialization. We present a theory of large deviations for the energy response of FIR filterbanks with random Gaussian weights. We find that deviations worsen for large filters and locally periodic input signals, which are both typical for audio signal processing applications. Numerical simulations align with our theory and suggest that the condition number of a convolutional layer follows a logarithmic scaling law between the number and length of the filters, which is reminiscent of discrete wavelet bases.
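The behavior described in the abstract is easy to probe numerically. The sketch below (an illustration under our own modeling assumptions, not the authors' code) draws i.i.d. Gaussian FIR filters, computes the filterbank's energy response as the sum of squared magnitude spectra over all filters, and reports its condition number (max over min of the energy response) as the filter length grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def filterbank_condition_number(num_filters, filter_length, fft_size=4096):
    """Condition number of a random Gaussian FIR filterbank, computed
    from its energy response (Littlewood-Paley sum over all filters)."""
    w = rng.standard_normal((num_filters, filter_length))
    spectra = np.abs(np.fft.rfft(w, n=fft_size, axis=1)) ** 2
    energy = spectra.sum(axis=0)  # energy response at each frequency bin
    return energy.max() / energy.min()

# Hold the number of filters fixed and grow the filter length
meds = {}
for T in (16, 64, 256, 1024):
    meds[T] = np.median([filterbank_condition_number(40, T) for _ in range(20)])
    print(f"T={T:5d}  median condition number ~ {meds[T]:.2f}")
```

With the filter count held at 40, the median condition number increases with filter length, consistent with the finding that deviations worsen for large filters.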
Related papers
- Fourier PINNs: From Strong Boundary Conditions to Adaptive Fourier Bases [22.689531776611084]
We study a strong Boundary Condition (BC) version of PINNs for Dirichlet BCs.
We find that strong BC PINNs can better learn the amplitudes of high-frequency components of the target solutions.
We propose Fourier PINNs -- a simple, general, yet powerful method that augments PINNs with pre-specified, dense Fourier bases.
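As a toy illustration of why a dense Fourier basis helps with high-frequency content (a sketch of the underlying idea, not the Fourier PINN architecture itself): once the basis contains the right frequency, even a plain least-squares fit recovers a high-frequency target to near machine precision:

```python
import numpy as np

def fourier_features(x, num_freqs):
    """Dense Fourier basis [1, sin(k*pi*x), cos(k*pi*x)], k = 1..num_freqs."""
    k = np.arange(1, num_freqs + 1)
    return np.hstack([np.ones((len(x), 1)),
                      np.sin(np.pi * x[:, None] * k),
                      np.cos(np.pi * x[:, None] * k)])

x = np.linspace(0.0, 2.0, 400)
target = np.sin(12 * np.pi * x)          # high-frequency component

Phi = fourier_features(x, num_freqs=16)  # basis includes k = 12
coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)
err = np.max(np.abs(Phi @ coef - target))
print(f"max fit error with dense Fourier basis: {err:.2e}")
```

Here the target lies exactly in the span of one sine column; the PINN variant augments the network input with such features rather than fitting them linearly.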
arXiv Detail & Related papers (2024-10-04T15:10:22Z)
- Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z)
- Fitting Auditory Filterbanks with Multiresolution Neural Networks [4.944919495794613]
We introduce a neural audio model named multiresolution neural network (MuReNN).
The key idea behind MuReNN is to train separate convolutional operators over the octave subbands of a discrete wavelet transform (DWT).
For a given real-world dataset, we fit the magnitude response of MuReNN to that of a well-established auditory filterbank.
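A minimal sketch of this idea (illustrative only, using a hand-rolled Haar DWT and random stand-in kernels rather than the authors' trained model): split the signal into octave subbands, then apply an independent convolution to each subband:

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level Haar DWT: returns detail subbands (finest first) plus
    the final approximation. len(x) must be divisible by 2**levels."""
    subbands, approx = [], x
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        subbands.append((even - odd) / np.sqrt(2))  # highpass detail
        approx = (even + odd) / np.sqrt(2)          # lowpass approximation
    subbands.append(approx)
    return subbands

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)
subbands = haar_dwt(x, levels=4)

# MuReNN-style: an independent conv kernel per octave subband
# (random stand-ins here; in the paper these are trained)
kernels = [rng.standard_normal(9) for _ in subbands]
outputs = [np.convolve(s, k, mode="same") for s, k in zip(subbands, kernels)]
for i, s in enumerate(subbands):
    print(f"subband {i}: {len(s)} samples")
```

Because the Haar DWT is orthonormal, the subbands conserve the input energy, so each octave operator works on a well-conditioned representation.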
arXiv Detail & Related papers (2023-07-25T21:20:12Z)
- On the Shift Invariance of Max Pooling Feature Maps in Convolutional Neural Networks [0.0]
Subsampled convolutions with Gabor-like filters are prone to aliasing, causing sensitivity to small input shifts.
We highlight the crucial role played by the filter's frequency and orientation in achieving stability.
We experimentally validate our theory by considering a deterministic feature extractor based on the dual-tree complex wavelet packet transform.
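The shift sensitivity of subsampled Gabor-like convolutions can be illustrated with a small experiment (a rough sketch under our own assumptions, not the paper's dual-tree wavelet setup): shift the input by one sample and measure how much the strided output changes for a low-frequency versus a near-Nyquist filter:

```python
import numpy as np

def gabor(length, freq, sigma):
    """Real Gabor-like filter: a cosine at `freq` cycles/sample under a Gaussian window."""
    t = np.arange(length) - length // 2
    return np.cos(2 * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))

def subsampled_response(x, g, stride):
    return np.convolve(x, g, mode="valid")[::stride]

rng = np.random.default_rng(2)
x = rng.standard_normal(2048)
stride = 8

devs = {}
for freq in (0.02, 0.45):  # well below vs. near the Nyquist frequency
    g = gabor(65, freq, sigma=12.0)
    y0 = subsampled_response(x, g, stride)
    y1 = subsampled_response(np.roll(x, 1), g, stride)
    devs[freq] = np.linalg.norm(y1 - y0) / np.linalg.norm(y0)
    print(f"filter freq {freq:.2f}: relative change under 1-sample shift = {devs[freq]:.2f}")
```

For a narrowband filter at frequency f, a one-sample shift rotates the phase by 2*pi*f, so the high-frequency channel is far less stable, matching the role of filter frequency highlighted above.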
arXiv Detail & Related papers (2022-09-19T08:15:30Z)
- Computational Doob's h-transforms for Online Filtering of Discretely Observed Diffusions [65.74069050283998]
We propose a computational framework to approximate Doob's $h$-transforms.
The proposed approach can be orders of magnitude more efficient than state-of-the-art particle filters.
arXiv Detail & Related papers (2022-06-07T15:03:05Z)
- Deep Learning for the Benes Filter [91.3755431537592]
We present a new numerical method based on the mesh-free neural network representation of the density of the solution of the Benes model.
We discuss the role of nonlinearity in the filtering model equations for the choice of the domain of the neural network.
arXiv Detail & Related papers (2022-03-09T14:08:38Z)
- The Pseudo Projection Operator: Applications of Deep Learning to Projection Based Filtering in Non-Trivial Frequency Regimes [5.632784019776093]
We introduce a PO-neural network hybrid model, the Pseudo Projection Operator (PPO), which leverages a neural network to perform frequency selection.
We compare the filtering capabilities of a PPO, PO, and denoising autoencoder (DAE) on the University of Rochester Multi-Modal Music Performance dataset.
In the majority of experiments, the PPO outperforms both the PO and DAE.
arXiv Detail & Related papers (2021-11-13T16:09:14Z)
- Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes [71.23286211775084]
We propose an adaptive low-pass filter based on Gaussian process regression.
We show that the estimation error of the proposed method is uniformly bounded.
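A rough sketch of the idea (illustrative only; the paper's method and its uniform error bounds are more involved): fit a GP with an RBF kernel to a sliding window of noisy samples and output the posterior mean at the current time:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / lengthscale) ** 2)

def gp_sliding_filter(t, y, window, lengthscale, noise_var):
    """At each step, fit a GP to the most recent `window` samples and
    output the posterior mean at the current time (a causal smoother)."""
    out = np.empty_like(y)
    for i in range(len(y)):
        lo = max(0, i - window + 1)
        tw, yw = t[lo:i + 1], y[lo:i + 1]
        K = rbf_kernel(tw, tw, lengthscale) + noise_var * np.eye(len(tw))
        k = rbf_kernel(t[i:i + 1], tw, lengthscale)[0]
        out[i] = k @ np.linalg.solve(K, yw)
    return out

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2 * np.pi * 2 * t)
noisy = clean + 0.3 * rng.standard_normal(len(t))
smoothed = gp_sliding_filter(t, noisy, window=20, lengthscale=0.05, noise_var=0.09)

noise_mse = np.mean((noisy - clean) ** 2)
filt_mse = np.mean((smoothed - clean) ** 2)
print(f"raw MSE {noise_mse:.4f} -> filtered MSE {filt_mse:.4f}")
```

The lengthscale acts as the cutoff knob: components varying faster than the kernel's lengthscale are attenuated, which is what makes the filter adaptive when the hyperparameters are re-estimated per window.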
arXiv Detail & Related papers (2021-11-05T17:06:59Z)
- Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of sign function in the Fourier frequency domain using the combination of sine functions for training BNNs.
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
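The underlying approximation is the classical Fourier series of the square wave, sign(x) = (4/pi) * sum over odd k of sin(k*x)/k on (-pi, pi); its term-by-term derivative gives a smooth surrogate gradient. A sketch of that series (not the paper's training code):

```python
import numpy as np

def sign_fourier(x, num_terms):
    """Truncated sine series of the square wave: approximates sign(x) on (-pi, pi)."""
    k = 2 * np.arange(num_terms) + 1          # odd harmonics 1, 3, 5, ...
    return 4.0 / np.pi * (np.sin(np.outer(x, k)) / k).sum(axis=1)

def sign_fourier_grad(x, num_terms):
    """Term-by-term derivative of the series: a smooth surrogate for d/dx sign(x)."""
    k = 2 * np.arange(num_terms) + 1
    return 4.0 / np.pi * np.cos(np.outer(x, k)).sum(axis=1)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(np.round(sign_fourier(x, 200), 3))
print(np.round(sign_fourier_grad(x, 200), 3))
```

Unlike the straight-through estimator, the surrogate gradient concentrates near the sign function's discontinuity at zero, where flipping a weight actually changes the binarized network.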
arXiv Detail & Related papers (2021-03-01T08:25:26Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and accepts no responsibility for any consequences arising from its use.