Resampling Filter Design for Multirate Neural Audio Effect Processing
- URL: http://arxiv.org/abs/2501.18470v1
- Date: Thu, 30 Jan 2025 16:44:49 GMT
- Title: Resampling Filter Design for Multirate Neural Audio Effect Processing
- Authors: Alistair Carson, Vesa Välimäki, Alec Wright, Stefan Bilbao
- Abstract summary: We explore the use of signal resampling at the input and output of the neural network as an alternative solution.
We show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than the previously proposed model adjustment method.
- Score: 9.149661171430257
- Abstract: Neural networks have become ubiquitous in audio effects modelling, especially for guitar amplifiers and distortion pedals. One limitation of such models is that the sample rate of the training data is implicitly encoded in the model weights and therefore not readily adjustable at inference. Recent work explored modifications to recurrent neural network architecture to approximate a sample rate independent system, enabling audio processing at a rate that differs from the original training rate. This method works well for integer oversampling and can reduce aliasing caused by nonlinear activation functions. For small fractional changes in sample rate, fractional delay filters can be used to approximate sample rate independence, but in some cases this method fails entirely. Here, we explore the use of signal resampling at the input and output of the neural network as an alternative solution. We investigate several resampling filter designs and show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than the previously proposed model adjustment method, with many fewer operations per sample and less than one millisecond of latency at typical audio rates. Furthermore, we investigate interpolation and decimation filters for the task of integer oversampling and show that cascaded half-band IIR and FIR designs can be used in conjunction with the model adjustment method to reduce aliasing in a range of distortion effect models.
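The abstract gives no design parameters, so the following is only a rough sketch of how the Kaiser window FIR stage of such a resampler might be designed, using Kaiser's standard empirical formulas for the window shape and filter length. The sample rate, cutoff, transition width, and stopband attenuation are assumptions chosen for illustration and are not taken from the paper; the function names are hypothetical, and the half-band IIR stage is not shown.

```python
import math


def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_params(atten_db, trans_width_hz, fs_hz):
    """Kaiser's empirical estimates of the window shape (beta) and tap count."""
    if atten_db > 50.0:
        beta = 0.1102 * (atten_db - 8.7)
    elif atten_db >= 21.0:
        beta = 0.5842 * (atten_db - 21.0) ** 0.4 + 0.07886 * (atten_db - 21.0)
    else:
        beta = 0.0
    delta_omega = 2.0 * math.pi * trans_width_hz / fs_hz  # radians per sample
    order = math.ceil((atten_db - 7.95) / (2.285 * delta_omega))
    num_taps = order + 1
    if num_taps % 2 == 0:
        num_taps += 1  # odd length gives an integer group delay
    return beta, num_taps


def kaiser_lowpass(cutoff_hz, fs_hz, atten_db, trans_width_hz):
    """Linear-phase lowpass FIR: windowed sinc, normalised to unity DC gain."""
    beta, num_taps = kaiser_params(atten_db, trans_width_hz, fs_hz)
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz  # cutoff in cycles per sample
    taps = []
    for i in range(num_taps):
        t = i - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        r = t / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [h / dc_gain for h in taps]


# Illustrative values only (assumptions, not from the paper).
FS_HZ = 96000.0
taps = kaiser_lowpass(cutoff_hz=22000.0, fs_hz=FS_HZ, atten_db=80.0, trans_width_hz=4000.0)
latency_ms = 1000.0 * ((len(taps) - 1) / 2) / FS_HZ  # group delay of a linear-phase FIR
print(len(taps), "taps, latency", round(latency_ms, 3), "ms")
```

The latency of a linear-phase FIR filter is half its length in samples, so a narrower transition band or higher attenuation lengthens the filter and adds delay in direct proportion.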
Related papers
- Neural Flow Samplers with Shortcut Models [19.81513273510523]
Flow-based samplers generate samples by learning a velocity field that satisfies the continuity equation.
While importance sampling provides an approximation, it suffers from high variance.
arXiv Detail & Related papers (2025-02-11T07:55:41Z)
- Residual Channel Boosts Contrastive Learning for Radio Frequency Fingerprint Identification [17.98760668117099]
This paper proposes a residual channel-based data augmentation strategy for Radio Frequency Fingerprint Identification (RFFI).
We show that our method significantly enhances both feature extraction ability and generalization while requiring fewer samples and less time.
arXiv Detail & Related papers (2024-12-12T02:48:20Z)
- Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler [103.97487121678276]
Filter pruning simultaneously accelerates the computation and reduces the memory overhead of CNNs.
We propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning.
arXiv Detail & Related papers (2023-07-01T02:28:41Z)
- Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion [85.54515118077825]
This paper proposes a linear diffusion model (LinDiff) based on an ordinary differential equation to simultaneously reach fast inference and high sample quality.
To reduce computational complexity, LinDiff employs a patch-based processing approach that partitions the input signal into small patches.
Our model can synthesize speech of a quality comparable to that of autoregressive models with faster synthesis speed.
arXiv Detail & Related papers (2023-06-09T07:02:43Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can reduce the number of floating point operations by more than half for off-the-shelf audio neural networks.
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
- ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech [63.780196620966905]
We propose ProDiff, a progressive fast diffusion model for high-quality text-to-speech.
ProDiff parameterizes the denoising model by directly predicting clean data to avoid distinct quality degradation in accelerating sampling.
Our evaluation demonstrates that ProDiff needs only 2 iterations to synthesize high-fidelity mel-spectrograms.
ProDiff enables a sampling speed 24x faster than real time on a single NVIDIA 2080Ti GPU.
arXiv Detail & Related papers (2022-07-13T17:45:43Z)
- A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices [13.49645012479288]
A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper.
The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter.
Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs).
arXiv Detail & Related papers (2022-04-06T11:08:28Z)
- Low Pass Filter for Anti-aliasing in Temporal Action Localization [15.139834271977913]
This paper aims to verify the existence of aliasing in temporal action localization methods.
It investigates utilizing low pass filters to solve this problem by inhibiting the high-frequency band.
Experiments demonstrate that anti-aliasing with low pass filters in TAL is advantageous and efficient.
arXiv Detail & Related papers (2021-04-23T03:57:34Z)
- Anytime Sampling for Autoregressive Models via Ordered Autoencoding [88.01906682843618]
Autoregressive models are widely used for tasks such as image and audio generation.
The sampling process of these models does not allow interruptions and cannot adapt to real-time computational resources.
We propose a new family of autoregressive models that enables anytime sampling.
arXiv Detail & Related papers (2021-02-23T05:13:16Z)
- Towards Differentiable Resampling [22.92540370475242]
We present a novel network architecture, the particle transformer, and train it for particle resampling using a likelihood-based loss function over sets of particles.
Our results show that our learned resampler outperforms traditional resampling techniques on synthetic data and in a simulated robot localization task.
arXiv Detail & Related papers (2020-04-24T18:37:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.