Related papers: Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing

URL: http://arxiv.org/abs/2306.01332v1
Date: Fri, 2 Jun 2023 07:53:41 GMT
Title: Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing
Authors: Alistair Carson, Cassia Valentini-Botinhao, Simon King, Stefan Bilbao
Abstract summary: This work presents a differentiable digital signal processing approach to modelling phaser effects. The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain. We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine learning approaches to modelling analog audio effects have seen intensive investigation in recent years, particularly in the context of non-linear time-invariant effects such as guitar amplifiers. For modulation effects such as phasers, however, new challenges emerge due to the presence of the low-frequency oscillator which controls the slowly time-varying nature of the effect. Existing approaches have either required foreknowledge of this control signal, or have been non-causal in implementation. This work presents a differentiable digital signal processing approach to modelling phaser effects in which the underlying control signal and time-varying spectral response of the effect are jointly learned. The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain, with a transfer function based on typical analog phaser circuit topology. We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters. The frame duration is an important hyper-parameter of the proposed model, so an investigation was carried out into its effect on model accuracy. The optimal frame length depends on both the rate and transient decay-time of the target effect, but the frame length can be altered at inference time without a significant change in accuracy.
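The frame-based idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' model: it assumes a fixed sinusoidal LFO instead of a learned one, a cascade of identical first-order allpass sections mixed with the dry signal, and no feedback path. The function names (`allpass_cascade`, `frame_phaser`) and all parameter names and default values are hypothetical and chosen only for illustration.

```python
import numpy as np


def allpass_cascade(omega, break_hz, sr, n_stages=4):
    """Frequency response of n_stages identical first-order allpass sections.

    omega is in radians per sample; break_hz is the frequency at which
    each section contributes a -90 degree phase shift (an assumed, common
    parameterisation, not taken from the paper).
    """
    t = np.tan(np.pi * break_hz / sr)
    c = (t - 1.0) / (t + 1.0)
    z_inv = np.exp(-1j * omega)
    h = (c + z_inv) / (1.0 + c * z_inv)  # unit magnitude at every frequency
    return h ** n_stages


def frame_phaser(x, sr, frame_len=1024, lfo_hz=0.5,
                 f_min=200.0, f_max=2000.0, mix=0.5):
    """Apply a time-varying phaser by filtering overlapping frames in the
    frequency domain, with one filter response per frame."""
    hop = frame_len // 2
    # Periodic Hann window: overlapping copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    # Zero-pad each frame to reduce (not eliminate) circular aliasing.
    n_fft = 2 * frame_len
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft)

    pad = frame_len
    xp = np.concatenate([np.zeros(pad), x, np.zeros(pad)])
    y = np.zeros(len(xp) + n_fft)
    n_frames = (len(xp) - frame_len) // hop + 1

    for m in range(n_frames):
        start = m * hop
        frame = xp[start:start + frame_len] * window
        # Sample the assumed sinusoidal LFO once per frame (frame centre).
        t_sec = (start - pad + frame_len / 2) / sr
        lfo = 0.5 * (1.0 + np.sin(2.0 * np.pi * lfo_hz * t_sec))
        break_hz = f_min * (f_max / f_min) ** lfo
        # Dry/wet mix of the allpass chain creates the moving notches.
        h = (1.0 - mix) + mix * allpass_cascade(omega, break_hz, sr)
        out = np.fft.irfft(np.fft.rfft(frame, n_fft) * h, n_fft)
        y[start:start + n_fft] += out  # overlap-add

    return y[pad:pad + len(x)]


if __name__ == "__main__":
    sr = 44100
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(sr)
    wet = frame_phaser(noise, sr)
    print(wet.shape)
```

In this sketch the filter is held constant within each frame, so `frame_len` trades off how closely the output tracks the LFO against how well each frame approximates the filter's decaying response. That is the same trade-off the abstract describes between the rate and transient decay time of the target effect.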

Related papers

MFRS: A Multi-Frequency Reference Series Approach to Scalable and Accurate Time-Series Forecasting [51.94256702463408]
Time series predictability is derived from periodic characteristics at different frequencies. We propose a novel time series forecasting method based on multi-frequency reference series correlation analysis. Experiments on major open and synthetic datasets show state-of-the-art performance.
arXiv Detail & Related papers (2025-03-11T11:40:14Z)
Resampling Filter Design for Multirate Neural Audio Effect Processing [9.149661171430257]
We explore the use of signal resampling at the input and output of the neural network as an alternative solution. We show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than the previously proposed model adjustment method.
arXiv Detail & Related papers (2025-01-30T16:44:49Z)
Comparative Study of State-based Neural Networks for Virtual Analog Audio Effects Modeling [0.0]
This article explores the application of machine learning advancements for virtual analog modeling. We compare State-Space models and Linear Recurrent Units against the more common Long Short-Term Memory networks.
arXiv Detail & Related papers (2024-05-07T08:47:40Z)
TSLANet: Rethinking Transformers for Time Series Representation Learning [19.795353886621715]
Time series data is characterized by its intrinsic long and short-range dependencies. We introduce a novel Time Series Lightweight Network (TSLANet) as a universal convolutional model for diverse time series tasks. Our experiments demonstrate that TSLANet outperforms state-of-the-art models in various tasks spanning classification, forecasting, and anomaly detection.
arXiv Detail & Related papers (2024-04-12T13:41:29Z)
Modulation Extraction for LFO-driven Audio Effects [5.740770499256802]
We propose a framework capable of extracting arbitrary LFO signals from processed audio across multiple digital audio effects, parameter settings, and instrument configurations. We show how coupling the extraction model with a simple processing network enables training of end-to-end black-box models of unseen analog or digital LFO-driven audio effects. We make our code available and provide the trained audio effect models in a real-time VST plugin.
arXiv Detail & Related papers (2023-05-22T17:33:07Z)
Digital noise spectroscopy with a quantum sensor [57.53000001488777]
We introduce and experimentally demonstrate a quantum sensing protocol to sample and reconstruct the auto-correlation of a noise process. The Walsh noise spectroscopy method exploits simple sequences of spin-flip pulses to generate a complete basis of digital filters. We experimentally reconstruct the auto-correlation function of the effective magnetic field produced by the nuclear-spin bath on the electronic spin of a single nitrogen-vacancy center in diamond.
arXiv Detail & Related papers (2022-12-19T02:19:35Z)
Modelling black-box audio effects with time-varying feature modulation [13.378050193507907]
We show that scaling the width, depth, or dilation factor of existing architectures does not result in satisfactory performance when modelling audio effects such as fuzz and dynamic range compression. We propose the integration of time-varying feature-wise linear modulation into existing temporal convolutional backbones. We demonstrate that our approach more accurately captures long-range dependencies for a range of fuzz and compressor implementations across both time and frequency domain metrics.
arXiv Detail & Related papers (2022-11-01T14:41:57Z)
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping [51.698273019061645]
SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram. It is processed in the time-frequency domain to keep the computational cost almost the same as the conventional DDPM-based neural vocoders.
arXiv Detail & Related papers (2022-03-31T02:08:27Z)
Fast and differentiable simulation of driven quantum systems [58.720142291102135]
We introduce a semi-analytic method based on the Dyson expansion that allows us to time-evolve driven quantum systems much faster than standard numerical methods. We show results of the optimization of a two-qubit gate using transmon qubits in the circuit QED architecture.
arXiv Detail & Related papers (2020-12-16T21:43:38Z)
Direct phase modulation via optical injection: theoretical study [50.591267188664666]
We study the influence of spontaneous emission noise, examine the role of gain non-linearity, and consider the effect of temperature drift. We formulate practical instructions that help take these features into account when developing and using an optical-injection-based phase modulator.
arXiv Detail & Related papers (2020-11-18T13:20:04Z)
Real Time Speech Enhancement in the Waveform Domain [99.02180506016721]
We present a causal speech enhancement model working on the raw waveform that runs in real-time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip-connections. It is capable of removing various kinds of background noise including stationary and non-stationary noises.
arXiv Detail & Related papers (2020-06-23T09:19:13Z)
Exploring Quality and Generalizability in Parameterized Neural Audio Effects [0.0]
Deep neural networks have shown promise for music audio signal processing applications. Results to date have tended to be constrained by low sample rates, noise, narrow domains of signal types, and/or lack of parameterized controls. This work expands on prior research published on modeling nonlinear time-dependent signal processing effects.
arXiv Detail & Related papers (2020-06-10T00:52:08Z)
VaPar Synth -- A Variational Parametric Model for Audio Synthesis [78.3405844354125]
We present VaPar Synth - a Variational Parametric Synthesizer which utilizes a conditional variational autoencoder (CVAE) trained on a suitable parametric representation. We demonstrate our proposed model's capabilities via the reconstruction and generation of instrumental tones with flexible control over their pitch.
arXiv Detail & Related papers (2020-03-30T16:05:47Z)
