Modulation Extraction for LFO-driven Audio Effects
- URL: http://arxiv.org/abs/2305.13262v1
- Date: Mon, 22 May 2023 17:33:07 GMT
- Title: Modulation Extraction for LFO-driven Audio Effects
- Authors: Christopher Mitcheltree, Christian J. Steinmetz, Marco Comunità,
Joshua D. Reiss
- Abstract summary: We propose a framework capable of extracting arbitrary LFO signals from processed audio across multiple digital audio effects, parameter settings, and instrument configurations.
We show how coupling the extraction model with a simple processing network enables training of end-to-end black-box models of unseen analog or digital LFO-driven audio effects.
We make our code available and provide the trained audio effect models in a real-time VST plugin.
- Score: 5.740770499256802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low frequency oscillator (LFO) driven audio effects such as phaser, flanger,
and chorus, modify an input signal using time-varying filters and delays,
resulting in characteristic sweeping or widening effects. It has been shown
that these effects can be modeled using neural networks when conditioned with
the ground truth LFO signal. However, in most cases, the LFO signal is not
accessible and measurement from the audio signal is nontrivial, hindering the
modeling process. To address this, we propose a framework capable of extracting
arbitrary LFO signals from processed audio across multiple digital audio
effects, parameter settings, and instrument configurations. Since our system
imposes no restrictions on the LFO signal shape, we demonstrate its ability to
extract quasiperiodic, combined, and distorted modulation signals that are
relevant to effect modeling. Furthermore, we show how coupling the extraction
model with a simple processing network enables training of end-to-end black-box
models of unseen analog or digital LFO-driven audio effects using only dry and
wet audio pairs, overcoming the need to access the audio effect or internal LFO
signal. We make our code available and provide the trained audio effect models
in a real-time VST plugin.
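As a concrete illustration of the class of effects the paper targets, the sketch below implements a minimal flanger: a short delay line whose length is swept by a sinusoidal LFO, mixed back with the dry signal. This is a generic textbook construction, not the paper's implementation; the function name, parameter values, and linear-interpolation delay line are our own illustrative choices.

```python
import numpy as np

def lfo_flanger(dry, sr, rate_hz=0.5, base_delay_ms=1.0, depth_ms=2.0, mix=0.5):
    """Minimal flanger: a time-varying delay swept by a sinusoidal LFO."""
    n = np.arange(len(dry))
    # Sinusoidal LFO in [0, 1]. The paper's extraction framework recovers
    # signals like this (including quasiperiodic, combined, and distorted
    # shapes) from the processed audio alone.
    lfo = 0.5 * (1.0 + np.sin(2.0 * np.pi * rate_hz * n / sr))
    # Map the LFO to a delay length in samples.
    delay = (base_delay_ms + depth_ms * lfo) * sr / 1000.0
    # Fractional delay via linear interpolation of the dry signal.
    read_pos = np.clip(n - delay, 0.0, len(dry) - 1)
    lo = np.floor(read_pos).astype(int)
    hi = np.minimum(lo + 1, len(dry) - 1)
    frac = read_pos - lo
    delayed = (1.0 - frac) * dry[lo] + frac * dry[hi]
    return (1.0 - mix) * dry + mix * delayed

# A dry/wet pair of the kind the end-to-end black-box models train on.
sr = 44100
dry = np.random.randn(2 * sr)  # stand-in for an instrument recording
wet = lfo_flanger(dry, sr)
```

In the paper's setting only such dry/wet pairs are available; the LFO inside the effect must be recovered from the wet audio itself.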
Related papers
- CONMOD: Controllable Neural Frame-based Modulation Effects [6.132272910797383]
We introduce Controllable Neural Frame-based Modulation Effects (CONMOD), a single black-box model which emulates various LFO-driven effects in a frame-wise manner.
The model is capable of learning the continuous embedding space of two distinct phaser effects, enabling us to steer between effects and achieve creative outputs.
arXiv Detail & Related papers (2024-06-20T02:02:54Z)
- From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion [84.138804145918]
Deep generative models can generate high-fidelity audio conditioned on various types of representations.
However, these models are prone to generating audible artifacts when the conditioning is flawed or imperfect.
We propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality from low-bitrate discrete representations.
arXiv Detail & Related papers (2023-08-02T22:14:29Z)
- Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing [21.053861381437827]
This work presents a differentiable digital signal processing approach to modelling phaser effects.
The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain; a minimal sketch of this frame-based idea appears after this list.
We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters.
arXiv Detail & Related papers (2023-06-02T07:53:41Z)
- Modelling black-box audio effects with time-varying feature modulation [13.378050193507907]
We show that scaling the width, depth, or dilation factor of existing architectures does not result in satisfactory performance when modelling audio effects such as fuzz and dynamic range compression.
We propose the integration of time-varying feature-wise linear modulation into existing temporal convolutional backbones.
We demonstrate that our approach more accurately captures long-range dependencies for a range of fuzz and compressor implementations across both time and frequency domain metrics.
arXiv Detail & Related papers (2022-11-01T14:41:57Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can reduce the number of floating point operations of off-the-shelf audio neural networks by more than half.
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
- SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping [51.698273019061645]
SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram.
It is processed in the time-frequency domain to keep the computational cost almost the same as the conventional DDPM-based neural vocoders.
arXiv Detail & Related papers (2022-03-31T02:08:27Z)
- NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation [67.96138567288197]
We propose a novel neural vocoder named NeuralDPS which retains high speech quality while achieving high synthesis efficiency and noise controllability.
It generates waveforms at least 280 times faster than the WaveNet vocoder.
Its synthesis is also 28% faster than WaveGAN's on a single core.
arXiv Detail & Related papers (2022-03-05T08:15:29Z)
- Point Cloud Audio Processing [18.88427891844357]
We introduce a novel way of processing audio signals by treating them as a collection of points in feature space.
We observe that these methods result in smaller models and allow us to significantly subsample the input representation with minimal effect on trained model performance.
arXiv Detail & Related papers (2021-05-06T07:04:59Z)
- Hierarchical Timbre-Painting and Articulation Generation [92.59388372914265]
We present a fast and high-fidelity method for music generation, based on specified f0 and loudness.
The synthesized audio mimics the timbre and articulation of a target instrument.
arXiv Detail & Related papers (2020-08-30T05:27:39Z)
- Exploring Quality and Generalizability in Parameterized Neural Audio Effects [0.0]
Deep neural networks have shown promise for music audio signal processing applications.
Results to date have tended to be constrained by low sample rates, noise, narrow domains of signal types, and/or lack of parameterized controls.
This work expands on prior research published on modeling nonlinear time-dependent signal processing effects.
arXiv Detail & Related papers (2020-06-10T00:52:08Z)
- VaPar Synth -- A Variational Parametric Model for Audio Synthesis [78.3405844354125]
We present VaPar Synth - a Variational Parametric Synthesizer which utilizes a conditional variational autoencoder (CVAE) trained on a suitable parametric representation.
We demonstrate our proposed model's capabilities via the reconstruction and generation of instrumental tones with flexible control over their pitch.
arXiv Detail & Related papers (2020-03-30T16:05:47Z)
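As referenced in the grey-box phaser entry above, here is a minimal NumPy sketch of frame-based spectral processing: audio is processed in short overlapping frames, each multiplied by a frequency response that changes over time. The swept Gaussian notch, LFO rate, and window/overlap settings are illustrative assumptions, not that paper's actual model.

```python
import numpy as np

def framewise_timevarying_filter(x, sr, frame=1024, hop=512, lfo_rate=0.3):
    """Filter audio frame by frame with a time-varying frequency response:
    here, a single notch swept by a sinusoidal LFO (illustrative only)."""
    window = np.hanning(frame)       # Hann at 50% overlap sums to ~1 (COLA)
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    y = np.zeros(len(x))
    for start in range(0, len(x) - frame + 1, hop):
        t = start / sr
        # Notch centre swept between 100 Hz and 900 Hz by the LFO.
        center = 500.0 + 400.0 * np.sin(2.0 * np.pi * lfo_rate * t)
        notch = 1.0 - np.exp(-((freqs - center) ** 2) / (2.0 * 100.0 ** 2))
        seg = x[start:start + frame] * window
        # Apply the frame's response in the frequency domain and overlap-add.
        y[start:start + frame] += np.fft.irfft(np.fft.rfft(seg) * notch, n=frame)
    return y
```

A real phaser cascades several all-pass stages rather than a single notch, but the frame-wise mechanism is the same.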
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.