Rigid-Body Sound Synthesis with Differentiable Modal Resonators
- URL: http://arxiv.org/abs/2210.15306v2
- Date: Fri, 28 Oct 2022 11:47:41 GMT
- Title: Rigid-Body Sound Synthesis with Differentiable Modal Resonators
- Authors: Rodrigo Diaz, Ben Hayes, Charalampos Saitis, György Fazekas, Mark Sandler
- Abstract summary: We present a novel end-to-end framework for training a deep neural network to generate modal resonators for a given 2D shape and material.
We demonstrate our method on a dataset of synthetic objects, but train our model using an audio-domain objective.
- Score: 6.680437329908454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Physical models of rigid bodies are used for sound synthesis in applications
from virtual environments to music production. Traditional methods such as
modal synthesis often rely on computationally expensive numerical solvers,
while recent deep learning approaches are limited by post-processing of their
results. In this work we present a novel end-to-end framework for training a
deep neural network to generate modal resonators for a given 2D shape and
material, using a bank of differentiable IIR filters. We demonstrate our method
on a dataset of synthetic objects, but train our model using an audio-domain
objective, paving the way for physically-informed synthesisers to be learned
directly from recordings of real-world objects.
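As a rough illustration of the pipeline the abstract describes, the sketch below builds a bank of differentiable modal resonators in PyTorch. It uses the closed-form damped-sinusoid impulse response of a two-pole IIR resonator rather than recursive filtering (the two coincide for impulse excitation); the module name, parameter ranges, and activation choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a differentiable modal resonator bank (assumed
# parameterization, not the paper's exact implementation).
import math

import torch


class ModalResonatorBank(torch.nn.Module):
    """Bank of damped sinusoids: the impulse response of two-pole IIR resonators."""

    def __init__(self, n_modes: int = 32, sample_rate: int = 16000):
        super().__init__()
        self.sample_rate = sample_rate
        # Unconstrained parameters, mapped to valid physical ranges in forward().
        self.raw_freq = torch.nn.Parameter(torch.randn(n_modes))
        self.raw_decay = torch.nn.Parameter(torch.randn(n_modes))
        self.raw_gain = torch.nn.Parameter(torch.randn(n_modes))

    def forward(self, n_samples: int) -> torch.Tensor:
        t = torch.arange(n_samples) / self.sample_rate        # time axis (seconds)
        freq = 20.0 + torch.sigmoid(self.raw_freq) * 8000.0   # mode frequency (Hz)
        decay = torch.nn.functional.softplus(self.raw_decay)  # decay rate (1/s), positive
        gain = torch.sigmoid(self.raw_gain)                   # mode amplitude in (0, 1)
        # Each mode is an exponentially decaying sinusoid; their sum is the
        # impulse response of the full resonator bank.
        modes = (gain[:, None] * torch.exp(-decay[:, None] * t)
                 * torch.sin(2 * math.pi * freq[:, None] * t))
        return modes.sum(dim=0)


bank = ModalResonatorBank()
audio = bank(16000)      # one second of synthesized impulse response
audio.sum().backward()   # gradients flow back to the modal parameters
```

In the paper's setting, a neural network would predict such modal parameters from a 2D shape and material descriptor, and an audio-domain loss (for example a multi-scale spectral loss) would backpropagate through the resonator bank.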
Related papers
- Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation [17.03776191787701]
We introduce a novel model for simulating motion properties of nonlinear strings.
We integrate modal synthesis and spectral modeling within a physical modeling network framework; a classical stiff-string modal synthesis sketch appears after this list.
Empirical evaluations demonstrate that the architecture achieves superior accuracy in string motion simulation.
arXiv Detail & Related papers (2024-07-07T23:36:51Z)
- AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis [62.33446681243413]
Novel view acoustic synthesis aims to render audio at any target viewpoint, given mono audio emitted by a sound source in a 3D scene.
Existing methods have proposed NeRF-based implicit models to exploit visual cues as a condition for synthesizing audio.
We propose a novel Audio-Visual Gaussian Splatting (AV-GS) model to characterize the entire scene environment.
Experiments validate the superiority of our AV-GS over existing alternatives on the real-world RWAS and simulation-based SoundSpaces datasets.
arXiv Detail & Related papers (2024-06-13T08:34:12Z)
- Contrastive Learning from Synthetic Audio Doppelgangers [1.3754952818114714]
We propose a solution to both the data scale and transformation limitations, leveraging synthetic audio.
By randomly perturbing the parameters of a sound synthesizer, we generate audio doppelgängers: synthetic positive pairs with causally manipulated variations in timbre, pitch, and temporal envelopes; a toy sketch of this pairing appears after this list.
Despite the shift to randomly generated synthetic data, our method produces strong representations, competitive with real data on standard audio classification benchmarks.
arXiv Detail & Related papers (2024-06-09T21:44:06Z)
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [65.79402756995084]
Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities.
RAF is the first dataset to provide densely captured room acoustic data.
arXiv Detail & Related papers (2024-03-27T17:59:56Z)
- Leaping Into Memories: Space-Time Deep Feature Synthesis [93.10032043225362]
We propose LEAPS, an architecture-independent method for synthesizing videos from internal models.
We quantitatively and qualitatively evaluate the applicability of LEAPS by inverting a range of convolutional and attention-based architectures on Kinetics-400.
arXiv Detail & Related papers (2023-03-17T12:55:22Z)
- Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation [69.1351513309953]
We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation.
We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods.
arXiv Detail & Related papers (2022-12-10T20:15:23Z)
- Audio representations for deep learning in sound synthesis: A review [0.0]
This paper provides an overview of audio representations applied to sound synthesis using deep learning.
It also presents the most significant methods for developing and evaluating a sound synthesis architecture using deep learning models.
arXiv Detail & Related papers (2022-01-07T15:08:47Z)
- Learning to Segment Human Body Parts with Synthetically Trained Deep Convolutional Networks [58.0240970093372]
This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data.
The proposed approach achieves cutting-edge results without the need to train the models on real annotated data of human body parts.
arXiv Detail & Related papers (2021-02-02T12:26:50Z)
- MTCRNN: A multi-scale RNN for directed audio texture synthesis [0.0]
We introduce a novel modelling approach for textures, combining recurrent neural networks trained at different levels of abstraction with a conditioning strategy that allows for user-directed synthesis.
We demonstrate the model on a variety of datasets, evaluate it on several metrics, and discuss potential applications.
arXiv Detail & Related papers (2020-11-25T09:13:53Z)
- Deep generative models for musical audio synthesis [0.0]
Sound modelling is the process of developing algorithms that generate sound under parametric control.
Recent generative deep learning systems for audio synthesis are able to learn models that can traverse arbitrary spaces of sound.
This paper is a review of developments in deep learning that are changing the practice of sound modelling.
arXiv Detail & Related papers (2020-06-10T04:02:42Z)
- VaPar Synth -- A Variational Parametric Model for Audio Synthesis [78.3405844354125]
We present VaPar Synth - a Variational Parametric Synthesizer which utilizes a conditional variational autoencoder (CVAE) trained on a suitable parametric representation.
We demonstrate our proposed model's capabilities via the reconstruction and generation of instrumental tones with flexible control over their pitch.
arXiv Detail & Related papers (2020-03-30T16:05:47Z)
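For the planar-string paper above, classical modal synthesis of a stiff string gives a feel for the physics being modeled. This is a minimal non-differentiable sketch with illustrative values for the fundamental, inharmonicity, and decay rates; it is not that paper's learned model.

```python
# Classical modal synthesis of a stiff string (illustrative values only).
import numpy as np

SR = 16000          # sample rate (Hz)
F0 = 110.0          # fundamental frequency (Hz)
B = 1e-4            # inharmonicity coefficient (string stiffness)
N_MODES = 20

t = np.arange(SR) / SR  # one second of time samples
signal = np.zeros_like(t)
for n in range(1, N_MODES + 1):
    fn = n * F0 * np.sqrt(1.0 + B * n**2)  # stiff-string mode frequency
    decay = 1.0 + 0.5 * n                  # faster decay for higher modes (1/s)
    amp = 1.0 / n                          # rough plucked-string amplitude roll-off
    signal += amp * np.exp(-decay * t) * np.sin(2 * np.pi * fn * t)
signal /= np.abs(signal).max()             # normalize to [-1, 1]
```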
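The "audio doppelgängers" idea can be sketched as rendering two clips from nearby points in a synthesizer's parameter space and treating them as a positive pair for a contrastive loss. The trivial FM synthesizer and the perturbation scale below are illustrative assumptions, not that paper's setup.

```python
# Toy positive-pair generation by perturbing synthesizer parameters.
import numpy as np

SR = 16000


def fm_synth(params, dur=1.0):
    """Trivial FM synthesizer: params = (carrier_hz, mod_hz, mod_index)."""
    carrier, mod, index = params
    t = np.arange(int(SR * dur)) / SR
    return np.sin(2 * np.pi * carrier * t + index * np.sin(2 * np.pi * mod * t))


rng = np.random.default_rng(0)
base = np.array([220.0, 55.0, 3.0])             # a synthesizer patch
jitter = base * 0.05 * rng.standard_normal(3)   # small causal perturbation
pair = fm_synth(base), fm_synth(base + jitter)  # positive pair for contrastive training
```

Such pairs would then feed a standard contrastive objective (e.g., InfoNCE) in place of augmented real recordings.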
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.