Towards Lightweight Controllable Audio Synthesis with Conditional
Implicit Neural Representations
- URL: http://arxiv.org/abs/2111.08462v1
- Date: Sun, 14 Nov 2021 13:36:18 GMT
- Authors: Jan Zuiderveld, Marco Federici, Erik J. Bekkers
- Abstract summary: Implicit neural representations (INRs) are neural networks used to approximate low-dimensional functions.
In this work we shed light on the potential of Conditional Implicit Neural Representations (CINRs) as lightweight backbones in generative frameworks for audio synthesis.
- Score: 10.484851004093919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The high temporal resolution of audio and our perceptual sensitivity to small
irregularities in waveforms make synthesizing at high sampling rates a complex
and computationally intensive task, prohibiting real-time, controllable
synthesis within many approaches. In this work we aim to shed light on the
potential of Conditional Implicit Neural Representations (CINRs) as lightweight
backbones in generative frameworks for audio synthesis.
Implicit neural representations (INRs) are neural networks used to
approximate low-dimensional functions, trained to represent a single geometric
object by mapping input coordinates to structural information at input
locations. In contrast with other neural methods for representing geometric
objects, the memory required to parameterize the object is independent of
resolution, and only scales with its complexity. A corollary of this is that
INRs have infinite resolution, as they can be sampled at arbitrary resolutions.
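The coordinate-to-value idea can be sketched in a few lines. Below is a minimal, illustrative SIREN-style INR in NumPy (random, untrained weights; the function names, layer sizes, and the `w0` initialization scheme are assumptions for illustration, not the paper's exact architecture). The key property shown is that one fixed set of parameters can be sampled at any resolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_siren(layer_sizes, w0=30.0):
    """Initialise a sinusoidal-activation MLP (SIREN-style).

    w0 is the activation-scaling hyperparameter; the abstract notes
    PCINR quality is very sensitive to this value.
    """
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        # First layer uses a wider bound; later layers are scaled by 1/w0.
        bound = 1.0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in) / w0
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def inr_forward(params, t, w0=30.0):
    """Map 1-D time coordinates t to waveform amplitudes."""
    x = np.asarray(t, dtype=float).reshape(-1, 1)
    for W, b in params[:-1]:
        x = np.sin(w0 * (x @ W + b))
    W, b = params[-1]
    return (x @ W + b).ravel()

# The same parameters represent the signal at arbitrary sampling rates:
params = init_siren([1, 64, 64, 1])
coarse = inr_forward(params, np.linspace(0.0, 1.0, 100))     # 100 samples
fine = inr_forward(params, np.linspace(0.0, 1.0, 16000))     # 16 kHz grid
```

Note that the parameter count is fixed by the layer sizes alone, while `coarse` and `fine` are just two discretizations of the same continuous function.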
To apply the concept of INRs in the generative domain we frame generative
modelling as learning a distribution of continuous functions. This can be
achieved by introducing conditioning methods to INRs.
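One simple conditioning scheme (an illustrative assumption here, not necessarily the paper's exact mechanism) is to let a per-example latent code shift the hidden pre-activations of a shared INR, FiLM-style, so that each code selects a different continuous function:

```python
import numpy as np

rng = np.random.default_rng(1)

# Shared (here random, untrained) parameters of the conditional INR.
hidden, w0 = 64, 30.0
W_t = rng.uniform(-1.0, 1.0, size=(1, hidden))    # coordinate weights
W_z = rng.normal(size=(2, hidden)) * 0.1          # latent-code projection
W_out = rng.normal(size=(hidden, 1)) / np.sqrt(hidden)

def conditioned_inr(t, z):
    """Evaluate the shared INR at coordinates t for latent code z.

    The code z shifts the hidden pre-activations, so different codes
    yield different continuous waveforms from one set of weights.
    """
    h = np.sin(w0 * (t.reshape(-1, 1) @ W_t) + z @ W_z)
    return (h @ W_out).ravel()

t = np.linspace(0.0, 1.0, 200)
wave_a = conditioned_inr(t, np.array([1.0, 0.0]))
wave_b = conditioned_inr(t, np.array([0.0, 1.0]))
```

A generative model over audio then amounts to learning a distribution over the codes `z` (or an encoder producing them), with the INR acting as a lightweight shared decoder.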
Our experiments show that Periodic Conditional INRs (PCINRs) learn faster and
generally produce quantitatively better audio reconstructions than Transposed
Convolutional Neural Networks with equal parameter counts. However, their
performance is very sensitive to activation scaling hyperparameters. When
learning to represent more uniform sets, PCINRs tend to introduce artificial
high-frequency components in reconstructions. We show that this noise can be
minimized by applying standard weight regularization during training or by
decreasing the compositional depth of PCINRs, and we suggest directions for
future research.
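The weight-regularization remedy mentioned above can be sketched as an L2 penalty folded into the gradient step (a minimal illustration, assuming plain SGD; the function name and hyperparameter values are placeholders). Shrinking the weights damps the large activations that sinusoidal layers turn into high-frequency components:

```python
import numpy as np

def l2_regularized_step(W, grad_W, lr=1e-3, wd=1e-2):
    """One SGD step with an L2 weight penalty (standard weight decay).

    The wd * W term pulls weights toward zero, which in sinusoidal
    INRs suppresses the artificial high-frequency noise the abstract
    reports for more uniform training sets.
    """
    return W - lr * (grad_W + wd * W)

# With zero task gradient, the penalty alone shrinks the weights:
W = np.ones((4, 4))
W_next = l2_regularized_step(W, np.zeros((4, 4)))
```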
Related papers
- Unifying Subsampling Pattern Variations for Compressed Sensing MRI with Neural Operators [72.79532467687427]
Compressed Sensing MRI reconstructs images of the body's internal anatomy from undersampled and compressed measurements.
Deep neural networks have shown great potential for reconstructing high-quality images from highly undersampled measurements.
We propose a unified model that is robust to different subsampling patterns and image resolutions in CS-MRI.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
- Towards a Sampling Theory for Implicit Neural Representations [0.3222802562733786]
Implicit neural representations (INRs) have emerged as a powerful tool for solving inverse problems in computer and computational imaging.
We show how to recover images from a hidden-layer INR using a generalized form of weight decay regularization.
We empirically assess the probability of achieving exact recovery for images realized by low-width single-layer INRs, and illustrate the performance of INRs on super-resolution recovery of more realistic continuous-domain phantom images.
arXiv Detail & Related papers (2024-05-28T17:53:47Z)
- INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings [4.639495398851869]
Implicit Neural Representations (INRs) have revolutionized signal representation by leveraging neural networks to provide continuous and smooth representations of complex data.
We introduce INCODE, a novel approach that enhances the control of the sinusoidal-based activation function in INRs using deep prior knowledge.
Our approach not only excels in representation, but also extends its prowess to tackle complex tasks such as audio, image, and 3D shape reconstructions.
arXiv Detail & Related papers (2023-10-28T23:16:49Z)
- FFEINR: Flow Feature-Enhanced Implicit Neural Representation for Spatio-temporal Super-Resolution [4.577685231084759]
This paper proposes a Feature-Enhanced Neural Implicit Representation (FFEINR) for super-resolution of flow field data.
It can take full advantage of the implicit neural representation in terms of model structure and sampling resolution.
The training process of FFEINR is facilitated by introducing feature enhancements for the input layer.
arXiv Detail & Related papers (2023-08-24T02:28:18Z)
- Degradation-Noise-Aware Deep Unfolding Transformer for Hyperspectral Image Denoising [9.119226249676501]
Hyperspectral images (HSIs) are often quite noisy because of narrow band spectral filtering.
To reduce the noise in HSI data cubes, both model-driven and learning-based denoising algorithms have been proposed.
This paper proposes a Degradation-Noise-Aware Unfolding Network (DNA-Net) that addresses these issues.
arXiv Detail & Related papers (2023-05-06T13:28:20Z)
- Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
- Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which largely increases the capability of approximating functions.
Specifically, we introduce a bottleneck encoder that produces fewer and informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z)
- Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multi-media data via multi-layer perceptrons.
Existing works manipulate such continuous representations via processing on their discretized instance.
We propose an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
arXiv Detail & Related papers (2022-10-17T06:29:07Z)
- UNeRF: Time and Memory Conscious U-Shaped Network for Training Neural Radiance Fields [16.826691448973367]
Neural Radiance Fields (NeRFs) increase reconstruction detail for novel view synthesis and scene reconstruction.
However, the increased resolution and model-free nature of such neural fields come at the cost of high training times and excessive memory requirements.
We propose a method to exploit the redundancy of NeRF's sample-based computations by partially sharing evaluations across neighboring sample points.
arXiv Detail & Related papers (2022-06-23T19:57:07Z)
- InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [55.70938412352287]
We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation.
The proposed approach minimizes potential reconstruction inconsistency that happens due to insufficient viewpoints.
We achieve consistently improved performance compared to existing neural view synthesis methods by large margins on multiple standard benchmarks.
arXiv Detail & Related papers (2021-12-31T11:56:01Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.