Siamese SIREN: Audio Compression with Implicit Neural Representations
- URL: http://arxiv.org/abs/2306.12957v1
- Date: Thu, 22 Jun 2023 15:16:06 GMT
- Title: Siamese SIREN: Audio Compression with Implicit Neural Representations
- Authors: Luca A. Lanzendörfer, Roger Wattenhofer
- Abstract summary: Implicit Neural Representations (INRs) have emerged as a promising method for representing diverse data modalities.
We present a preliminary investigation into the use of INRs for audio compression.
Our study introduces Siamese SIREN, a novel approach based on the popular SIREN architecture.
- Score: 10.482805367361818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit Neural Representations (INRs) have emerged as a promising method for
representing diverse data modalities, including 3D shapes, images, and audio.
While recent research has demonstrated successful applications of INRs in image
and 3D shape compression, their potential for audio compression remains largely
unexplored. Motivated by this, we present a preliminary investigation into the
use of INRs for audio compression. Our study introduces Siamese SIREN, a novel
approach based on the popular SIREN architecture. Our experimental results
indicate that Siamese SIREN achieves superior audio reconstruction fidelity
while utilizing fewer network parameters compared to previous INR
architectures.
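Siamese SIREN builds on the SIREN architecture, whose core idea is a fully connected network with sinusoidal activations, sin(ω₀·(Wx + b)), and a frequency-aware weight initialization. The paper's Siamese variant is not reproduced here; the sketch below shows only a plain SIREN mapping time coordinates to amplitudes, with layer sizes and ω₀ = 30 as illustrative assumptions.

```python
import numpy as np

def siren_layer(x, W, b, omega0=30.0):
    """One SIREN layer: sine activation with frequency scaling omega0."""
    return np.sin(omega0 * (x @ W + b))

def siren_init(fan_in, fan_out, omega0=30.0, first=False, rng=None):
    """SIREN initialization: U(-1/fan_in, 1/fan_in) for the first layer,
    U(-sqrt(6/fan_in)/omega0, +sqrt(6/fan_in)/omega0) for hidden layers."""
    rng = rng or np.random.default_rng(0)
    bound = 1.0 / fan_in if first else np.sqrt(6.0 / fan_in) / omega0
    return rng.uniform(-bound, bound, (fan_in, fan_out)), np.zeros(fan_out)

# Represent audio as a function f(t) -> amplitude, t normalized to [-1, 1].
t = np.linspace(-1.0, 1.0, 16)[:, None]   # 16 time coordinates
W1, b1 = siren_init(1, 64, first=True)
W2, b2 = siren_init(64, 64)
W3, b3 = siren_init(64, 1)

h = siren_layer(t, W1, b1)
h = siren_layer(h, W2, b2)
amplitude = h @ W3 + b3                   # linear output layer
```

Compressing audio this way amounts to fitting such a network to one signal and storing its weights; reconstruction fidelity then trades off against the parameter count.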
Related papers
- Predicting the Encoding Error of SIRENs [4.673285689826945]
Implicit Neural Representations (INRs) encode signals such as images, videos, and 3D shapes in the weights of neural networks.
We present a method which predicts the encoding error that a popular INR network (SIREN) will reach.
arXiv Detail & Related papers (2024-10-29T01:19:22Z)
- Streaming Neural Images [56.41827271721955]
Implicit Neural Representations (INRs) are a novel paradigm for signal representation that have attracted considerable interest for image compression.
In this work, we explore the critical yet overlooked limiting factors of INRs, such as computational cost, unstable performance, and robustness.
arXiv Detail & Related papers (2024-09-25T17:51:20Z)
- AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis [62.33446681243413]
Novel view acoustic synthesis aims to render audio at any target viewpoint, given mono audio emitted by a sound source in a 3D scene.
Existing methods have proposed NeRF-based implicit models to exploit visual cues as a condition for synthesizing audio.
We propose a novel Audio-Visual Gaussian Splatting (AV-GS) model to characterize the entire scene environment.
Experiments validate the superiority of our AV-GS over existing alternatives on the real-world RWAS and simulation-based SoundSpaces datasets.
arXiv Detail & Related papers (2024-06-13T08:34:12Z)
- Hypernetworks build Implicit Neural Representations of Sounds [18.28957270390735]
Implicit Neural Representations (INRs) are nowadays used to represent multimedia signals across various real-life applications, including image super-resolution, image compression, or 3D rendering.
Existing methods that leverage INRs are predominantly focused on visual data, as their application to other modalities, such as audio, is nontrivial due to the inductive biases present in architectural attributes of image-based INR models.
We introduce HyperSound, the first meta-learning approach to produce INRs for audio samples that leverages hypernetworks to generalize beyond samples observed in training.
Our approach reconstructs audio samples with quality comparable to other state-of-the-art models.
arXiv Detail & Related papers (2023-02-09T22:24:26Z)
- Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data, parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
- HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks [23.390919506056502]
Implicit neural representations (INRs) are a rapidly growing research field, which provides alternative ways to represent multimedia signals.
We propose HyperSound, a meta-learning method leveraging hypernetworks to produce INRs for audio signals unseen at training time.
We show that our approach can reconstruct sound waves with quality comparable to other state-of-the-art models.
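The hypernetwork idea behind HyperSound is that one network emits the weights of another: an audio embedding is mapped to a flat parameter vector, which is unpacked into the layers of a small INR. The sketch below illustrates only that mechanism; all shapes, the embedding, and the tiny two-layer INR are illustrative assumptions, not HyperSound's actual configuration.

```python
import numpy as np

def hypernetwork(z, Wh, bh):
    """Map an audio embedding z to a flat parameter vector for the INR."""
    return np.tanh(z @ Wh + bh)

def unpack(theta, in_dim=1, hidden=8):
    """Split the flat vector into (W1, b1, W2, b2) of a tiny 2-layer INR."""
    i = 0
    W1 = theta[i:i + in_dim * hidden].reshape(in_dim, hidden); i += in_dim * hidden
    b1 = theta[i:i + hidden]; i += hidden
    W2 = theta[i:i + hidden].reshape(hidden, 1); i += hidden
    b2 = theta[i:i + 1]
    return W1, b1, W2, b2

def inr(t, theta):
    """Sine-activated INR f(t; theta), in the spirit of audio INRs."""
    W1, b1, W2, b2 = unpack(theta)
    return np.sin(t @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
n_params = 1 * 8 + 8 + 8 * 1 + 1            # 25 parameters for the tiny INR
z = rng.normal(size=(1, 16))                # embedding of one audio clip
Wh, bh = rng.normal(size=(16, n_params)) * 0.1, np.zeros(n_params)

theta = hypernetwork(z, Wh, bh)[0]          # one weight vector per clip
t = np.linspace(-1, 1, 32)[:, None]
wave = inr(t, theta)                        # reconstructed waveform samples
```

Because the hypernetwork is trained across many clips, it can emit INR weights for audio unseen at training time, avoiding a per-signal optimization.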
arXiv Detail & Related papers (2022-11-03T14:20:32Z)
- NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction [64.36535692191343]
Implicit neural representations have shown compelling results in offline 3D reconstruction and also recently demonstrated the potential for online SLAM systems.
This paper addresses two key challenges: 1) seeking a criterion to measure the quality of the candidate viewpoints for the view planning based on the new representations, and 2) learning the criterion from data that can generalize to different scenes instead of hand-crafting one.
Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
arXiv Detail & Related papers (2022-07-22T10:05:36Z)
- Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis [148.16279746287452]
We propose a swin-conv block that combines the local modeling ability of the residual convolutional layer with the non-local modeling ability of the Swin Transformer block.
For the training data synthesis, we design a practical noise degradation model which takes into consideration different kinds of noise.
Experiments on AGWN removal and real image denoising demonstrate that the new network architecture design achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-03-24T18:11:31Z)
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
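The compression pipeline described in the last entry combines weight quantization with entropy coding. A minimal sketch of the two ingredients is given below; uniform min-max quantization and the empirical code entropy are generic stand-ins, not the paper's exact scheme, and the quantization-aware retraining step is omitted.

```python
import numpy as np

def quantize_weights(w, bits=8):
    """Uniform quantization of a weight tensor to 2**bits integer codes."""
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (2**bits - 1)
    q = np.round((w - lo) / scale).astype(np.int32)
    return q, lo, scale

def dequantize(q, lo, scale):
    """Reconstruct approximate weights from codes; error is at most scale/2."""
    return q * scale + lo

def code_entropy_bits(q):
    """Empirical entropy of the code distribution: a lower bound on the
    average bits/weight an entropy coder could achieve on these codes."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, 1000)              # stand-in for trained INR weights
q, lo, scale = quantize_weights(w, bits=8)
w_hat = dequantize(q, lo, scale)
err = float(np.abs(w - w_hat).max())        # bounded by scale / 2
rate = code_entropy_bits(q)                 # < 8 bits/weight for peaked codes
```

Quantization-aware retraining would fine-tune the network under this rounding so that the dequantized weights recover more of the original fidelity.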
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.