ADNAC: Audio Denoiser using Neural Audio Codec
- URL: http://arxiv.org/abs/2511.01773v1
- Date: Mon, 03 Nov 2025 17:28:28 GMT
- Title: ADNAC: Audio Denoiser using Neural Audio Codec
- Authors: Daniel Jimon, Mircea Vaida, Adriana Stan,
- Abstract summary: This paper presents a proof-of-concept for adapting the Descript Audio Codec (DAC) for music denoising.<n>This work overcomes the limitations of traditional neural architectures like U-Nets by training the model on a large-scale, custom-synthesized dataset.<n>Ultimately, this paper aims to present a PoC for high-fidelity, generative audio restoration.
- Score: 2.752817022620644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Audio denoising is critical in signal processing, enhancing intelligibility and fidelity for applications like restoring musical recordings. This paper presents a proof-of-concept for adapting a state-of-the-art neural audio codec, the Descript Audio Codec (DAC), for music denoising. This work overcomes the limitations of traditional architectures like U-Nets by training the model on a large-scale, custom-synthesized dataset built from diverse sources. Training is guided by a multi objective loss function that combines time-domain, spectral, and signal-level fidelity metrics. Ultimately, this paper aims to present a PoC for high-fidelity, generative audio restoration.
Related papers
- Towards Generalized Source Tracing for Codec-Based Deepfake Speech [52.68106957822706]
We introduce the Semantic-Acoustic Source Tracing Network (SASTNet), which jointly leverages Whisper for semantic feature encoding and Wav2vec2 with AudioMAE for acoustic feature encoding.<n>Our proposed SASTNet achieves state-of-the-art performance on the CoSG test set of the CodecFake+ dataset, demonstrating its effectiveness for reliable source tracing.
arXiv Detail & Related papers (2025-06-08T21:36:10Z) - BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models [62.38713281234756]
Binaural rendering pipeline aims to synthesize audio that mimics natural hearing based on a mono audio.<n>Many methods have been proposed to solve this problem, but they struggle with rendering quality and streamable inference.<n>We propose a flow matching based streaming speech framework called BinauralFlow synthesis framework.
arXiv Detail & Related papers (2025-05-28T20:59:15Z) - High-Fidelity Music Vocoder using Neural Audio Codecs [18.95453617434051]
DisCoder is a neural vocoder that reconstructs high-fidelity 44.1 kHz audio from mel spectrograms.<n>DisCoder achieves state-of-the-art performance in music synthesis on several objective metrics and in a MUSHRA listening study.<n>Our approach also shows competitive performance in speech synthesis, highlighting its potential as a universal vocoder.
arXiv Detail & Related papers (2025-02-18T11:25:46Z) - SNAC: Multi-Scale Neural Audio Codec [1.0753191494611891]
Multi-Scale Neural Audio Codec is a simple extension of RVQ where the quantizers can operate at different temporal resolutions.
This paper proposes Multi-Scale Neural Audio Codec, a simple extension of RVQ where the quantizers can operate at different temporal resolutions.
arXiv Detail & Related papers (2024-10-18T12:24:05Z) - Learning Source Disentanglement in Neural Audio Codec [20.335701584949526]
We introduce the Source-Disentangled Neural Audio Codec (SD-Codec), a novel approach that combines audio coding and source separation.<n>By jointly learning audio resynthesis and separation, SD-Codec explicitly assigns audio signals from different domains to distinct codebooks, sets of discrete representations.<n> Experimental results indicate that SD-Codec not only maintains competitive resynthesis quality but also, supported by the separation results, demonstrates successful disentanglement of different sources in the latent space.
arXiv Detail & Related papers (2024-09-17T14:21:02Z) - WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling [63.8735398698683]
A crucial component of language models is the tokenizer, which compresses high-dimensional natural signals into lower-dimensional discrete tokens.<n>We introduce WavTokenizer, which offers several advantages over previous SOTA acoustic models in the audio domain.<n>WavTokenizer achieves state-of-the-art reconstruction quality with outstanding UTMOS scores and inherently contains richer semantic information.
arXiv Detail & Related papers (2024-08-29T13:43:36Z) - Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion
Models [65.18102159618631]
multimodal generative modeling has created milestones in text-to-image and text-to-video generation.
Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio pairs, and the complexity of modeling long continuous audio data.
We propose Make-An-Audio with a prompt-enhanced diffusion model that addresses these gaps.
arXiv Detail & Related papers (2023-01-30T04:44:34Z) - High Fidelity Neural Audio Compression [92.4812002532009]
We introduce a state-of-the-art real-time, high-fidelity, audio leveraging neural networks.
It consists in a streaming encoder-decoder architecture with quantized latent space trained in an end-to-end fashion.
We simplify and speed-up the training by using a single multiscale spectrogram adversary.
arXiv Detail & Related papers (2022-10-24T17:52:02Z) - Hierarchical Timbre-Painting and Articulation Generation [92.59388372914265]
We present a fast and high-fidelity method for music generation, based on specified f0 and loudness.
The synthesized audio mimics the timbre and articulation of a target instrument.
arXiv Detail & Related papers (2020-08-30T05:27:39Z) - Audio Dequantization for High Fidelity Audio Generation in Flow-based
Neural Vocoder [29.63675159839434]
Flow-based neural vocoder has shown significant improvement in real-time speech generation task.
We propose audio dequantization methods in flow-based neural vocoder for high fidelity audio generation.
arXiv Detail & Related papers (2020-08-16T09:37:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.