Progressive distillation diffusion for raw music generation
- URL: http://arxiv.org/abs/2307.10994v1
- Date: Thu, 20 Jul 2023 16:25:00 GMT
- Title: Progressive distillation diffusion for raw music generation
- Authors: Svetlana Pavlova
- Abstract summary: This paper aims to apply a new deep learning approach to the task of generating raw audio files.
It is based on diffusion models, a recent type of deep generative model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to apply a new deep learning approach to the task of generating raw audio files. It is based on diffusion models, a recent type of deep generative model that has shown outstanding results in image generation and has received a great deal of attention from the computer vision community. Far less attention, however, has been given to other applications such as music generation in the waveform domain. In this paper, a model for unconditional music generation is implemented: progressive distillation diffusion with a 1D U-Net. A comparison of different diffusion parameters and their effect on the final result is then presented. One major advantage of the methods implemented in this work is that the model supports progressive audio processing and generation, using a transformation from 1-channel 128 x 384 to 3-channel 128 x 128 mel-spectrograms and looped generation. Empirical comparisons are carried out across several self-collected datasets.
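The abstract does not spell out the 1-channel 128 x 384 to 3-channel 128 x 128 transformation; a plausible reading is that the 384-frame time axis is split into three consecutive 128-frame chunks stacked as channels, giving a square "image" that a 2D diffusion backbone can consume. A minimal NumPy sketch under that assumption (function names are illustrative, not from the paper):

```python
import numpy as np

def mel_to_3ch(mel: np.ndarray) -> np.ndarray:
    """Reshape a 1-channel (128, 384) mel-spectrogram into a
    3-channel (3, 128, 128) array by splitting the time axis into
    three consecutive 128-frame chunks, one per channel.
    NOTE: this chunking is an assumed reading of the abstract."""
    assert mel.shape == (128, 384)
    chunks = np.split(mel, 3, axis=1)        # three (128, 128) pieces
    return np.stack(chunks, axis=0)          # (3, 128, 128)

def mel_from_3ch(img: np.ndarray) -> np.ndarray:
    """Inverse: concatenate the channels back along the time axis."""
    assert img.shape == (3, 128, 128)
    return np.concatenate(list(img), axis=1) # (128, 384)

mel = np.random.rand(128, 384).astype(np.float32)
img = mel_to_3ch(mel)
assert np.allclose(mel_from_3ch(img), mel)   # round trip is lossless
```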
Related papers
- SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models [21.669044026456557]
We propose a method to enable audio-conditioning in large scale image diffusion models.
In addition to audio-conditioned image generation, our method can also be utilized in conjunction with diffusion-based editing methods.
arXiv Detail & Related papers (2024-05-01T21:43:57Z)
- Fast Diffusion GAN Model for Symbolic Music Generation Controlled by Emotions [1.6004393678882072]
We propose a diffusion model combined with a Generative Adversarial Network to generate discrete symbolic music.
We first used a trained Variational Autoencoder to obtain embeddings of a symbolic music dataset with emotion labels.
Our results demonstrate the successful control of our diffusion model to generate symbolic music with a desired emotion.
arXiv Detail & Related papers (2023-10-21T15:35:43Z)
- From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion [84.138804145918]
Deep generative models can generate high-fidelity audio conditioned on various types of representations.
These models are prone to generating audible artifacts when the conditioning is flawed or imperfect.
We propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality from low-bitrate discrete representations.
arXiv Detail & Related papers (2023-08-02T22:14:29Z)
- Consistency Models [89.68380014789861]
We propose a new family of models that generate high quality samples by directly mapping noise to data.
They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality.
They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training.
arXiv Detail & Related papers (2023-03-02T18:30:16Z)
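As a rough illustration of the one-step/multistep trade-off described above, here is a simplified consistency-sampling loop in the spirit of Song et al.'s algorithm; `f` is a stand-in for a trained consistency function, and the re-noising step drops the minimum noise level used in the paper for brevity:

```python
import torch

@torch.no_grad()
def consistency_sample(f, shape, sigmas):
    """Multistep consistency sampling, simplified.
    f(x, sigma) maps a noisy sample at noise level sigma directly to a
    clean sample; a single entry in sigmas recovers one-step generation.
    The paper re-noises to sqrt(sigma**2 - sigma_min**2); sigma_min is
    omitted here."""
    x = sigmas[0] * torch.randn(shape)        # start from pure noise
    x = f(x, sigmas[0])                       # one jump from noise to data
    for sigma in sigmas[1:]:                  # optional refinement steps
        x = x + sigma * torch.randn_like(x)   # re-noise to level sigma
        x = f(x, sigma)                       # jump back to data
    return x

# Stand-in "consistency function" purely so the sketch runs end to end:
f = lambda x, sigma: x / (1.0 + sigma)
sample = consistency_sample(f, (1, 3, 128, 128), sigmas=[80.0, 10.0, 1.0])
```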
- ArchiSound: Audio Generation with Diffusion [0.0]
In this work, we investigate the potential of diffusion models for audio generation.
We propose a new method for text-conditional latent audio diffusion with stacked 1D U-Nets.
For each model, we make an effort to maintain reasonable inference speed, targeting real-time on a single consumer GPU.
arXiv Detail & Related papers (2023-01-30T20:23:26Z)
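Both ArchiSound and the main paper above build on 1D U-Nets. A toy PyTorch sketch of the basic shape, with one skip connection; channel widths and depth are illustrative and not taken from either paper:

```python
import torch
import torch.nn as nn

class TinyUNet1D(nn.Module):
    """Minimal 1D U-Net: one downsampling stage, a bottleneck, and one
    upsampling stage with a skip connection. Real models stack many
    such stages plus timestep conditioning."""
    def __init__(self, ch=1, base=32):
        super().__init__()
        self.d1 = nn.Sequential(nn.Conv1d(ch, base, 3, padding=1), nn.SiLU())
        self.d2 = nn.Sequential(nn.Conv1d(base, base * 2, 3, stride=2, padding=1), nn.SiLU())
        self.mid = nn.Sequential(nn.Conv1d(base * 2, base * 2, 3, padding=1), nn.SiLU())
        self.u2 = nn.Sequential(nn.ConvTranspose1d(base * 2, base, 4, stride=2, padding=1), nn.SiLU())
        self.out = nn.Conv1d(base * 2, ch, 3, padding=1)

    def forward(self, x):
        h1 = self.d1(x)               # (B, base, T)
        h2 = self.mid(self.d2(h1))    # (B, 2*base, T/2)
        u = self.u2(h2)               # (B, base, T)
        return self.out(torch.cat([u, h1], dim=1))  # skip connection

x = torch.randn(2, 1, 16384)          # batch of raw-audio segments
assert TinyUNet1D()(x).shape == x.shape
```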
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
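This distillation recipe builds on progressive distillation, the technique the main paper above applies to music. A hedged sketch of the two core ingredients: folding classifier-free guidance into one network output, and the two-teacher-steps-to-one-student-step training target (helper names such as `teacher_step` are hypothetical):

```python
import torch

def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance combination; the first distillation
    stage folds this two-pass computation into a single student net."""
    return eps_uncond + w * (eps_cond - eps_uncond)

def distillation_target(teacher_step, x_t, t, dt):
    """Progressive distillation (Salimans & Ho, 2022): the student learns
    to match, in ONE step of size 2*dt, where the deterministic teacher
    sampler lands after TWO steps of size dt. teacher_step(x, t, dt) is a
    hypothetical one-step DDIM-like update returning x at time t - dt."""
    with torch.no_grad():
        x_mid = teacher_step(x_t, t, dt)
        x_tgt = teacher_step(x_mid, t - dt, dt)
    return x_tgt  # student loss: ||student_step(x_t, t, 2*dt) - x_tgt||**2
```

Repeating this halving yields samplers with 1 to 4 steps, which is what makes the reported step counts above attainable.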
- Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise [52.59444045853966]
We show that an entire family of generative models can be constructed by varying the choice of image degradation.
The success of fully deterministic models calls into question the community's understanding of diffusion models.
arXiv Detail & Related papers (2022-08-19T15:18:39Z)
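Cold diffusion replaces Gaussian noising with an arbitrary degradation operator. A compact sketch of its improved sampling rule (Algorithm 2 in the paper), with `D` and `R` as stand-ins for the degradation and the learned restoration network:

```python
def cold_diffusion_sample(R, D, x_T, T):
    """Improved cold-diffusion sampling (Bansal et al., 2022, Alg. 2).
    D(x0, s) applies the chosen degradation (blur, masking, ...) at
    severity s, with D(x0, 0) = x0; R(x, t) is the learned restoration
    network. The update is exact when D is linear in x0."""
    x = x_T
    for t in range(T, 0, -1):
        x0_hat = R(x, t)                         # predicted clean sample
        x = x - D(x0_hat, t) + D(x0_hat, t - 1)  # swap degradation levels
    return x
```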
- Progressive Deblurring of Diffusion Models for Coarse-to-Fine Image Synthesis [39.671396431940224]
Diffusion models have shown remarkable results in image synthesis by gradually removing noise and amplifying signals.
We propose a novel generative process that synthesizes images in a coarse-to-fine manner.
Experiments show that the proposed model outperforms the previous method in FID on LSUN bedroom and church datasets.
arXiv Detail & Related papers (2022-07-16T15:00:21Z) - Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models.
A major drawback of this method is that it requires hundreds of iterations to produce a competitive result.
Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z)
- WaveGrad: Estimating Gradients for Waveform Generation [55.405580817560754]
WaveGrad is a conditional model for waveform generation which estimates gradients of the data density.
It starts from a Gaussian white noise signal and iteratively refines the signal via a gradient-based sampler conditioned on the mel-spectrogram.
We find that it can generate high fidelity audio samples using as few as six iterations.
arXiv Detail & Related papers (2020-09-02T17:44:10Z)
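WaveGrad's sampler is a short ancestral-sampling loop over a handful of noise levels. A simplified sketch, assuming a hop of 256 samples per mel frame and sigma_n = sqrt(beta_n) for the per-step noise (both are assumptions here, not values from the paper):

```python
import torch

@torch.no_grad()
def wavegrad_sample(eps_model, mel, betas, hop=256):
    """Simplified WaveGrad-style ancestral sampling (Chen et al., 2020).
    eps_model(y, mel, noise_level) predicts the noise in waveform y given
    a mel-spectrogram of shape (B, n_mels, frames); betas is the short
    inference noise schedule, e.g. 6 values for 6-iteration inference."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    y = torch.randn(mel.shape[0], 1, mel.shape[-1] * hop)  # start from noise
    for n in reversed(range(len(betas))):
        eps = eps_model(y, mel, alpha_bars[n].sqrt())      # predict noise
        y = (y - betas[n] / (1 - alpha_bars[n]).sqrt() * eps) / alphas[n].sqrt()
        if n > 0:
            y = y + betas[n].sqrt() * torch.randn_like(y)  # inject step noise
    return y
```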