Generation of lyrics lines conditioned on music audio clips
- URL: http://arxiv.org/abs/2009.14375v1
- Date: Wed, 30 Sep 2020 01:22:58 GMT
- Title: Generation of lyrics lines conditioned on music audio clips
- Authors: Olga Vechtomova, Gaurav Sahu, Dhruv Kumar
- Abstract summary: A bimodal neural network model learns to generate lines conditioned on any given short audio clip.
The system is intended to serve as a creativity tool for songwriters.
- Score: 13.23722670386104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a system for generating novel lyrics lines conditioned on music
audio. A bimodal neural network model learns to generate lines conditioned on
any given short audio clip. The model consists of a spectrogram variational
autoencoder (VAE) and a text VAE. Both automatic and human evaluations
demonstrate the effectiveness of our model in generating lines that have an
emotional impact matching a given audio clip. The system is intended to serve
as a creativity tool for songwriters.
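The abstract describes a spectrogram VAE paired with a text VAE, with the audio clip's latent code conditioning the generated lyric line. As a rough illustration of that conditioning flow only, here is a minimal stdlib-Python sketch; the function names, latent dimension, and the toy encoder/decoder logic are hypothetical stand-ins, not the paper's implementation:

```python
import math
import random

# Hypothetical sketch: a spectrogram encoder maps an audio clip to a latent
# Gaussian (mu, logvar); a sample from that Gaussian conditions a text decoder.

def reparameterize(mu, logvar, rng=None):
    """VAE reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    rng = rng or random.Random(0)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def encode_spectrogram(spectrogram, latent_dim=4):
    """Toy encoder stand-in: summarizes a 2-D spectrogram into (mu, logvar).

    A real spectrogram VAE would use a convolutional encoder here.
    """
    flat = [v for row in spectrogram for v in row]
    mean = sum(flat) / len(flat)
    mu = [mean] * latent_dim
    logvar = [0.0] * latent_dim  # unit variance, just for the sketch
    return mu, logvar

def decode_to_line(z, vocab):
    """Toy decoder stand-in: deterministically picks words from the latent code.

    A real text VAE would run a sequence decoder conditioned on z.
    """
    idx = int(abs(sum(z)) * 1000)
    return " ".join(vocab[(idx + i) % len(vocab)] for i in range(4))

# Conditioning pipeline: audio clip -> latent code -> lyric line.
clip = [[0.1, 0.2], [0.3, 0.4]]  # stand-in spectrogram
mu, logvar = encode_spectrogram(clip)
z = reparameterize(mu, logvar)
line = decode_to_line(z, ["night", "rain", "neon", "echo", "slow", "drift"])
print(line)
```

The reparameterization step is the standard VAE trick that keeps sampling differentiable; everything around it is placeholder logic standing in for the trained networks.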
Related papers
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z) - Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z) - Noise2Music: Text-conditioned Music Generation with Diffusion Models [73.74580231353684]
We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts.
We find that the generated audio is able to faithfully reflect key elements of the text prompt, such as genre, tempo, instruments, mood, and era.
Pretrained large language models play a key role in this story -- they are used to generate paired text for the audio of the training set and to extract embeddings of the text prompts ingested by the diffusion models.
arXiv Detail & Related papers (2023-02-08T07:27:27Z) - Youling: an AI-Assisted Lyrics Creation System [72.00418962906083]
This paper demonstrates Youling, an AI-assisted lyrics creation system designed to collaborate with music creators.
In the lyrics generation process, Youling supports a traditional one-pass full-text generation mode as well as an interactive generation mode.
The system also provides a revision module which enables users to revise undesired sentences or words of lyrics repeatedly.
arXiv Detail & Related papers (2022-01-18T03:57:04Z) - A Melody-Unsupervision Model for Singing Voice Synthesis [9.137554315375919]
We propose a melody-unsupervision model that requires only audio-and-lyrics pairs, without temporal alignment, at training time.
We show that the proposed model can be trained with speech audio and text labels, yet can generate a singing voice at inference time.
arXiv Detail & Related papers (2021-10-13T07:42:35Z) - LyricJam: A system for generating lyrics for live instrumental music [11.521519161773288]
We describe a real-time system that receives a live audio stream from a jam session and generates lyric lines that are congruent with the live music being played.
Two novel approaches are proposed to align the learned latent spaces of audio and text representations.
arXiv Detail & Related papers (2021-06-03T16:06:46Z) - Automatic Neural Lyrics and Melody Composition [6.574381538711984]
The proposed system, Automatic Neural Lyrics and Melody Composition (AutoNLMC) is an attempt to make the whole process of songwriting automatic using artificial neural networks.
Our lyric-to-vector (lyric2vec) model is a large-scale embedding model trained on a large dataset of lyric-melody pairs parsed at the syllable, word, and sentence levels.
It can also take lyrics from a professional lyric writer and generate matching melodies.
arXiv Detail & Related papers (2020-11-12T13:44:01Z) - Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z) - Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method utilizes both an acoustic model, trained for the task of automatic speech recognition, together with melody extracted features to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z) - Automatic Lyrics Transcription using Dilated Convolutional Neural Networks with Self-Attention [11.232541198648159]
We have trained convolutional time-delay neural networks with self-attention on monophonic karaoke recordings.
Our system achieves notable improvement to the state-of-the-art in automatic lyrics transcription.
arXiv Detail & Related papers (2020-07-13T16:36:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.