Controllable Lyrics-to-Melody Generation
- URL: http://arxiv.org/abs/2306.02613v1
- Date: Mon, 5 Jun 2023 06:14:08 GMT
- Title: Controllable Lyrics-to-Melody Generation
- Authors: Zhe Zhang, Yi Yu, Atsuhiro Takasu
- Abstract summary: We propose a controllable lyrics-to-melody generation network, ConL2M, which is able to generate realistic melodies from lyrics in user-desired musical style.
Our work contains three main novelties: 1) To model the dependencies of music attributes across multiple sequences, inter-branch memory fusion (Memofu) is proposed to enable information flow within the multi-branch stacked LSTM architecture; 2) Reference style embedding (RSE) is proposed to improve the quality of generation as well as control the musical style of generated melodies; 3) Sequence-level statistical loss (SeqLoss) is proposed to help the model learn sequence-level features of melodies given lyrics.
- Score: 14.15838552524433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lyrics-to-melody generation is an interesting and challenging topic
in the AI music research field. Due to the difficulty of learning the
correlations between lyrics and melody, previous methods suffer from low
generation quality and a lack of controllability. Controllability of generative
models enables human interaction with models to generate desired content, which
is especially important in music generation tasks toward human-centered AI that
can facilitate musicians' creative activities. To address these issues, we
propose a controllable lyrics-to-melody generation network, ConL2M, which is
able to generate realistic melodies from lyrics in a user-desired musical
style. Our work contains three main novelties: 1) To model the dependencies of
music attributes across multiple sequences, inter-branch memory fusion (Memofu)
is proposed to enable information flow within the multi-branch stacked LSTM
architecture; 2) Reference style embedding (RSE) is proposed to improve the
quality of generation as well as control the musical style of generated
melodies; 3) Sequence-level statistical loss (SeqLoss) is proposed to help the
model learn sequence-level features of melodies given lyrics. Evaluated with
metrics for music quality and controllability, this initial study of
controllable lyrics-to-melody generation shows better generation quality and
the feasibility of interacting with users to generate melodies in desired
musical styles when given lyrics.
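To make the three components named in the abstract more concrete, below is a minimal, hedged PyTorch sketch of how an inter-branch memory-fusion step, a reference style embedding fed to each branch, and a sequence-level statistical loss might fit together. All module names, tensor shapes, and the specific fusion and loss formulas are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch (assumptions only) of three ideas from the abstract:
# (1) inter-branch fusion across multi-branch stacked LSTM states,
# (2) a reference style embedding concatenated to every branch input,
# (3) a sequence-level statistical loss on simple sequence statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusedBranchLSTM(nn.Module):
    """Several attribute branches (e.g. pitch, duration, rest) whose LSTM
    hidden states are mixed by a learned fusion layer at every time step."""

    def __init__(self, in_dim: int, hid_dim: int, style_dim: int, n_branches: int = 3):
        super().__init__()
        self.n_branches = n_branches
        self.hid_dim = hid_dim
        self.cells = nn.ModuleList(
            [nn.LSTMCell(in_dim + style_dim, hid_dim) for _ in range(n_branches)]
        )
        # Fusion: each branch sees a linear mix of all branches' hidden states.
        self.fuse = nn.Linear(n_branches * hid_dim, n_branches * hid_dim)

    def forward(self, x_seq: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # x_seq: (batch, time, in_dim); style: (batch, style_dim)
        batch, T, _ = x_seq.shape
        h = [x_seq.new_zeros(batch, self.hid_dim) for _ in range(self.n_branches)]
        c = [x_seq.new_zeros(batch, self.hid_dim) for _ in range(self.n_branches)]
        outputs = []
        for t in range(T):
            step_in = torch.cat([x_seq[:, t], style], dim=-1)
            for b, cell in enumerate(self.cells):
                h[b], c[b] = cell(step_in, (h[b], c[b]))
            # Inter-branch fusion: mix hidden states, then split back per branch.
            fused = self.fuse(torch.cat(h, dim=-1))
            h = list(fused.split(self.hid_dim, dim=-1))
            outputs.append(torch.stack(h, dim=1))  # (batch, n_branches, hid_dim)
        return torch.stack(outputs, dim=1)  # (batch, time, n_branches, hid_dim)


def sequence_stat_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Toy sequence-level loss: match the per-sequence mean and variance of a
    predicted attribute (e.g. pitch) to those of the reference melody."""
    mean_term = F.mse_loss(pred.mean(dim=1), target.mean(dim=1))
    var_term = F.mse_loss(pred.var(dim=1), target.var(dim=1))
    return mean_term + var_term


if __name__ == "__main__":
    model = FusedBranchLSTM(in_dim=32, hid_dim=64, style_dim=16)
    lyrics_emb = torch.randn(2, 20, 32)   # stand-in lyric embeddings
    ref_style = torch.randn(2, 16)        # stand-in reference style embedding
    states = model(lyrics_emb, ref_style)
    print(states.shape)                   # torch.Size([2, 20, 3, 64])
    print(sequence_stat_loss(torch.randn(2, 20), torch.randn(2, 20)))
```

In this sketch the fusion is a single linear mix of the branches' hidden states at every step and the loss matches only mean and variance; the paper's actual Memofu, RSE, and SeqLoss formulations may differ substantially.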
Related papers
- SongCreator: Lyrics-based Universal Song Generation [53.248473603201916]
SongCreator is a song-generation system designed to tackle the challenge of generating songs with both vocals and accompaniment given lyrics.
The model features two novel designs: a meticulously designed dual-sequence language model (DSLM) to capture the information of vocals and accompaniment for song generation, and a series of attention mask strategies for the DSLM.
Experiments demonstrate the effectiveness of SongCreator by achieving state-of-the-art or competitive performances on all eight tasks.
arXiv Detail & Related papers (2024-09-09T19:37:07Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- Interpretable Melody Generation from Lyrics with Discrete-Valued Adversarial Training [12.02541352832997]
Gumbel-Softmax is exploited to solve the non-differentiability problem of generating discrete music attributes with Generative Adversarial Networks (GANs); a brief illustrative sketch follows this entry.
Users can listen to the generated AI song as well as recreate a new song by selecting from recommended music attributes.
arXiv Detail & Related papers (2022-06-30T05:45:47Z)
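As a quick, hedged illustration of the Gumbel-Softmax trick mentioned in the entry above (not that paper's actual model), the snippet below shows straight-through Gumbel-Softmax sampling over a small discrete attribute vocabulary, which keeps the sampled choice differentiable; the vocabulary size and temperature are arbitrary assumptions.

```python
# Hedged sketch: straight-through Gumbel-Softmax over a discrete attribute
# vocabulary (e.g. 12 pitch classes), so a generator's discrete choices stay
# differentiable. Vocabulary size and temperature are illustrative only.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 12, requires_grad=True)         # (batch, n_attribute_values)
one_hot = F.gumbel_softmax(logits, tau=0.8, hard=True)   # discrete forward, soft backward
print(one_hot.argmax(dim=-1))                            # sampled attribute indices
one_hot.sum().backward()                                 # gradients still reach the logits
print(logits.grad.shape)                                 # torch.Size([4, 12])
```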
- SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation.
It leverages masked sequence-to-sequence (MASS) pre-training and attention-based alignment modeling.
We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z)
- Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)