SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training
- URL: http://arxiv.org/abs/2412.18107v1
- Date: Tue, 24 Dec 2024 02:30:07 GMT
- Title: SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training
- Authors: Jiaxing Yu, Xinda Wu, Yunfei Xu, Tieyao Zhang, Songruoyao Wu, Le Ma, Kejun Zhang,
- Abstract summary: SongGLM is a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training.
We construct a large-scale lyric-melody paired dataset comprising over 200,000 English song pieces for pre-training and fine-tuning.
- Score: 7.3026780262967685
- License:
- Abstract: Lyric-to-melody generation aims to automatically create melodies based on given lyrics, requiring the capture of complex and subtle correlations between them. However, previous works usually suffer from two main challenges: 1) lyric-melody alignment modeling, which is often simplified to one-syllable/word-to-one-note alignment, while others have the problem of low alignment accuracy; 2) lyric-melody harmony modeling, which usually relies heavily on intermediates or strict rules, limiting model's capabilities and generative diversity. In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies. Specifically, 1) we introduce a unified symbolic song representation for lyrics and melodies with word-level and phrase-level (2D) alignment encoding to capture the lyric-melody alignment; 2) we design a multi-task pre-training framework with hierarchical blank infilling objectives (n-gram, phrase, and long span), and incorporate lyric-melody relationships into the extraction of harmonized n-grams to ensure the lyric-melody harmony. We also construct a large-scale lyric-melody paired dataset comprising over 200,000 English song pieces for pre-training and fine-tuning. The objective and subjective results indicate that SongGLM can generate melodies from lyrics with significant improvements in both alignment and harmony, outperforming all the previous baseline methods.
Related papers
- REFFLY: Melody-Constrained Lyrics Editing Model [50.03960548399128]
We introduce REFFLY, the first revision framework designed to edit arbitrary forms of plain text draft into high-quality, full-fledged song lyrics.
Our approach ensures that the generated lyrics retain the original meaning of the draft, align with the melody, and adhere to the desired song structures.
arXiv Detail & Related papers (2024-08-30T23:22:34Z) - MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation [39.892059799407434]
MelodyGLM is a multi-task pre-training framework for generating melodies with long-term structure.
We have constructed a large-scale symbolic melody dataset, MelodyNet, containing more than 0.4 million melody pieces.
arXiv Detail & Related papers (2023-09-19T16:34:24Z) - Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z) - Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z) - Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z) - TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage
Method [92.36505210982648]
TeleMelody is a two-stage lyric-to-melody generation system with music template.
It generates melodies with higher quality, better controllability, and less requirement on paired lyric-melody data.
arXiv Detail & Related papers (2021-09-20T15:19:33Z) - SongMASS: Automatic Song Writing with Pre-training and Alignment
Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation.
It leverages masked sequence to sequence (MASS) pre-training and attention based alignment modeling.
We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.