MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
- URL: http://arxiv.org/abs/2309.10738v2
- Date: Wed, 20 Sep 2023 10:56:07 GMT
- Title: MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
- Authors: Xinda Wu, Zhijie Huang, Kejun Zhang, Jiaxing Yu, Xu Tan, Tieyao Zhang,
Zihao Wang, Lingyun Sun
- Abstract summary: MelodyGLM is a multi-task pre-training framework for generating melodies with long-term structure.
We have constructed a large-scale symbolic melody dataset, MelodyNet, containing more than 0.4 million melody pieces.
- Score: 39.892059799407434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models have achieved impressive results in various music
understanding and generation tasks. However, existing pre-training methods for
symbolic melody generation struggle to capture multi-scale, multi-dimensional
structural information in note sequences, due to the domain knowledge
discrepancy between text and music. Moreover, the lack of available large-scale
symbolic melody datasets limits the pre-training improvement. In this paper, we
propose MelodyGLM, a multi-task pre-training framework for generating melodies
with long-term structure. We design the melodic n-gram and long span sampling
strategies to create local and global blank infilling tasks for modeling the
local and global structures in melodies. Specifically, we incorporate pitch
n-grams, rhythm n-grams, and their combined n-grams into the melodic n-gram
blank infilling tasks for modeling the multi-dimensional structures in
melodies. To this end, we have constructed a large-scale symbolic melody
dataset, MelodyNet, containing more than 0.4 million melody pieces. MelodyNet
is utilized for large-scale pre-training and domain-specific n-gram lexicon
construction. Both subjective and objective evaluations demonstrate that
MelodyGLM surpasses the standard and previous pre-training methods. In
particular, subjective evaluations show that, on the melody continuation task,
MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in
consistency, rhythmicity, structure, and overall quality, respectively.
Notably, MelodyGLM nearly matches the quality of human-composed melodies on the
melody inpainting task.
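The blank infilling tasks described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the note representation (`(pitch, duration)` tuples), the toy lexicon, the `[MASK]` token, the function names, and the span ratio are all assumptions made for illustration, and the abstract does not specify how MelodyGLM tokenizes melodies or builds its n-gram lexicon. The sketch finds spans that match a lexicon of pitch or rhythm n-grams (local structure), samples one long contiguous span (global structure), and replaces each chosen span with a mask token while keeping the removed notes as infilling targets.

```python
import random

MASK = "[MASK]"  # hypothetical mask token

def find_ngram_spans(notes, lexicon, key):
    """Return (start, end) spans whose per-note feature sequence is in the lexicon."""
    feats = [key(n) for n in notes]
    spans = []
    for ngram in lexicon:
        n = len(ngram)
        for i in range(len(feats) - n + 1):
            if tuple(feats[i:i + n]) == tuple(ngram):
                spans.append((i, i + n))
    return sorted(spans)

def sample_long_span(length, ratio, rng):
    """Sample one contiguous span covering roughly `ratio` of the sequence."""
    n = max(1, min(length, int(length * ratio)))
    start = rng.randint(0, length - n)  # randint is inclusive on both ends
    return (start, start + n)

def blank_infill(notes, spans):
    """Replace each non-overlapping span with MASK; return (corrupted, targets)."""
    chosen, last_end = [], 0
    for s, e in sorted(spans):
        if s >= last_end:  # skip spans that overlap an earlier choice
            chosen.append((s, e))
            last_end = e
    corrupted, targets, pos = [], [], 0
    for s, e in chosen:
        corrupted.extend(notes[pos:s])
        corrupted.append(MASK)
        targets.append(notes[s:e])
        pos = e
    corrupted.extend(notes[pos:])
    return corrupted, targets

# Toy melody: (MIDI pitch, duration in beats). Values are illustrative only.
notes = [(60, 1), (62, 1), (64, 2), (60, 1), (62, 1), (67, 4)]

# Local task: mask every occurrence of a pitch n-gram from a toy lexicon.
pitch_spans = find_ngram_spans(notes, {(60, 62)}, key=lambda n: n[0])
local_input, local_targets = blank_infill(notes, pitch_spans)

# A rhythm n-gram uses the same routine with a different feature.
rhythm_spans = find_ngram_spans(notes, {(1, 1, 2)}, key=lambda n: n[1])

# Global task: mask one long span (ratio 0.5 is an arbitrary choice).
long_span = sample_long_span(len(notes), 0.5, random.Random(0))
global_input, global_targets = blank_infill(notes, [long_span])
```

In this toy example the pitch n-gram `(60, 62)` occurs twice, so `local_input` contains two mask tokens and `local_targets` holds the two removed note pairs. A combined pitch-and-rhythm n-gram would use `key=lambda n: n` with a lexicon of full note tuples.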
Related papers
- SongComposer: A Large Language Model for Lyric and Melody Composition in
Song Generation [88.33522730306674]
SongComposer could understand and generate melodies and lyrics in symbolic song representations.
We resort to symbolic song representation, the mature and efficient way humans designed for music.
With extensive experiments, SongComposer demonstrates superior performance in lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation.
arXiv Detail & Related papers (2024-02-27T16:15:28Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- WuYun: Exploring hierarchical skeleton-guided melody generation using
knowledge-enhanced deep learning [26.515527387450636]
WuYun is a knowledge-enhanced deep learning architecture for improving structure of generated melodies.
We use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them.
We demonstrate that WuYun can generate melodies with better long-term structure and musicality and outperforms other state-of-the-art methods by 0.51 on average.
arXiv Detail & Related papers (2023-01-11T14:33:42Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- Controllable deep melody generation via hierarchical music structure
representation [14.891975420982511]
MusicFrameworks is a hierarchical music structure representation and a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- Hierarchical Recurrent Neural Networks for Conditional Melody Generation
with Long-term Structure [0.0]
We propose a conditional melody generation model based on a hierarchical recurrent neural network.
This model generates melodies with long-term structures based on given chord accompaniments.
Results from our listening test indicate that CM-HRNN outperforms AttentionRNN in terms of long-term structure and overall rating.
arXiv Detail & Related papers (2021-02-19T08:22:26Z)
- SongMASS: Automatic Song Writing with Pre-training and Alignment
Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation.
It leverages masked sequence to sequence (MASS) pre-training and attention based alignment modeling.
We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z)
- PopMAG: Pop Music Accompaniment Generation [190.09996798215738]
We propose a novel MUlti-track MIDI representation (MuMIDI) which enables simultaneous multi-track generation in a single sequence.
MuMIDI enlarges the sequence length and brings the new challenge of long-term music modeling.
We call our pop music accompaniment generation system PopMAG.
arXiv Detail & Related papers (2020-08-18T02:28:36Z)
- Exploring Inherent Properties of the Monophonic Melody of Songs [10.055143995729415]
We propose a set of interpretable features on monophonic melody for computational purposes.
These features are defined not only in mathematical form, but also with some consideration of composers' intuition.
These features are considered by people universally in many genres of songs, even for atonal composition practices.
arXiv Detail & Related papers (2020-03-20T14:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.