SongComposer: A Large Language Model for Lyric and Melody Composition in
Song Generation
- URL: http://arxiv.org/abs/2402.17645v1
- Date: Tue, 27 Feb 2024 16:15:28 GMT
- Authors: Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Conghui
He, Dahua Lin, Jiaqi Wang
- Abstract summary: SongComposer can understand and generate melodies and lyrics in symbolic song representations.
We adopt symbolic song representation, the mature and efficient format humans designed for music.
In extensive experiments, SongComposer demonstrates superior performance in lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation.
- Score: 88.33522730306674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present SongComposer, an innovative LLM designed for song composition. It
can understand and generate melodies and lyrics in symbolic song
representations by leveraging the capabilities of LLMs. Existing music-related
LLMs treat music as quantized audio signals, but such implicit encoding is
inefficient and inflexible. In contrast, we adopt symbolic song representation,
the mature and efficient format humans designed for music, and enable the LLM
to explicitly compose songs the way humans do. In practice, we design a novel
tuple format that aligns each lyric with three note attributes (pitch,
duration, and rest duration) in the melody, which ensures the LLM correctly
understands musical symbols and achieves precise alignment between lyrics and
melody. To impart basic music understanding to the LLM, we carefully collected
SongCompose-PT, a large-scale song pretraining dataset that includes lyrics,
melodies, and paired lyrics-melodies in either Chinese or English. After
adequate pre-training, 10K carefully crafted QA pairs are used to give the LLM
instruction-following capability and the ability to solve diverse tasks. In
extensive experiments, SongComposer demonstrates superior performance in
lyric-to-melody generation, melody-to-lyric generation, song continuation, and
text-to-song creation, outperforming advanced LLMs such as GPT-4.
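The abstract's tuple format can be sketched in code. This is a minimal illustrative sketch, not the paper's actual tokenization: the class name `NoteTuple`, the helper `to_prompt`, and the `<...>` delimiter style are all assumptions; the paper only specifies that each lyric unit is paired with pitch, duration, and rest duration.

```python
# Hypothetical sketch of the lyric-melody tuple alignment described in the
# abstract. Names and serialization format are illustrative assumptions,
# not taken from the paper.
from dataclasses import dataclass

@dataclass
class NoteTuple:
    lyric: str       # one syllable/word of the lyric
    pitch: str       # note pitch in scientific pitch notation, e.g. "C4"
    duration: float  # note length in beats
    rest: float      # rest after the note, in beats

def to_prompt(tuples):
    """Serialize aligned tuples into a flat text sequence an LLM can read."""
    return " | ".join(
        f"<{t.lyric}, {t.pitch}, {t.duration}, {t.rest}>" for t in tuples
    )

# A toy four-note fragment with one lyric syllable per note.
song = [
    NoteTuple("Twin", "C4", 0.5, 0.0),
    NoteTuple("kle", "C4", 0.5, 0.0),
    NoteTuple("twin", "G4", 0.5, 0.0),
    NoteTuple("kle", "G4", 0.5, 0.5),
]
print(to_prompt(song))
```

Because each tuple carries both the lyric and the note attributes, lyric-melody alignment is explicit in the sequence itself, which is the property the abstract claims the tuple design guarantees.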
Related papers
- Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation [31.825105824490464]
Symbolic Music, akin to language, can be encoded in discrete symbols.
Recent research has extended the application of large language models (LLMs) to the symbolic music domain.
This study conducts a thorough investigation of LLMs' capability and limitations in symbolic music processing.
arXiv Detail & Related papers (2024-07-31T11:29:46Z)
- ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z)
- ChatMusician: Understanding and Generating Music Intrinsically with LLM [81.48629006702409]
ChatMusician is an open-source Large Language Model (LLM) with integrated intrinsic musical abilities.
It can understand and generate music with a pure text tokenizer without any external multi-modal neural structures or tokenizers.
Our model is capable of composing well-structured, full-length music, conditioned on texts, chords, melodies, motifs, musical forms, etc.
arXiv Detail & Related papers (2024-02-25T17:19:41Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z)
- SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation.
It leverages masked sequence to sequence (MASS) pre-training and attention based alignment modeling.
We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.