ComposerX: Multi-Agent Symbolic Music Composition with LLMs
- URL: http://arxiv.org/abs/2404.18081v2
- Date: Tue, 30 Apr 2024 14:14:26 GMT
- Title: ComposerX: Multi-Agent Symbolic Music Composition with LLMs
- Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo,
- Abstract summary: Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts.
We propose ComposerX, an agent-based symbolic music generation framework.
- Score: 51.68908082829048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts. To further explore and enhance LLMs' potential in music composition by leveraging their reasoning ability and the large knowledge base in music history and theory, we propose ComposerX, an agent-based symbolic music generation framework. We find that applying a multi-agent approach significantly improves the music composition quality of GPT-4. The results demonstrate that ComposerX is capable of producing coherent polyphonic music compositions with captivating melodies, while adhering to user instructions.
Related papers
- A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding.
We investigated, analyzed, and tested recent large-scale music foundation models in respect of their music comprehension abilities.
arXiv Detail & Related papers (2024-09-15T03:34:14Z) - Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation [31.825105824490464]
Symbolic Music, akin to language, can be encoded in discrete symbols.
Recent research has extended the application of large language models (LLMs) to the symbolic music domain.
This study conducts a thorough investigation of LLMs' capability and limitations in symbolic music processing.
arXiv Detail & Related papers (2024-07-31T11:29:46Z) - MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z) - SongComposer: A Large Language Model for Lyric and Melody Composition in
Song Generation [88.33522730306674]
SongComposer could understand and generate melodies and lyrics in symbolic song representations.
We resort to symbolic song representation, the mature and efficient way humans designed for music.
With extensive experiments, SongComposer demonstrates superior performance in lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation.
arXiv Detail & Related papers (2024-02-27T16:15:28Z) - ChatMusician: Understanding and Generating Music Intrinsically with LLM [81.48629006702409]
ChatMusician is an open-source Large Language Models (LLMs) that integrates intrinsic musical abilities.
It can understand and generate music with a pure text tokenizer without any external multi-modal neural structures or tokenizers.
Our model is capable of composing well-structured, full-length music, conditioned on texts, chords, melodies, motifs, musical forms, etc.
arXiv Detail & Related papers (2024-02-25T17:19:41Z) - ByteComposer: a Human-like Melody Composition Method based on Language
Model Agent [11.792129708566598]
Large Language Models (LLM) have shown encouraging progress in multimodal understanding and generation tasks.
We propose ByteComposer, an agent framework emulating a human's creative pipeline in four separate steps.
We conduct extensive experiments on GPT4 and several open-source large language models, which substantiate our framework's effectiveness.
arXiv Detail & Related papers (2024-02-24T04:35:07Z) - Music Composition with Deep Learning: A Review [1.7188280334580197]
We analyze the ability of current Deep Learning models to generate music with creativity.
We compare these models to the music composition process from a theoretical point of view.
arXiv Detail & Related papers (2021-08-27T13:53:53Z) - MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from the symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z) - Generative Deep Learning for Virtuosic Classical Music: Generative
Adversarial Networks as Renowned Composers [0.0]
Current AI-generated music lacks fundamental principles of good compositional techniques.
We can create a better understanding of what parameters are necessary for a composition nearly indistinguishable from a master composition.
arXiv Detail & Related papers (2021-01-01T05:40:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.