POP909: A Pop-song Dataset for Music Arrangement Generation
- URL: http://arxiv.org/abs/2008.07142v1
- Date: Mon, 17 Aug 2020 08:08:14 GMT
- Title: POP909: A Pop-song Dataset for Music Arrangement Generation
- Authors: Ziyu Wang, Ke Chen, Junyan Jiang, Yiyi Zhang, Maoran Xu, Shuqi Dai,
Xianbin Gu, Gus Xia
- Abstract summary: We propose POP909, a dataset which contains multiple versions of the piano arrangements of 909 popular songs created by professional musicians.
The main body of the dataset contains the vocal melody, the lead instrument melody, and the piano accompaniment for each song in MIDI format, which are aligned to the original audio files.
We provide the annotations of tempo, beat, key, and chords, where the tempo curves are hand-labeled and others are done by MIR algorithms.
- Score: 10.0454303747519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Music arrangement generation is a subtask of automatic music generation,
which involves reconstructing and re-conceptualizing a piece with new
compositional techniques. Such a generation process inevitably requires
reference from the original melody, chord progression, or other structural
information. Although some promising arrangement models exist, they lack
refined data for better evaluation and more practical results. In this
paper, we propose POP909, a dataset which contains multiple versions of the
piano arrangements of 909 popular songs created by professional musicians. The
main body of the dataset contains the vocal melody, the lead instrument melody,
and the piano accompaniment for each song in MIDI format, which are aligned to
the original audio files. Furthermore, we provide the annotations of tempo,
beat, key, and chords, where the tempo curves are hand-labeled and others are
done by MIR algorithms. Finally, we conduct several baseline experiments with
this dataset using standard deep music generation algorithms.
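As a rough picture of how such a dataset can be loaded, here is a minimal sketch using the pretty_midi library. The folder layout, track names, and annotation file names are assumptions based on the public POP909 release, not details stated in the abstract.

```python
# Minimal sketch: inspect one POP909 song with pretty_midi.
# Assumed layout (from the public release): POP909/<id>/<id>.mid with
# separate instrument tracks for the vocal melody, the lead-instrument
# melody, and the piano accompaniment; names are illustrative.
import pretty_midi

pm = pretty_midi.PrettyMIDI("POP909/001/001.mid")

# The three aligned parts are stored as separate MIDI instruments.
for inst in pm.instruments:
    print(f"track={inst.name!r}  notes={len(inst.notes)}")

# The hand-labeled tempo curve appears as MIDI tempo-change events.
times, tempi = pm.get_tempo_changes()
for t, bpm in zip(times, tempi):
    print(f"t={t:7.2f}s  {bpm:6.1f} BPM")

# Beat, key, and chord annotations ship as plain-text files next to
# the MIDI (file name and column layout assumed): start, end, label.
with open("POP909/001/chord_midi.txt") as f:
    for line in f:
        start, end, chord = line.split()
        print(start, end, chord)
```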
Related papers
- Melody Is All You Need For Music Generation [10.366088659024685]
We present the Melody Guided Music Generation (MMGen) model, the first approach to use the melody to guide music generation.
Specifically, we first align the melody with audio waveforms and their associated descriptions using the multimodal alignment module.
This allows MMGen to generate music that matches the style of the provided audio while also producing music that reflects the content of the given text description.
arXiv Detail & Related papers (2024-09-30T11:13:35Z)
- MidiCaps: A large-scale MIDI dataset with text captions [6.806050368211496]
This work aims to enable research that combines LLMs with symbolic music by presenting MidiCaps, the first openly available large-scale MIDI dataset with text captions.
Inspired by recent advancements in captioning techniques, we present a curated dataset of over 168k MIDI files with textual descriptions.
arXiv Detail & Related papers (2024-06-04T12:21:55Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenge of misaligned measures across tracks during generation, we propose Synchronized Multi-Track ABC Notation (SMT-ABC).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
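The measure-synchronization idea can be pictured with a toy sketch. It illustrates the general concept only, not MuPT's actual tokenizer: each ABC voice is split into bars, and the bars are interleaved so that measure k of every track sits together in one flat sequence.

```python
# Toy illustration of synchronizing measures across tracks in the
# spirit of SMT-ABC (a sketch of the idea, not MuPT's real notation):
# interleave the bars of each ABC voice so that measure k of every
# track appears together in the flattened sequence.
def interleave_bars(tracks: list[str]) -> str:
    barred = [[b.strip() for b in t.split("|") if b.strip()] for t in tracks]
    n_bars = max(len(bars) for bars in barred)
    out = []
    for k in range(n_bars):
        for voice, bars in enumerate(barred, start=1):
            if k < len(bars):
                # Tag each bar with its voice so tracks stay separable.
                out.append(f"[V:{voice}] {bars[k]} |")
    return " ".join(out)

melody = "C2 E2 G2 c2 | B2 G2 E2 C2 | C8 |"
bass   = "C,8 | G,,8 | C,8 |"
print(interleave_bars([melody, bass]))
# [V:1] C2 E2 G2 c2 | [V:2] C,8 | [V:1] B2 G2 E2 C2 | [V:2] G,,8 | ...
```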
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models [42.2977676825086]
In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models.
Our framework fortifies the U-Net with multi-scale aggregation in order to maintain consistency before and after editing.
Our proposed method significantly surpasses preceding systems in music quality, text relevance and harmony.
arXiv Detail & Related papers (2023-08-28T07:11:42Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen comprises a single-stage transformer LM together with efficient token interleaving patterns.
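The interleaving idea can be made concrete with a small sketch of a "delay" pattern over parallel codebook streams; the padding symbol and helper below are illustrative assumptions, not MusicGen's actual implementation.

```python
# Sketch of a "delay" interleaving pattern over K parallel codebook
# streams: codebook k is shifted right by k steps, so one autoregressive
# pass can predict all codebooks while respecting their dependencies.
# (Padding token and layout are assumptions, not MusicGen's real code.)
PAD = -1  # placeholder for positions with no token yet

def delay_pattern(streams: list[list[int]]) -> list[list[int]]:
    """Return K rows of length T+K-1 with row k shifted right by k."""
    K, T = len(streams), len(streams[0])
    out = [[PAD] * (T + K - 1) for _ in range(K)]
    for k in range(K):
        for t in range(T):
            out[k][t + k] = streams[k][t]
    return out

codes = [[10, 11, 12, 13],   # codebook 0 over four timesteps
         [20, 21, 22, 23]]   # codebook 1 over the same timesteps
for row in delay_pattern(codes):
    print(row)
# [10, 11, 12, 13, -1]
# [-1, 20, 21, 22, 23]
```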
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
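One way to picture "compiling the melody into decoding constraints" is a syllable-count rule: each phrase of n notes must be matched by a lyric line of n syllables. The sketch below uses a crude vowel-group heuristic and illustrative names; the paper's actual constraint compilation is richer.

```python
# Toy sketch: one syllable per melody note, checked per phrase.
# The syllable counter is a crude heuristic for illustration only.
import re

def naive_syllables(word: str) -> int:
    # Count vowel groups as a rough proxy for syllables.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fits_phrase(lyric_line: str, notes_in_phrase: int) -> bool:
    """Decoding constraint: syllable count must equal the note count."""
    total = sum(naive_syllables(w) for w in lyric_line.split())
    return total == notes_in_phrase

# A 7-note phrase admits this 7-syllable line:
print(fits_phrase("twinkle twinkle little star", 7))  # True
```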
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instruments and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- ComMU: Dataset for Combinatorial Music Generation [20.762884001498627]
Combinatorial music generation creates short samples of music with rich musical metadata, and combines them to produce a complete piece of music.
ComMU is the first symbolic music dataset consisting of short music samples and their corresponding 12 types of musical metadata.
Our results show that we can generate diverse, high-quality music from metadata alone, and that unique metadata such as track role and extended chord quality improve the capacity of automatic composition.
arXiv Detail & Related papers (2022-11-17T07:25:09Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- Controllable deep melody generation via hierarchical music structure representation [14.891975420982511]
MusicFrameworks combines a hierarchical music structure representation with a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- PopMAG: Pop Music Accompaniment Generation [190.09996798215738]
We propose a novel MUlti-track MIDI representation (MuMIDI) which enables simultaneous multi-track generation in a single sequence.
MuMIDI increases the sequence length and brings the new challenge of long-term music modeling.
We call our system for pop music accompaniment generation PopMAG.
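A toy sketch of the single-sequence idea, with an illustrative token vocabulary rather than PopMAG's exact MuMIDI scheme: shared bar and position markers are emitted once, and each note carries a track token, so several tracks interleave in one sequence.

```python
# Toy single-sequence multi-track encoding in the spirit of MuMIDI
# (token names are illustrative, not PopMAG's actual vocabulary):
# bar/position markers are shared across tracks, and each note is
# preceded by a track token so all tracks live in one sequence.
notes = [  # (bar, position, track, pitch), already quantized
    (0, 0, "melody", 72),
    (0, 0, "piano",  48),
    (0, 8, "piano",  55),
    (1, 0, "melody", 74),
]

tokens, cur_bar, cur_pos = [], None, None
for bar, pos, track, pitch in sorted(notes):
    if bar != cur_bar:            # emit a shared bar marker once
        tokens.append(f"<bar:{bar}>")
        cur_bar, cur_pos = bar, None
    if pos != cur_pos:            # emit a shared position marker once
        tokens.append(f"<pos:{pos}>")
        cur_pos = pos
    tokens += [f"<track:{track}>", f"<pitch:{pitch}>"]
print(" ".join(tokens))
# <bar:0> <pos:0> <track:melody> <pitch:72> <track:piano> <pitch:48> ...
```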
arXiv Detail & Related papers (2020-08-18T02:28:36Z)