Pop2Piano : Pop Audio-based Piano Cover Generation
- URL: http://arxiv.org/abs/2211.00895v2
- Date: Sat, 1 Apr 2023 06:02:16 GMT
- Title: Pop2Piano : Pop Audio-based Piano Cover Generation
- Authors: Jongho Choi, Kyogu Lee
- Abstract summary: We present Pop2Piano, a Transformer network that generates piano covers given waveforms of pop music.
To the best of our knowledge, this is the first model to generate a piano cover directly from pop audio without using melody and chord extraction modules.
- Score: 14.901465561297178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Piano covers of pop music are enjoyed by many people. However, the task of
automatically generating piano covers of pop music is still understudied. This
is partly due to the lack of synchronized {Pop, Piano Cover} data pairs, which
made it challenging to apply the latest data-intensive deep learning-based
methods. To leverage the power of the data-driven approach, we make a large
amount of paired and synchronized {Pop, Piano Cover} data using an automated
pipeline. In this paper, we present Pop2Piano, a Transformer network that
generates piano covers given waveforms of pop music. To the best of our
knowledge, this is the first model to generate a piano cover directly from pop
audio without using melody and chord extraction modules. We show that
Pop2Piano, trained with our dataset, is capable of producing plausible piano
covers.
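The core idea of the abstract, generating a piano cover as a discrete sequence of note tokens, can be illustrated with a toy tokenizer. The vocabulary below (beat-level `SHIFT` and `NOTE_<pitch>` tokens) is a hypothetical simplification for illustration, not Pop2Piano's actual token scheme.

```python
# Toy beat-quantized note tokenizer, loosely in the spirit of
# audio-to-MIDI generation models such as Pop2Piano.
# The token vocabulary here is a hypothetical simplification.

def encode(notes):
    """notes: list of (beat, midi_pitch) pairs, sorted by beat.
    Emits SHIFT tokens to advance time and NOTE tokens for pitches."""
    tokens, current_beat = [], 0
    for beat, pitch in notes:
        while current_beat < beat:
            tokens.append("SHIFT")      # advance one beat
            current_beat += 1
        tokens.append(f"NOTE_{pitch}")  # sound this pitch at the current beat
    return tokens

def decode(tokens):
    """Inverse of encode: rebuild the (beat, pitch) pairs."""
    notes, current_beat = [], 0
    for tok in tokens:
        if tok == "SHIFT":
            current_beat += 1
        else:
            notes.append((current_beat, int(tok.split("_")[1])))
    return notes

# C major chord on beat 0, then single notes on beats 1 and 3.
melody = [(0, 60), (0, 64), (1, 67), (3, 72)]
toks = encode(melody)
assert decode(toks) == melody  # lossless round trip
```

A decoder network trained on such sequences can emit the tokens autoregressively, after which they are deterministically rendered back to MIDI.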
Related papers
- PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations [21.52466727496551]
We introduce PianoMime, a framework for training a piano-playing agent using internet demonstrations.
In our work, we leverage these demonstrations to learn a generalist piano-playing agent capable of playing any arbitrary song.
We show that we are able to learn a policy with up to 56% F1 score on unseen songs.
arXiv Detail & Related papers (2024-07-25T16:37:07Z)
- PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training [8.484581633133542]
PianoBART is a pre-trained model that uses BART for both symbolic piano music generation and understanding.
We devise a multi-level object selection strategy for different pre-training tasks of PianoBART, which can prevent information leakage or loss.
Experiments demonstrate that PianoBART efficiently learns musical patterns and achieves outstanding performance in generating high-quality coherent pieces.
arXiv Detail & Related papers (2024-06-26T03:35:54Z)
- Modeling Bends in Popular Music Guitar Tablatures [49.64902130083662]
Tablature notation is widely used in popular music to transcribe and share guitar musical content.
This paper focuses on bends, which let the player progressively shift the pitch of a note, thereby circumventing the physical limitations of the discrete fretted fingerboard.
Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree successfully predicts bend occurrences with an F1 score of 0.71 and a limited number of false positive predictions.
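The F1 score reported above is the harmonic mean of precision and recall over binary predictions (bend vs. no bend). A minimal computation, with made-up example labels:

```python
# Minimal F1 computation from binary labels, as used to report
# bend-prediction performance. The example labels are made up.

def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
# precision = 2/3, recall = 2/3, so F1 = 2/3
assert abs(f1_score(y_true, y_pred) - 2 / 3) < 1e-9
```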
arXiv Detail & Related papers (2023-08-22T07:50:58Z)
- At Your Fingertips: Extracting Piano Fingering Instructions from Videos [45.643494669796866]
We consider the AI task of automating the extraction of fingering information from videos.
We show how to perform this task with high-accuracy using a combination of deep-learning modules.
We run the resulting system on 90 videos, resulting in high-quality piano fingering information of 150K notes.
arXiv Detail & Related papers (2023-03-07T09:09:13Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- The Piano Inpainting Application [0.0]
Generative algorithms are still not widely used by artists due to the limited control they offer, prohibitive inference times, or the lack of integration within musicians' workflows.
In this work, we present the Piano Inpainting Application (PIA), a generative model focused on inpainting piano performances.
arXiv Detail & Related papers (2021-07-13T09:33:11Z)
- Towards Learning to Play Piano with Dexterous Hands and Touch [79.48656721563795]
We demonstrate how an agent can learn directly from machine-readable music score to play the piano with dexterous hands on a simulated piano.
We achieve this by using a touch-augmented reward and a novel curriculum of tasks.
arXiv Detail & Related papers (2021-06-03T17:59:31Z)
- PopMAG: Pop Music Accompaniment Generation [190.09996798215738]
We propose a novel MUlti-track MIDI representation (MuMIDI) which enables simultaneous multi-track generation in a single sequence.
MuMIDI enlarges the sequence length and brings the new challenge of long-term music modeling.
We call our system for pop music accompaniment generation PopMAG.
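The single-sequence, multi-track idea behind MuMIDI can be sketched as follows. The `<bar>` and `<track:...>` token names and the track layout are hypothetical, chosen only to illustrate how several tracks can be interleaved into one sequence for a single decoder.

```python
# Hypothetical sketch of a multi-track-in-one-sequence encoding,
# loosely inspired by MuMIDI: each bar interleaves all tracks,
# prefixing each track's notes with a track token.

def flatten(bars):
    """bars: list of dicts mapping track name -> list of MIDI pitches.
    Returns one flat token sequence covering every track, bar by bar."""
    seq = []
    for bar in bars:
        seq.append("<bar>")                  # bar boundary token
        for track in sorted(bar):            # fixed track order per bar
            seq.append(f"<track:{track}>")   # switch the active track
            seq.extend(str(p) for p in bar[track])
    return seq

song = [
    {"melody": [60, 62], "bass": [36]},
    {"melody": [64], "bass": [38, 40]},
]
seq = flatten(song)
# A single sequence now carries both tracks, which is what allows
# one autoregressive model to generate them jointly.
```

The trade-off the abstract notes follows directly: packing every track into one sequence multiplies its length, pushing the model toward long-context techniques.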
arXiv Detail & Related papers (2020-08-18T02:28:36Z)
- POP909: A Pop-song Dataset for Music Arrangement Generation [10.0454303747519]
We propose POP909, a dataset which contains multiple versions of the piano arrangements of 909 popular songs created by professional musicians.
The main body of the dataset contains the vocal melody, the lead instrument melody, and the piano accompaniment for each song in MIDI format, which are aligned to the original audio files.
We provide the annotations of tempo, beat, key, and chords, where the tempo curves are hand-labeled and others are done by MIR algorithms.
arXiv Detail & Related papers (2020-08-17T08:08:14Z)
- Camera-Based Piano Sheet Music Identification [19.850248946069023]
We use all solo piano sheet music images in the entire IMSLP dataset as a searchable database.
We propose a novel hashing scheme called dynamic n-gram fingerprinting that significantly reduces runtime.
In experiments on IMSLP data, our proposed method achieves a mean reciprocal rank of 0.85 and an average runtime of 0.98 seconds per query.
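The basic hashing-and-lookup idea behind n-gram fingerprinting can be sketched with a fixed n. Note this is not the paper's dynamic scheme, which varies the n-gram length adaptively; the function names and toy data below are illustrative assumptions.

```python
# Toy fixed-length n-gram fingerprint index for symbolic sequences.
# The paper's "dynamic n-gram" scheme adapts n per fingerprint; this
# fixed-n sketch only shows the underlying hashing-and-voting idea.
from collections import defaultdict

def build_index(database, n=3):
    """database: dict mapping doc_id -> feature sequence.
    Maps each n-gram to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, seq in database.items():
        for i in range(len(seq) - n + 1):
            index[tuple(seq[i:i + n])].add(doc_id)
    return index

def query(index, seq, n=3):
    """Vote for the document sharing the most n-grams with the query."""
    votes = defaultdict(int)
    for i in range(len(seq) - n + 1):
        for doc_id in index.get(tuple(seq[i:i + n]), ()):
            votes[doc_id] += 1
    return max(votes, key=votes.get) if votes else None

db = {"sonata": [1, 2, 3, 4, 5], "etude": [9, 8, 7, 6, 5]}
idx = build_index(db)
assert query(idx, [2, 3, 4]) == "sonata"
```

Because lookups are hash-table probes, query time scales with the query length rather than the database size, which is what makes sub-second retrieval over all of IMSLP plausible.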
arXiv Detail & Related papers (2020-07-29T03:55:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.