Melon Playlist Dataset: a public dataset for audio-based playlist
generation and music tagging
- URL: http://arxiv.org/abs/2102.00201v1
- Date: Sat, 30 Jan 2021 10:13:10 GMT
- Title: Melon Playlist Dataset: a public dataset for audio-based playlist
generation and music tagging
- Authors: Andres Ferraro, Yuntae Kim, Soohyeon Lee, Biho Kim, Namjun Jo, Semi
Lim, Suyon Lim, Jungtaek Jang, Sehwan Kim, Xavier Serra, Dmitry Bogdanov
- Abstract summary: We present a public dataset of mel-spectrograms for 649,091 tracks and 148,826 associated playlists annotated by 30,652 different tags.
All the data is gathered from Melon, a popular Korean streaming service.
The dataset is suitable for music information retrieval tasks, in particular, auto-tagging and automatic playlist continuation.
- Score: 8.658926288789164
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: One of the main limitations in the field of audio signal processing is the
lack of large public datasets with audio representations and high-quality
annotations due to restrictions of copyrighted commercial music. We present
Melon Playlist Dataset, a public dataset of mel-spectrograms for 649,091 tracks
and 148,826 associated playlists annotated by 30,652 different tags. All the
data is gathered from Melon, a popular Korean streaming service. The dataset is
suitable for music information retrieval tasks, in particular, auto-tagging and
automatic playlist continuation. Even though the latter can be addressed by
collaborative filtering approaches, audio provides opportunities for research
on track suggestions and building systems resistant to the cold-start problem,
for which we provide a baseline. Moreover, the playlists and the annotations
included in the Melon Playlist Dataset make it suitable for metric learning and
representation learning.
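The playlist–tag structure described in the abstract can be sketched in a few lines. This is a minimal illustration only: the field names ("id", "songs", "tags") are assumptions for the sake of the example, not the dataset's documented schema.

```python
# Hedged sketch: building a tag-to-track index from playlist records.
# Field names ("id", "songs", "tags") are illustrative assumptions,
# not the Melon Playlist Dataset's documented schema.
from collections import defaultdict


def build_tag_index(playlists):
    """Map each tag to the set of track IDs appearing in playlists with that tag."""
    index = defaultdict(set)
    for pl in playlists:
        for tag in pl["tags"]:
            index[tag].update(pl["songs"])
    return index


# Toy records in the assumed shape (integer track IDs, string tags).
playlists = [
    {"id": 1, "songs": [10, 11], "tags": ["ballad"]},
    {"id": 2, "songs": [11, 12], "tags": ["ballad", "rain"]},
]
index = build_tag_index(playlists)
```

An index like this is one simple starting point for the auto-tagging and playlist-continuation tasks the abstract mentions, e.g. retrieving candidate tracks for a tag before ranking them with an audio model.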
Related papers
- MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing [3.3162176082220975]
We present the MOSA (Music mOtion with Semantic Annotation) dataset, which contains high-quality 3-D motion capture data, aligned audio recordings, and note-by-note semantic annotations of pitch, beat, phrase, dynamics, articulation, and harmony for 742 professional music performances by 23 professional musicians.
To our knowledge, this is the largest cross-modal music dataset with note-level annotations to date.
arXiv Detail & Related papers (2024-06-10T15:37:46Z) - MuPT: A Generative Symbolic Music Pretrained Transformer [73.47607237309258]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z) - MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response [42.73982391253872]
MusiLingo is a novel system for music caption generation and music-related query responses.
We train it on an extensive music caption dataset and fine-tune it with instructional data.
Empirical evaluations demonstrate its competitive performance in generating music captions and composing music-related Q&A pairs.
arXiv Detail & Related papers (2023-09-15T19:31:40Z) - MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair and standardized assessment of representations from all open-source pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z) - GETMusic: Generating Any Music Tracks with a Unified Representation and
Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
arXiv Detail & Related papers (2023-05-18T09:53:23Z) - RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z) - Music Playlist Title Generation Using Artist Information [4.201869316472344]
We present an encoder-decoder model that generates a playlist title from a sequence of music tracks.
Comparing the track IDs and artist IDs as input sequences, we show that the artist-based approach significantly enhances the performance in terms of word overlap, semantic relevance, and diversity.
arXiv Detail & Related papers (2023-01-14T00:19:39Z) - Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z) - A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, totaling around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata on instrumentation, geography, and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z) - Unaligned Supervision For Automatic Music Transcription in The Wild [1.2183405753834562]
NoteEM is a method for simultaneously training a transcriber and aligning the scores to their corresponding performances.
We report SOTA note-level accuracy on the MAPS dataset, and large favorable margins in cross-dataset evaluations.
arXiv Detail & Related papers (2022-04-28T17:31:43Z) - Automatic Embedding of Stories Into Collections of Independent Media [5.188557858279645]
We look at how machine learning techniques can be used to automatically embed stories into collections of independent media.
We use models that extract the tempo of songs to make a music playlist follow a narrative arc.
arXiv Detail & Related papers (2021-11-03T13:36:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.