Large-Scale MIDI-based Composer Classification
- URL: http://arxiv.org/abs/2010.14805v1
- Date: Wed, 28 Oct 2020 08:07:55 GMT
- Title: Large-Scale MIDI-based Composer Classification
- Authors: Qiuqiang Kong, Keunwoo Choi, Yuxuan Wang
- Abstract summary: We propose large-scale MIDI-based composer classification systems using GiantMIDI-Piano.
We are the first to investigate the composer classification problem with up to 100 composers.
Our system achieves 10-composer and 100-composer classification accuracies of 0.648 and 0.385, respectively.
- Score: 13.815200249190529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Music classification is the task of classifying a music piece into labels such as genres or composers. We propose large-scale MIDI-based composer classification systems using GiantMIDI-Piano, a transcription-based dataset. We propose to use piano rolls, onset rolls, and velocity rolls as input representations, and deep neural networks as classifiers. To our knowledge, we are the first to investigate the composer classification problem with up to 100 composers. Using convolutional recurrent neural networks as models, our MIDI-based composer classification system achieves 10-composer and 100-composer classification accuracies of 0.648 and 0.385 (evaluated on 30-second clips) and 0.739 and 0.489 (evaluated on whole music pieces), respectively. Our MIDI-based system outperforms several audio-based baseline classification systems, indicating the effectiveness of compact MIDI representations for composer classification.
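To make the input representation concrete, below is a minimal sketch of how a 30-second MIDI clip could be rendered into stacked piano-roll, onset-roll, and velocity-roll channels and fed to a small convolutional recurrent network. It assumes pretty_midi for parsing and PyTorch for the model; the 100 Hz frame rate, layer sizes, and pooling are illustrative choices, not the paper's exact configuration.

```python
import numpy as np
import pretty_midi
import torch
import torch.nn as nn

def midi_to_rolls(path: str, fs: int = 100, clip_seconds: int = 30) -> np.ndarray:
    """Render a MIDI file as stacked piano/onset/velocity rolls.

    Returns a (3, frames, 128) array: channel 0 marks active notes,
    channel 1 marks note onsets, channel 2 stores normalized velocity.
    The 100 Hz frame rate is an illustrative assumption.
    """
    pm = pretty_midi.PrettyMIDI(path)
    frames = clip_seconds * fs
    rolls = np.zeros((3, frames, 128), dtype=np.float32)
    for inst in pm.instruments:
        for note in inst.notes:
            start = int(note.start * fs)
            end = min(int(note.end * fs), frames)
            if start >= frames:
                continue
            rolls[0, start:end, note.pitch] = 1.0                     # piano roll
            rolls[1, start, note.pitch] = 1.0                         # onset roll
            rolls[2, start:end, note.pitch] = note.velocity / 127.0   # velocity roll
    return rolls

class CRNNComposerClassifier(nn.Module):
    """Toy CRNN: conv layers summarize local pitch-time patterns, a
    bidirectional GRU aggregates over time, a linear head classifies."""
    def __init__(self, n_composers: int = 100):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.gru = nn.GRU(64 * 32, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 128, n_composers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3, frames, 128) stacked rolls
        h = self.conv(x)                      # (batch, 64, frames/4, 32)
        h = h.permute(0, 2, 1, 3).flatten(2)  # (batch, frames/4, 64*32)
        out, _ = self.gru(h)
        return self.head(out.mean(dim=1))     # average over time -> logits
```

The abstract's piece-level results (0.739 and 0.489) could then be obtained by aggregating clip-level predictions across each piece; the exact aggregation rule is not specified in the abstract.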
Related papers
- Arabic Music Classification and Generation using Deep Learning [1.4721222689583375]
This paper proposes a machine learning approach for classifying classical and new Egyptian music by composer and generating new similar music.
The proposed system utilizes a convolutional neural network (CNN) for classification and a CNN autoencoder for generation.
The model achieves 81.4% accuracy in classifying the music by composer, demonstrating the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-10-25T17:47:08Z)
- Music Genre Classification using Large Language Models [50.750620612351284]
This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs) for music genre classification.
The proposed approach splits audio signals into 20 ms chunks and processes them through convolutional feature encoders.
During inference, predictions on individual chunks are aggregated for a final genre classification.
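The abstract does not specify the aggregation rule, but two common choices are averaging chunk logits and majority voting over per-chunk predictions; a small sketch of both, with hypothetical shapes, follows.

```python
import numpy as np

def mean_logit_genre(chunk_logits: np.ndarray) -> int:
    """Average per-chunk logits, then pick the genre (one common rule)."""
    return int(chunk_logits.mean(axis=0).argmax())

def majority_vote_genre(chunk_logits: np.ndarray) -> int:
    """Let each chunk vote for a genre and return the most frequent one."""
    votes = chunk_logits.argmax(axis=1)      # per-chunk genre indices
    return int(np.bincount(votes).argmax())

# chunk_logits: (n_chunks, n_genres) model outputs for the 20 ms chunks of one clip
chunk_logits = np.random.randn(500, 10)
print(mean_logit_genre(chunk_logits), majority_vote_genre(chunk_logits))
```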
arXiv Detail & Related papers (2024-10-10T19:17:56Z) - Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving [5.572472212662453]
This paper approaches the problem of separating the notes from a quantized symbolic music piece (e.g., a MIDI file) into multiple voices and staves.
We propose an end-to-end system based on graph neural networks that clusters notes belonging to the same chord and connects them with edges if they are part of the same voice.
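As a rough illustration of the graph-building step, the sketch below turns a note list into nodes with candidate edges between temporally close notes, which a GNN could then score as same-voice or not; the Note fields and the pruning window are illustrative assumptions, not the paper's exact construction.

```python
from dataclasses import dataclass

@dataclass
class Note:
    onset: float    # seconds
    offset: float   # seconds
    pitch: int      # MIDI pitch number

def candidate_edges(notes: list[Note], max_gap: float = 1.0) -> list[tuple[int, int]]:
    """Connect each note to later notes starting within max_gap seconds.

    A trained GNN would then classify each candidate edge as same-voice
    or not, and cluster simultaneous notes into chords.
    """
    edges = []
    idx = sorted(range(len(notes)), key=lambda i: notes[i].onset)
    for a, i in enumerate(idx):
        for j in idx[a + 1:]:
            if notes[j].onset - notes[i].offset > max_gap:
                break               # later notes start even later
            edges.append((i, j))
    return edges
```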
arXiv Detail & Related papers (2024-07-15T14:36:13Z)
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks [146.59245563763065]
MeloForm is a system that generates melody with musical form using expert systems and neural networks.
It can support various kinds of forms, such as verse and chorus form, rondo form, variational form, sonata form, etc.
arXiv Detail & Related papers (2022-08-30T15:44:15Z)
- BERT-like Pre-training for Symbolic Piano Music Classification Tasks [15.02723006489356]
This article presents a benchmark study of symbolic piano music classification using the Bidirectional Encoder Representations from Transformers (BERT) approach.
We pre-train two 12-layer Transformer models using the BERT approach and fine-tune them for four downstream classification tasks.
Our evaluation shows that the BERT approach leads to higher classification accuracy than recurrent neural network (RNN)-based baselines.
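As a sketch of the fine-tuning setup, a pre-trained sequence encoder can be wrapped with a pooled classification head; the encoder's API here is an assumption, not the paper's actual interface.

```python
import torch
import torch.nn as nn

class PianoSequenceClassifier(nn.Module):
    """Pre-trained encoder + linear head for a downstream classification task.

    `encoder` stands in for one of the pre-trained 12-layer Transformers;
    it is assumed to map (batch, seq_len) token ids to
    (batch, seq_len, hidden) states.
    """
    def __init__(self, encoder: nn.Module, hidden: int, n_classes: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        states = self.encoder(token_ids)      # (batch, seq_len, hidden)
        return self.head(states.mean(dim=1))  # mean-pool over time -> logits
```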
arXiv Detail & Related papers (2021-07-12T07:03:57Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to understanding music from symbolic data such as MIDI, rather than audio.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Deep Composer Classification Using Symbolic Representation [6.656753488329095]
In this study, we train deep neural networks to classify composers in the symbolic domain.
The model takes a two-channel, two-dimensional input converted from MIDI recordings and performs single-label classification.
In experiments conducted on the MAESTRO dataset, we report an F1 score of 0.8333 for the classification of 13 classical composers.
arXiv Detail & Related papers (2020-10-02T07:40:44Z)
- POP909: A Pop-song Dataset for Music Arrangement Generation [10.0454303747519]
We propose POP909, a dataset which contains multiple versions of the piano arrangements of 909 popular songs created by professional musicians.
The main body of the dataset contains the vocal melody, the lead instrument melody, and the piano accompaniment for each song in MIDI format, which are aligned to the original audio files.
We provide annotations of tempo, beat, key, and chords, where the tempo curves are hand-labeled and the others are produced by MIR algorithms.
arXiv Detail & Related papers (2020-08-17T08:08:14Z)
- Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip about people playing musical instruments.
We first identify two key intermediate representations for a successful video-to-music generator: body keypoints from videos and MIDI events from audio recordings.
We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)