Large-Scale MIDI-based Composer Classification
- URL: http://arxiv.org/abs/2010.14805v1
- Date: Wed, 28 Oct 2020 08:07:55 GMT
- Title: Large-Scale MIDI-based Composer Classification
- Authors: Qiuqiang Kong, Keunwoo Choi, Yuxuan Wang
- Abstract summary: We propose large-scale MIDI-based composer classification systems using GiantMIDI-Piano.
We are the first to investigate the composer classification problem with up to 100 composers.
Our system achieves 10-composer and 100-composer classification accuracies of 0.648 and 0.385, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Music classification is a task to classify a music piece into labels such as
genres or composers. We propose large-scale MIDI-based composer classification
systems using GiantMIDI-Piano, a transcription-based dataset. We propose to use
piano rolls, onset rolls, and velocity rolls as input representations and
deep neural networks as classifiers. To our knowledge, we are the first to
investigate the composer classification problem with up to 100 composers. Using
convolutional recurrent neural networks as models, our MIDI-based composer
classification system achieves 10-composer and 100-composer classification
accuracies of 0.648 and 0.385 (evaluated on 30-second clips) and 0.739 and
0.489 (evaluated on whole music pieces), respectively. Our MIDI-based composer
classification system outperforms several audio-based baseline classification
systems, indicating the effectiveness of compact MIDI representations for
composer classification.
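The abstract's three input representations can be illustrated with a short sketch. The snippet below converts a list of note events into piano-roll, onset-roll, and velocity-roll matrices; the frame rate, normalization, and note-event format here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def midi_notes_to_rolls(notes, fps=100, n_pitches=128, duration=None):
    """Build piano-roll, onset-roll, and velocity-roll matrices from note
    events given as (onset_sec, offset_sec, pitch, velocity) tuples.

    Assumptions (not from the paper): time is discretized at `fps` frames
    per second, and velocities are normalized to [0, 1] by dividing by 127.
    """
    if duration is None:
        duration = max(offset for _, offset, _, _ in notes)
    n_frames = int(np.ceil(duration * fps))
    piano_roll = np.zeros((n_frames, n_pitches), dtype=np.float32)
    onset_roll = np.zeros((n_frames, n_pitches), dtype=np.float32)
    velocity_roll = np.zeros((n_frames, n_pitches), dtype=np.float32)
    for onset, offset, pitch, velocity in notes:
        start = int(onset * fps)
        end = min(int(offset * fps), n_frames)
        piano_roll[start:end, pitch] = 1.0            # 1 while the note sounds
        onset_roll[start, pitch] = 1.0                # 1 only at the attack frame
        velocity_roll[start:end, pitch] = velocity / 127.0  # sustained loudness
    return piano_roll, onset_roll, velocity_roll

# Example: a single middle C (MIDI pitch 60) from 0.0 s to 0.5 s, velocity 100.
rolls = midi_notes_to_rolls([(0.0, 0.5, 60, 100)], fps=100)
```

Matrices of this form (frames × pitches) can be stacked as channels and fed to a convolutional recurrent classifier; in practice a MIDI parser such as pretty_midi would supply the note-event tuples.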
Related papers
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- Comparision Of Adversarial And Non-Adversarial LSTM Music Generative Models [2.569647910019739]
This work implements and compares adversarial and non-adversarial training of recurrent neural network music composers on MIDI data.
The evaluation indicates that adversarial training produces more aesthetically pleasing music.
arXiv Detail & Related papers (2022-11-01T20:23:49Z)
- MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks [146.59245563763065]
MeloForm is a system that generates melody with musical form using expert systems and neural networks.
It can support various kinds of forms, such as verse and chorus form, rondo form, variational form, sonata form, etc.
arXiv Detail & Related papers (2022-08-30T15:44:15Z)
- BERT-like Pre-training for Symbolic Piano Music Classification Tasks [15.02723006489356]
This article presents a benchmark study of symbolic piano music classification using the Bidirectional Encoder Representations from Transformers (BERT) approach.
We pre-train two 12-layer Transformer models using the BERT approach and fine-tune them for four downstream classification tasks.
Our evaluation shows that the BERT approach leads to higher classification accuracy than recurrent neural network (RNN)-based baselines.
arXiv Detail & Related papers (2021-07-12T07:03:57Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to understanding music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation [0.0]
The aim is to obtain a model that can estimate the probability that a MIDI clip was machine-composed, conditioned on the auto-generation hypothesis.
The experimental results show our model ranks 3rd among all 7 teams in the CSMT (2020) data challenge.
arXiv Detail & Related papers (2020-10-15T13:59:58Z)
- Deep Composer Classification Using Symbolic Representation [6.656753488329095]
In this study, we train deep neural networks to classify composers in the symbolic domain.
The model takes a two-channel two-dimensional input, converted from MIDI recordings, and performs single-label classification.
In experiments conducted on the MAESTRO dataset, we report an F1 score of 0.8333 for the classification of 13 classical composers.
arXiv Detail & Related papers (2020-10-02T07:40:44Z)
- POP909: A Pop-song Dataset for Music Arrangement Generation [10.0454303747519]
We propose POP909, a dataset containing multiple versions of piano arrangements of 909 popular songs created by professional musicians.
The main body of the dataset contains the vocal melody, the lead instrument melody, and the piano accompaniment for each song in MIDI format, aligned to the original audio files.
We provide annotations of tempo, beat, key, and chords; the tempo curves are hand-labeled, and the rest are produced by MIR algorithms.
arXiv Detail & Related papers (2020-08-17T08:08:14Z)
- Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip of people playing musical instruments.
We first identify two key intermediate representations for a successful video-to-music generator: body keypoints from videos and MIDI events from audio recordings.
We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)
- Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach [74.09517278785519]
We introduce a grading function that evaluates four-part chorales in the style of J.S. Bach along important musical features.
We show that the function is interpretable and outperforms human experts at discriminating Bach chorales from model-generated ones.
arXiv Detail & Related papers (2020-06-23T21:02:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.