Impact of time and note duration tokenizations on deep learning symbolic
music modeling
- URL: http://arxiv.org/abs/2310.08497v1
- Date: Thu, 12 Oct 2023 16:56:37 GMT
- Title: Impact of time and note duration tokenizations on deep learning symbolic
music modeling
- Authors: Nathan Fradet, Nicolas Gutowski, Fabien Chhel, Jean-Pierre Briot
- Abstract summary: We analyze the common tokenization methods and experiment with time and note duration representations.
We demonstrate that explicit information leads to better results depending on the task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic music is widely used in various deep learning tasks, including
generation, transcription, synthesis, and Music Information Retrieval (MIR). It
is mostly employed with discrete models like Transformers, which require music
to be tokenized, i.e., formatted into sequences of distinct elements called
tokens. Tokenization can be performed in different ways. As Transformers can
struggle with reasoning but capture explicit information more easily, it is
important to study how the way information is represented for such models
impacts their performance. In this work, we analyze the common tokenization
methods and experiment with time and note duration representations. We compare
the performances of these two impactful criteria on several tasks, including
composer and emotion classification, music generation, and sequence
representation learning. We demonstrate that explicit information leads to
better results depending on the task.
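To make the two criteria concrete, the sketch below tokenizes the same three notes two ways: with explicit Duration tokens, and with implicit durations ended by NoteOff events. The token names, note list, and time grid are illustrative assumptions, not the exact vocabularies compared in the paper.

```python
# Minimal illustration of two common ways to tokenize the same notes.
# Token names and the tick grid are illustrative assumptions, not the
# exact tokenizations evaluated in the paper.

# Each note: (onset, pitch, duration), times in ticks on a coarse grid.
notes = [(0, 60, 4), (4, 64, 2), (6, 67, 2)]

def tokenize_explicit_duration(notes):
    """Time as TimeShift tokens; duration as an explicit Duration token."""
    tokens, now = [], 0
    for onset, pitch, dur in notes:
        if onset > now:
            tokens.append(f"TimeShift_{onset - now}")
            now = onset
        tokens += [f"Pitch_{pitch}", f"Duration_{dur}"]
    return tokens

def tokenize_note_off(notes):
    """Duration left implicit: a NoteOff token ends each note."""
    events = []
    for onset, pitch, dur in notes:
        events.append((onset, f"NoteOn_{pitch}"))
        events.append((onset + dur, f"NoteOff_{pitch}"))
    events.sort(key=lambda e: e[0])
    tokens, now = [], 0
    for t, name in events:
        if t > now:
            tokens.append(f"TimeShift_{t - now}")
            now = t
        tokens.append(name)
    return tokens

print(tokenize_explicit_duration(notes))
# ['Pitch_60', 'Duration_4', 'TimeShift_4', 'Pitch_64', ...]
print(tokenize_note_off(notes))
# ['NoteOn_60', 'TimeShift_4', 'NoteOff_60', 'NoteOn_64', ...]
```

The first scheme states each note's duration as a single token the model can read off directly; the second forces the model to match NoteOn/NoteOff pairs across intervening tokens, which is the kind of implicit reasoning the paper argues Transformers handle less easily.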
Related papers
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
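A rough sketch of the bar-wise synchronization idea follows: bars that sound together across tracks are placed next to each other in the token sequence. The helper, delimiter handling, and toy ABC fragments are assumptions for illustration, not MuPT's actual notation format.

```python
# Toy sketch of bar-wise synchronization across tracks, in the spirit of
# SMT-ABC Notation. Real ABC headers/voices are omitted; the '|' bar
# delimiter and interleaving order are illustrative assumptions.

def interleave_tracks(tracks):
    """Emit bar 1 of every track, then bar 2 of every track, and so on,
    so that measures that sound together are adjacent in the sequence."""
    bars_per_track = [t.strip("|").split("|") for t in tracks]
    n_bars = max(len(bars) for bars in bars_per_track)
    out = []
    for i in range(n_bars):
        for bars in bars_per_track:
            if i < len(bars):
                out.append(bars[i].strip())
    return " | ".join(out)

melody = "C2 E2 | G4 | E2 C2 |"
bass   = "C,4  | G,4 | C,4  |"
print(interleave_tracks([melody, bass]))
# 'C2 E2 | C,4 | G4 | G,4 | E2 C2 | C,4'
```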
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation [1.9188864062289432]
Subword tokenization has been widely successful in text-based natural language processing tasks with Transformer-based models.
We apply subword tokenization on top of existing musical tokenization schemes and find that it enables the generation of longer songs in the same amount of time.
Our study suggests that subword tokenization is a promising technique for symbolic music generation and may have broader implications for music composition.
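As a rough illustration of subword tokenization over musical tokens, the sketch below performs Byte-Pair-Encoding-style merges on a toy event sequence; the merge loop is a simplification, not the paper's actual pipeline.

```python
# Minimal sketch of BPE-style merges applied to a music-token sequence,
# where each base token is already a musical event. A toy illustration,
# not the tokenizer used in the paper.
from collections import Counter

def most_frequent_pair(seq):
    """Return the most common adjacent token pair."""
    return Counter(zip(seq, seq[1:])).most_common(1)[0][0]

def merge_pair(seq, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + "+" + seq[i + 1])  # new subword token
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged

seq = ["Pitch_60", "Dur_4", "Pitch_64", "Dur_4", "Pitch_60", "Dur_4"]
for _ in range(2):  # each merge round shortens the sequence
    seq = merge_pair(seq, most_frequent_pair(seq))
print(seq)
```

Recurring event pairs collapse into single tokens, so a fixed context window covers more musical time, which is one way subword vocabularies can yield longer generated songs.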
arXiv Detail & Related papers (2023-04-18T12:46:12Z)
- Contrastive Learning with Positive-Negative Frame Mask for Music Representation [91.44187939465948]
This paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR.
We devise a novel contrastive learning objective to accommodate both the self-augmented positives and negatives sampled from the same music.
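The general recipe can be sketched as masked views of the same clip plus an InfoNCE-style loss; the masking and loss below are simplified assumptions, not PEMR's exact positive/negative mask construction.

```python
# Hedged sketch: mask time frames of a spectrogram to create
# self-augmented views, then train with an InfoNCE-style loss.
import torch
import torch.nn.functional as F

def mask_frames(spec, n_masked=16):
    """Zero out a random contiguous block of time frames."""
    spec = spec.clone()
    start = torch.randint(0, spec.shape[-1] - n_masked, (1,)).item()
    spec[..., start:start + n_masked] = 0.0
    return spec

def info_nce(z_a, z_b, temperature=0.1):
    """Matching rows of z_a/z_b are positives; all other rows negatives."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.T / temperature       # (B, B) similarity matrix
    targets = torch.arange(z_a.shape[0])     # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with a stand-in encoder: 8 clips, 128 mel bins, 256 frames.
specs = torch.randn(8, 128, 256)
encoder = torch.nn.Sequential(torch.nn.Flatten(),
                              torch.nn.Linear(128 * 256, 64))
loss = info_nce(encoder(mask_frames(specs)), encoder(mask_frames(specs)))
print(loss.item())
```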
arXiv Detail & Related papers (2022-03-17T07:11:42Z)
- A Novel Multi-Task Learning Method for Symbolic Music Emotion Recognition [76.65908232134203]
Symbolic Music Emotion Recognition (SMER) aims to predict music emotion from symbolic data, such as MIDI and MusicXML.
In this paper, we present a simple multi-task framework for SMER, which incorporates the emotion recognition task with other emotion-related auxiliary tasks.
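A shared encoder with one head per task captures the shape of such a framework; the auxiliary task ("key classification") and the model sizes below are illustrative assumptions, not the paper's exact auxiliary tasks.

```python
# Sketch of a shared-encoder multi-task setup: one backbone, an emotion
# head, and an auxiliary head trained jointly on a weighted loss.
import torch
import torch.nn as nn

class MultiTaskSMER(nn.Module):
    def __init__(self, vocab=512, dim=128, n_emotions=4, n_keys=24):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)  # shared backbone
        self.emotion_head = nn.Linear(dim, n_emotions)     # main task
        self.key_head = nn.Linear(dim, n_keys)             # auxiliary task

    def forward(self, tokens):
        _, h = self.encoder(self.embed(tokens))
        h = h.squeeze(0)                                   # (B, dim)
        return self.emotion_head(h), self.key_head(h)

model = MultiTaskSMER()
tokens = torch.randint(0, 512, (8, 64))                    # toy token batch
emo_logits, key_logits = model(tokens)
emo_y, key_y = torch.randint(0, 4, (8,)), torch.randint(0, 24, (8,))
loss = nn.functional.cross_entropy(emo_logits, emo_y) \
     + 0.5 * nn.functional.cross_entropy(key_logits, key_y)  # weighted sum
```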
arXiv Detail & Related papers (2022-01-15T07:45:10Z)
- Score Transformer: Generating Musical Score from Note-level Representation [2.3554584457413483]
We train the Transformer model to transcribe note-level representation into appropriate music notation.
We also explore an effective notation-level token representation to work with the model.
arXiv Detail & Related papers (2021-12-01T09:08:01Z)
- Towards Cross-Cultural Analysis using Music Information Dynamics [7.4517333921953215]
Music from different cultures establishes different aesthetics through differing style conventions along two aspects.
We propose a framework that could be used to quantitatively compare music from different cultures by looking at these two aspects.
arXiv Detail & Related papers (2021-11-24T16:05:29Z)
- Multi-task Learning with Metadata for Music Mood Classification [0.0]
Mood recognition is an important problem in music informatics and has key applications in music discovery and recommendation.
We propose a multi-task learning approach in which a shared model is simultaneously trained for mood and metadata prediction tasks.
Applying our technique to existing state-of-the-art convolutional neural networks for mood classification consistently improves their performance.
arXiv Detail & Related papers (2021-10-10T11:36:34Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks Artist Identification, Music Genre Classification and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
- Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders [9.923470453197657]
We focus on leveraging adversarial regularization as a flexible and natural means to imbue variational autoencoders with context information.
We introduce the first Music Adversarial Autoencoder (MusAE).
Our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders.
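The core mechanism, adversarial regularization of the latent space, can be sketched as a discriminator that pushes encoded latents toward a chosen prior in place of a KL term; the layer sizes and single-linear networks below are toy assumptions, not MusAE's actual architecture.

```python
# Sketch of adversarial regularization in an adversarial autoencoder:
# a discriminator distinguishes prior samples from encoded latents, and
# the encoder is trained to fool it, so q(z) is pulled toward the prior.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(256, 64))        # encoder: input -> latent
dec = nn.Sequential(nn.Linear(64, 256))        # decoder: latent -> input
disc = nn.Sequential(nn.Linear(64, 1))         # discriminator on latents

x = torch.randn(32, 256)                       # toy input batch
z = enc(x)

recon_loss = nn.functional.mse_loss(dec(z), x)

# Discriminator step: prior samples labeled 1, encoded latents labeled 0.
z_prior = torch.randn_like(z)                  # samples from the prior
d_logits = torch.cat([disc(z_prior), disc(z.detach())]).squeeze(-1)
d_labels = torch.cat([torch.ones(32), torch.zeros(32)])
d_loss = nn.functional.binary_cross_entropy_with_logits(d_logits, d_labels)

# Encoder (generator) step: fool the discriminator.
g_loss = nn.functional.binary_cross_entropy_with_logits(
    disc(z).squeeze(-1), torch.ones(32))
total_encoder_loss = recon_loss + g_loss
```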
arXiv Detail & Related papers (2020-01-15T18:07:20Z)