miditok: A Python package for MIDI file tokenization
- URL: http://arxiv.org/abs/2310.17202v1
- Date: Thu, 26 Oct 2023 07:37:44 GMT
- Title: miditok: A Python package for MIDI file tokenization
- Authors: Nathan Fradet, Jean-Pierre Briot, Fabien Chhel, Amal El Fallah
Seghrouchni, Nicolas Gutowski
- Abstract summary: MidiTok is an open-source library allowing to tokenize symbolic music.
It features the most popular music tokenizations, under a unified API.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent progress in natural language processing has been adapted to the
symbolic music modality. Language models, such as Transformers, have been used
with symbolic music for a variety of tasks among which music generation,
modeling or transcription, with state-of-the-art performances. These models are
beginning to be used in production products. To encode and decode music for the
backbone model, they need to rely on tokenizers, whose role is to serialize
music into sequences of distinct elements called tokens. MidiTok is an
open-source library allowing to tokenize symbolic music with great flexibility
and extended features. It features the most popular music tokenizations, under
a unified API. It is made to be easily used and extensible for everyone.
Related papers
- MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation [19.878013881045817]
MusiConGen is a temporally-conditioned Transformer-based text-to-music model.
It integrates automatically-extracted rhythm and chords as the condition signal.
We show that MusiConGen can generate realistic backing track music that aligns well with the specified conditions.
arXiv Detail & Related papers (2024-07-21T05:27:53Z) - MidiCaps: A large-scale MIDI dataset with text captions [6.806050368211496]
This work aims to enable research that combines LLMs with symbolic music by presenting, the first openly available large-scale MIDI dataset with text captions.
Inspired by recent advancements in captioning techniques, we present a curated dataset of over 168k MIDI files with textual descriptions.
arXiv Detail & Related papers (2024-06-04T12:21:55Z) - MuPT: A Generative Symbolic Music Pretrained Transformer [73.47607237309258]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z) - Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z) - GETMusic: Generating Any Music Tracks with a Unified Representation and
Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with GET'' standing for GEnerate music Tracks''
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
arXiv Detail & Related papers (2023-05-18T09:53:23Z) - From Words to Music: A Study of Subword Tokenization Techniques in
Symbolic Music Generation [1.9188864062289432]
Subword tokenization has been widely successful in text-based natural language processing tasks with Transformer-based models.
We apply subword tokenization on post-musical tokenization schemes and find that it enables the generation of longer songs at the same time.
Our study suggests that subword tokenization is a promising technique for symbolic music generation and may have broader implications for music composition.
arXiv Detail & Related papers (2023-04-18T12:46:12Z) - Byte Pair Encoding for Symbolic Music [0.0]
Byte Pair embeddings significantly decreases the sequence length while increasing the vocabulary size.
We leverage the embedding capabilities of such models with more expressive tokens, resulting in both better results and faster inference in generation and classification tasks.
The source code is shared on Github, along with a companion website.
arXiv Detail & Related papers (2023-01-27T20:22:18Z) - Symphony Generation with Permutation Invariant Language Model [57.75739773758614]
We present a symbolic symphony music generation solution, SymphonyNet, based on a permutation invariant language model.
A novel transformer decoder architecture is introduced as backbone for modeling extra-long sequences of symphony tokens.
Our empirical results show that our proposed approach can generate coherent, novel, complex and harmonious symphony compared to human composition.
arXiv Detail & Related papers (2022-05-10T13:08:49Z) - DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models [25.15855175804765]
DadaGP is a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres.
DadaGP is released with an encoder/decoder which converts GuitarPro files to tokens and back.
We present results of a use case in which DadaGP is used to train a Transformer-based model to generate new songs in GuitarPro format.
arXiv Detail & Related papers (2021-07-30T14:21:36Z) - MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from the symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z) - Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip about people playing musical instruments.
We first identify two key intermediate representations for a successful video to music generator: body keypoints from videos and MIDI events from audio recordings.
We present a Graph$-$Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.