N-Gram Unsupervised Compoundation and Feature Injection for Better
Symbolic Music Understanding
- URL: http://arxiv.org/abs/2312.08931v2
- Date: Fri, 15 Dec 2023 03:27:30 GMT
- Title: N-Gram Unsupervised Compoundation and Feature Injection for Better
Symbolic Music Understanding
- Authors: Jinhao Tian, Zuchao Li, Jiajia Li, Ping Wang
- Abstract summary: Music sequences exhibit strong correlations between adjacent elements, making them prime candidates for N-gram techniques from Natural Language Processing (NLP).
In this paper, we propose a novel method, NG-Midiformer, for understanding symbolic music sequences that leverages the N-gram approach.
- Score: 27.554853901252084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The first step to apply deep learning techniques for symbolic music
understanding is to transform musical pieces (mainly in MIDI format) into
sequences of predefined tokens like note pitch, note velocity, and chords.
Subsequently, the sequences are fed into a neural sequence model to accomplish
specific tasks. Music sequences exhibit strong correlations between adjacent
elements, making them prime candidates for N-gram techniques from Natural
Language Processing (NLP). Consider classical piano music: specific melodies
might recur throughout a piece, with subtle variations each time. In this
paper, we propose a novel method, NG-Midiformer, for understanding symbolic
music sequences that leverages the N-gram approach. Our method involves first
processing music pieces into word-like sequences with our proposed unsupervised
compoundation, followed by using our N-gram Transformer encoder, which can
effectively incorporate N-gram information to enhance the primary encoder part
for better understanding of music sequences. The pre-training process on
large-scale music datasets enables the model to thoroughly learn the N-gram
information contained within music sequences, and subsequently apply this
information for making inferences during the fine-tuning stage. Experiments on
various datasets demonstrate the effectiveness of our method, which achieves
state-of-the-art performance on a series of music understanding downstream
tasks. The code and model weights will be released at
https://github.com/CinqueOrigin/NG-Midiformer.
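To make the core idea concrete, here is a minimal sketch of how recurring adjacent N-grams in a music token sequence can be fused into word-like units, assuming a BPE-style greedy merge; the paper's actual unsupervised compoundation criterion may differ, and all token names and thresholds are placeholders.

```python
from collections import Counter

def compound_ngrams(tokens, n=2, min_count=2, max_merges=10):
    """Greedily fuse frequent adjacent n-grams into single word-like tokens.

    A rough BPE-style analogue of unsupervised compoundation; the paper's
    actual scoring and merging criteria may differ.
    """
    tokens = list(tokens)
    for _ in range(max_merges):
        grams = Counter(tuple(tokens[i:i + n])
                        for i in range(len(tokens) - n + 1))
        if not grams:
            break
        best, count = grams.most_common(1)[0]
        if count < min_count:
            break  # nothing recurs often enough to compound
        merged, i = [], 0
        while i < len(tokens):
            if tuple(tokens[i:i + n]) == best:
                merged.append("_".join(best))  # one compound token
                i += n
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# Toy token stream: the recurring fragment C4-E4-G4 gets compounded.
seq = ["C4", "E4", "G4", "rest", "C4", "E4", "G4", "D4"]
print(compound_ngrams(seq))
# ['C4_E4_G4', 'rest', 'C4_E4_G4', 'D4']
```

Recurring fragments, like the repeated melodies in the classical-piano example above, collapse into single compound tokens.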
Related papers
- End-to-end Piano Performance-MIDI to Score Conversion with Transformers [26.900974153235456]
We present an end-to-end deep learning approach that constructs detailed musical scores directly from real-world piano performance-MIDI files.
We introduce a modern transformer-based architecture with a novel tokenized representation for symbolic music data.
Our method is also the first to directly predict notational details like trill marks or stem direction from performance data.
arXiv Detail & Related papers (2024-09-30T20:11:37Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
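As a rough, hypothetical picture of what synchronizing measures across tracks might look like, the sketch below interleaves single-voice ABC bodies bar by bar so corresponding measures stay adjacent in the flattened sequence; the exact SMT-ABC format is defined in the MuPT paper and may differ.

```python
def interleave_bars(track_bodies):
    """Interleave single-voice ABC bodies measure by measure, tagging each
    bar with an inline voice marker. Hypothetical sketch, not exact SMT-ABC.
    """
    per_track = [body.strip().strip("|").split("|") for body in track_bodies]
    n_bars = min(len(bars) for bars in per_track)
    out = []
    for i in range(n_bars):
        for v, bars in enumerate(per_track, start=1):
            out.append(f"[V:{v}]{bars[i].strip()}|")
    return "".join(out)

melody = "C2 E2 G2 c2 | B2 G2 E2 C2 |"
bass = "C,4 G,,4 | C,8 |"
print(interleave_bars([melody, bass]))
# [V:1]C2 E2 G2 c2|[V:2]C,4 G,,4|[V:1]B2 G2 E2 C2|[V:2]C,8|
```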
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Predicting Music Hierarchies with a Graph-Based Neural Decoder [6.617487928813374]
This paper describes a data-driven framework to parse musical sequences into dependency trees.
Dependency trees are hierarchical structures used in music cognition research and music analysis.
One major benefit of this system is that it can be easily integrated into modern deep-learning pipelines.
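For intuition, a dependency tree over a note sequence can be stored as a simple head array, as in the toy sketch below; this encoding is purely illustrative and is not the paper's representation.

```python
# A toy dependency tree over a five-note sequence, stored as a head array:
# head[i] is the index of note i's governing note (-1 marks the root).
notes = ["C4", "D4", "E4", "G4", "C5"]
head = [2, 2, -1, 4, 2]  # C4, D4, and C5 depend on E4; G4 depends on C5

def children(head, i):
    """Return the indices of notes directly dependent on note i."""
    return [j for j, h in enumerate(head) if h == i]

assert children(head, 2) == [0, 1, 4]  # E4 is the root of this toy tree
```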
arXiv Detail & Related papers (2023-06-29T13:59:18Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
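For intuition, the sketch below implements a "delay"-style interleaving of parallel codebook streams, in the spirit of the patterns studied in the MusicGen paper: each stream is shifted by one extra step so a single-stage LM always conditions on earlier codebooks first. Function names and the padding value are placeholders.

```python
def delay_interleave(codes, pad=-1):
    """Apply a 'delay' interleaving pattern to K parallel codebook streams.

    codes: K lists of equal length T (one token stream per codebook).
    Returns a K x (T + K - 1) grid where stream k is shifted right by k
    steps, so at any column the LM sees strictly earlier codebooks.
    """
    K, T = len(codes), len(codes[0])
    width = T + K - 1
    grid = [[pad] * width for _ in range(K)]
    for k, stream in enumerate(codes):
        for t, tok in enumerate(stream):
            grid[k][t + k] = tok  # codebook k delayed by k positions
    return grid

# Two codebooks, four timesteps: stream 1 lags stream 0 by one column.
for row in delay_interleave([[1, 2, 3, 4], [5, 6, 7, 8]]):
    print(row)
# [1, 2, 3, 4, -1]
# [-1, 5, 6, 7, 8]
```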
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
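A toy version of such a track-by-time grid, with placeholder tokens (the real GETScore vocabulary, resolution, and padding scheme are defined in the GETMusic paper):

```python
EMPTY = "<pad>"  # placeholder for "no token at this step"

# Toy GETScore-style grid: one row per track (stacked vertically),
# one column per time step (progressing horizontally).
score = {
    "melody": ["C4", "E4", "G4", "E4"],
    "bass":   ["C2", EMPTY, "G2", EMPTY],
    "drums":  ["kick", "hat", "snare", "hat"],
}

def select_tracks(score, tracks):
    """Pick an arbitrary source/target track combination as grid rows."""
    return [score[t] for t in tracks]

for name, row in zip(["melody", "bass"],
                     select_tracks(score, ["melody", "bass"])):
    print(f"{name:>6}: {row}")
# melody: ['C4', 'E4', 'G4', 'E4']
#   bass: ['C2', '<pad>', 'G2', '<pad>']
```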
arXiv Detail & Related papers (2023-05-18T09:53:23Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- Symphony Generation with Permutation Invariant Language Model [57.75739773758614]
We present a symbolic symphony music generation solution, SymphonyNet, based on a permutation invariant language model.
A novel transformer decoder architecture is introduced as the backbone for modeling extra-long sequences of symphony tokens.
Our empirical results show that our proposed approach can generate coherent, novel, complex, and harmonious symphonies compared to human compositions.
arXiv Detail & Related papers (2022-05-10T13:08:49Z)
- Machine Composition of Korean Music via Topological Data Analysis and Artificial Neural Network [6.10183951877597]
We present a machine composition method that trains a machine on the composition principles embedded in the given music data, instead of directly feeding it music pieces.
The Overlap matrix makes it possible to compose a new music piece algorithmically and also to provide seed music for the desired artificial neural network.
arXiv Detail & Related papers (2022-03-29T12:11:31Z)
- Differential Music: Automated Music Generation Using LSTM Networks with Representation Based on Melodic and Harmonic Intervals [0.0]
This paper presents a generative AI model for automated music composition with LSTM networks.
It takes a novel approach to encoding musical information, based on movement in music rather than absolute pitch.
Experimental results show promise, as the generated pieces sound musical and tonal.
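A minimal sketch of such movement-based encoding (the function name is illustrative, not from the paper): each note is stored as its semitone interval from the previous note, so a melody and its transposition share one representation.

```python
def to_intervals(midi_pitches):
    """Encode a melody as successive semitone intervals (movement),
    discarding absolute pitch. Illustrative; not the paper's exact scheme."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

# The same melody in C and in D yields identical interval encodings.
c_major = [60, 62, 64, 65, 67]   # C D E F G
d_major = [62, 64, 66, 67, 69]   # D E F# G A
assert to_intervals(c_major) == to_intervals(d_major) == [2, 2, 1, 2]
print(to_intervals(c_major))  # [2, 2, 1, 2]
```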
arXiv Detail & Related papers (2021-08-23T23:51:08Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Music Generation using Deep Learning [10.155748914174003]
The proposed approach takes ABC notation from the Nottingham dataset and encodes it to be fed as input to the neural networks.
The primary objective is to give the network an arbitrary note and let it extend a sequence from that note until a good piece of music is produced.
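That generation loop can be pictured with the minimal sketch below, using a toy stand-in for the trained network; none of these names come from the paper.

```python
import random

def extend_sequence(seed, next_token_fn, length=8):
    """Autoregressively extend a seed note: repeatedly sample the next token
    conditioned on the sequence so far. next_token_fn stands in for the LSTM."""
    seq = [seed]
    for _ in range(length - 1):
        seq.append(next_token_fn(seq))
    return seq

# Placeholder "model": a random walk over a small ABC-style pitch alphabet.
pitches = ["C", "D", "E", "F", "G", "A", "B"]
def toy_model(seq):
    i = pitches.index(seq[-1])
    step = random.choice([-1, 0, 1])
    return pitches[max(0, min(len(pitches) - 1, i + step))]

print(extend_sequence("C", toy_model))
```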
arXiv Detail & Related papers (2021-05-19T10:27:58Z)