Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding
- URL: http://arxiv.org/abs/2108.05064v1
- Date: Wed, 11 Aug 2021 07:07:21 GMT
- Title: Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding
- Authors: Chin-Jui Chang and Chun-Yi Lee and Yi-Hsuan Yang
- Abstract summary: This paper proposes a new self-attention based model for music score infilling.
It generates a polyphonic music sequence that fills in the gap between given past and future contexts.
- Score: 37.725607373307646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a new self-attention based model for music score
infilling, i.e., to generate a polyphonic music sequence that fills in the gap
between given past and future contexts. While existing approaches can only fill
in a short segment with a fixed number of notes, or a fixed time span between
the past and future contexts, our model can infill a variable number of notes
(up to 128) for different time spans. We achieve this with three major technical
contributions. First, we adapt XLNet, an autoregressive model originally
proposed for unsupervised model pre-training, to music score infilling. Second,
we propose a new, musically specialized positional encoding called relative bar
encoding that better informs the model of notes' position within the past and
future contexts. Third, to capitalize on the relative bar encoding, we perform
look-ahead onset prediction to predict the onset of a note one time step before
predicting the other attributes of the note. We compare our proposed model with
two strong baselines and show that our model is superior in both objective and
subjective analyses.
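As a rough illustration of the second contribution, the snippet below sketches one way a relative bar encoding could be realized: each token carries the index of the bar it belongs to, and the signed bar distance between every pair of tokens indexes a learned per-head bias added to the attention logits. The class name, the clipping threshold, and the bias formulation are assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of a "relative bar" attention bias (illustrative only).
# Assumes a REMI-like token stream where every token is tagged with the
# index of the bar it falls in.
import torch
import torch.nn as nn

class RelativeBarBias(nn.Module):
    """Maps the signed bar distance between two tokens to a learned
    per-head attention bias, clipped to [-max_dist, max_dist]."""

    def __init__(self, num_heads: int, max_dist: int = 16):
        super().__init__()
        self.max_dist = max_dist
        # one learned bias per clipped distance value, per attention head
        self.bias = nn.Embedding(2 * max_dist + 1, num_heads)

    def forward(self, bar_ids: torch.Tensor) -> torch.Tensor:
        # bar_ids: (seq_len,) bar index of each token
        dist = bar_ids[None, :] - bar_ids[:, None]           # (L, L), signed
        dist = dist.clamp(-self.max_dist, self.max_dist) + self.max_dist
        # returns (num_heads, L, L), to be added to the attention logits
        return self.bias(dist).permute(2, 0, 1)

bar_ids = torch.tensor([0, 0, 1, 1, 1, 2, 3])  # past context, gap, future
bias = RelativeBarBias(num_heads=8)(bar_ids)
print(bias.shape)  # torch.Size([8, 7, 7])
```

Measured this way, a note's position is expressed relative to both the past and the future context, so the model always knows how many bars remain before the gap closes, which is what makes variable-length infilling tractable; the look-ahead onset prediction then fixes each note's position one step before its other attributes are decoded.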
Related papers
- Tempo estimation as fully self-supervised binary classification [6.255143207183722]
We propose a fully self-supervised approach that does not rely on any human labeled data.
Our method builds on the fact that generic (music) audio embeddings already encode a variety of properties, including information about tempo.
arXiv Detail & Related papers (2024-01-17T00:15:16Z)
- Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z)
- Checklist Models for Improved Output Fluency in Piano Fingering Prediction [33.52847881359949]
We present a new approach for the task of predicting fingerings for piano music.
We put forward a checklist system, trained via reinforcement learning, that maintains a representation of recent predictions.
We demonstrate significant gains in performability that are directly attributable to improvements on fluency metrics.
arXiv Detail & Related papers (2022-09-12T21:27:52Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models in two respects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
To achieve better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the decoder queries from the inputs, enabling the model to match the accuracy of models with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z)
- Generating Music with a Self-Correcting Non-Chronological Autoregressive Model [6.289267097017553]
We describe a novel approach for generating music using a self-correcting, non-chronological, autoregressive model.
We represent music as a sequence of edit events, each denoting either the addition or removal of a note (a toy replay of such events appears after this list).
During inference, we generate one edit event at a time using direct ancestral sampling.
arXiv Detail & Related papers (2020-08-18T20:36:47Z)
- Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization [48.55126268721948]
We present a generative adversarial network (GAN)-based model for unconditional generation of the mel-spectrograms of singing voices.
We employ a hierarchical architecture in the generator to induce some structure in the temporal dimension.
We evaluate the performance of the new model not only for generating singing voices, but also for generating speech voices.
arXiv Detail & Related papers (2020-05-18T08:35:16Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words" (a scoring sketch appears after this list).
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
- Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions [14.786601824794369]
We present a model for composing melodies given a user-specified symbolic scenario combined with a previous music context.
Our model can generate long melodies by treating 8-beat note sequences as basic units, while sharing a consistent rhythm-pattern structure with another specified song.
Results show that the music generated by our model tends to have salient repetition structures, rich motives, and stable rhythm patterns.
arXiv Detail & Related papers (2020-02-05T06:23:44Z)
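To make the edit-event representation in the self-correcting model above concrete, here is a toy replay of such an event sequence; the tuple layout and names are assumptions for illustration, not the paper's actual format.

```python
# Toy illustration of music as a sequence of edit events (add/remove of a
# note). The event layout below is an assumption, not the paper's encoding.
from typing import NamedTuple

class Note(NamedTuple):
    onset: int   # in ticks
    pitch: int   # MIDI pitch
    dur: int     # in ticks

def apply_edits(events):
    """Replay (op, note) edit events; later 'remove' events can undo
    earlier 'add' events, which is what lets the model self-correct."""
    score = set()
    for op, note in events:
        if op == "add":
            score.add(note)
        elif op == "remove":
            score.discard(note)
    return sorted(score)

events = [
    ("add", Note(0, 60, 480)),
    ("add", Note(0, 64, 480)),
    ("remove", Note(0, 64, 480)),   # the model revises an earlier choice
    ("add", Note(0, 67, 480)),
]
print(apply_edits(events))  # C and G remain; the E was removed
```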
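For the sequence-to-sequence document-ranking entry, the sketch below shows the "relevance labels as target words" recipe: the model decodes a single step, and the probability mass on the word "true" versus "false" becomes the relevance score. The prompt template follows the paper's description, but "t5-base" is a placeholder; in practice a checkpoint fine-tuned for this task is required.

```python
# Sketch of scoring a query-document pair with a seq2seq model by reading
# off the probability of a "true" target word. Checkpoint is illustrative;
# an off-the-shelf t5-base is not fine-tuned for relevance ranking.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def relevance(query: str, doc: str) -> float:
    prompt = f"Query: {query} Document: {doc} Relevant:"
    enc = tok(prompt, return_tensors="pt", truncation=True)
    # decode one step; compare the logits of the words "true" and "false"
    dec = torch.tensor([[model.config.decoder_start_token_id]])
    logits = model(**enc, decoder_input_ids=dec).logits[0, -1]
    true_id = tok.encode("true")[0]
    false_id = tok.encode("false")[0]
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()

print(relevance("music infilling", "We generate notes between contexts."))
```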
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.