Related papers: A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems

A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems

URL: http://arxiv.org/abs/2204.08094v1
Date: Sun, 17 Apr 2022 22:10:37 GMT
Title: A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems
Authors: Frank Cwitkowitz, Jonathan Driedger, Zhiyao Duan
Abstract summary: In this work, symbolic tablature is leveraged to estimate the pairwise likelihood of notes on the guitar. The output layer of a baseline tablature transcription model is reformulated, such that an inhibition loss can be incorporated to discourage the co-activation of unlikely note pairs. This naturally enforces playability constraints for guitar, and yields tablature which is more consistent with the symbolic data used to estimate pairwise likelihoods.
Score: 18.247508110198698
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Guitar tablature transcription is an important but understudied problem within the field of music information retrieval. Traditional signal processing approaches offer only limited performance on the task, and there is little acoustic data with transcription labels for training machine learning models. However, guitar transcription labels alone are more widely available in the form of tablature, which is commonly shared among guitarists online. In this work, a collection of symbolic tablature is leveraged to estimate the pairwise likelihood of notes on the guitar. The output layer of a baseline tablature transcription model is reformulated, such that an inhibition loss can be incorporated to discourage the co-activation of unlikely note pairs. This naturally enforces playability constraints for guitar, and yields tablature which is more consistent with the symbolic data used to estimate pairwise likelihoods. With this methodology, we show that symbolic tablature can be used to shape the distribution of a tablature transcription model's predictions, even when little acoustic data is available.

Related papers

Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription [2.3249139042158853]
The Fretting-Transformer is an encoderdecoder model that utilizes a T5 transformer architecture to automate the transcription of MIDI sequences into guitar tablature.<n>By framing the task as a symbolic translation problem, the model addresses key challenges, including string-fret ambiguity and physical playability.
arXiv Detail & Related papers (2025-06-17T06:25:35Z)
TapToTab : Video-Based Guitar Tabs Generation using AI and Audio Analysis [0.0]
This paper introduces an advanced approach leveraging deep learning, specifically YOLO models for real-time fretboard detection. Experimental results demonstrate substantial improvements in detection accuracy and robustness compared to traditional techniques. This paper aims to revolutionize guitar instruction by automating the creation of guitar tabs from video recordings.
arXiv Detail & Related papers (2024-09-13T08:17:15Z)
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling [6.150307957212576]
We introduce a novel deep learning solution to symbolic guitar tablature estimation. We train an encoder-decoder Transformer model in a masked language modeling paradigm to assign notes to strings. The model is first pre-trained on DadaGP, a dataset of over 25K tablatures, and then fine-tuned on a curated set of professionally transcribed guitar performances.
arXiv Detail & Related papers (2024-08-09T12:25:23Z)
Unlocking the Transferability of Tokens in Deep Models for Tabular Data [67.11727608815636]
Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks. In this paper, we propose TabToken, a method aims at enhancing the quality of feature tokens. We introduce a contrastive objective that regularizes the tokens, capturing the semantics within and across features.
arXiv Detail & Related papers (2023-10-23T17:53:09Z)
Modeling Bends in Popular Music Guitar Tablatures [49.64902130083662]
Tablature notation is widely used in popular music to transcribe and share guitar musical content. This paper focuses on bends, which enable to progressively shift the pitch of a note, therefore circumventing physical limitations of the discrete fretted fingerboard. Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree successfully predicts bend occurrences with an F1 score of 0.71 anda limited amount of false positive predictions.
arXiv Detail & Related papers (2023-08-22T07:50:58Z)
GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers [14.025337055088102]
We use the DadaGP dataset for guitar tab music generation, a corpus of over 26k songs in GuitarPro and token formats. We introduce methods to condition a Transformer-XL deep learning model to generate guitar tabs based on desired instrumentation and genre. Results indicate that the GTR-CTRL methods provide more flexibility and control for guitar-focused symbolic music generation than an unconditioned model.
arXiv Detail & Related papers (2023-02-10T17:43:03Z)
Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant [72.4512562104361]
We argue that the unlabeled data with pseudo labels can facilitate the learning of representative features in the feature extractor. Motivated by this consideration, we propose a novel framework, Gentle Teaching Assistant (GTA-Seg) to disentangle the effects of pseudo labels on feature extractor and mask predictor.
arXiv Detail & Related papers (2023-01-18T07:11:24Z)
Melody transcription via generative pre-training [86.08508957229348]
Key challenge in melody transcription is building methods which can handle broad audio containing any number of instrument ensembles and musical styles. To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio. We derive a new dataset containing $50$ hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
Unaligned Supervision For Automatic Music Transcription in The Wild [1.2183405753834562]
NoteEM is a method for simultaneously training a transcriber and aligning the scores to their corresponding performances. We report SOTA note-level accuracy of the MAPS dataset, and large favorable margins on cross-dataset evaluations.
arXiv Detail & Related papers (2022-04-28T17:31:43Z)
Graph Attention Transformer Network for Multi-Label Image Classification [50.0297353509294]
We propose a general framework for multi-label image classification that can effectively mine complex inter-label relationships. Our proposed methods can achieve state-of-the-art performance on three datasets.
arXiv Detail & Related papers (2022-03-08T12:39:05Z)
Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect. This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one. This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.