TapToTab : Video-Based Guitar Tabs Generation using AI and Audio Analysis
- URL: http://arxiv.org/abs/2409.08618v1
- Date: Fri, 13 Sep 2024 08:17:15 GMT
- Title: TapToTab : Video-Based Guitar Tabs Generation using AI and Audio Analysis
- Authors: Ali Ghaleb, Eslam ElSadawy, Ihab Essam, Mohamed Abdelhakim, Seif-Eldin Zaki, Natalie Fahim, Razan Bayoumi, Hanan Hindy
- Abstract summary: This paper introduces an advanced approach leveraging deep learning, specifically YOLO models for real-time fretboard detection.
Experimental results demonstrate substantial improvements in detection accuracy and robustness compared to traditional techniques.
This paper aims to revolutionize guitar instruction by automating the creation of guitar tabs from video recordings.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The automation of guitar tablature generation from video inputs holds significant promise for enhancing music education, transcription accuracy, and performance analysis. Existing methods face challenges with consistency and completeness, particularly in detecting fretboards and accurately identifying notes. To address these issues, this paper introduces an advanced approach leveraging deep learning, specifically YOLO models for real-time fretboard detection, and Fourier Transform-based audio analysis for precise note identification. Experimental results demonstrate substantial improvements in detection accuracy and robustness compared to traditional techniques. This paper outlines the development, implementation, and evaluation of these methodologies, aiming to revolutionize guitar instruction by automating the creation of guitar tabs from video recordings.
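The abstract names Fourier Transform-based audio analysis as the note-identification step but gives no implementation detail. The following is a minimal sketch of the general idea only, not the authors' method: it takes one audio frame, finds the strongest spectral peak with a direct discrete Fourier transform, and maps that frequency to the nearest equal-tempered note name. The function names `dominant_frequency` and `frequency_to_note`, the Hann window, the 70 to 1000 Hz search band, and the A4 = 440 Hz reference are all assumptions made for illustration. It uses only the Python standard library, so it favors clarity over speed.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dominant_frequency(samples, sample_rate, f_min=70.0, f_max=1000.0):
    """Return the frequency (Hz) of the strongest DFT bin within [f_min, f_max].

    Hypothetical helper for illustration; the search band is an assumption
    that roughly covers the fundamentals of a standard-tuned guitar.
    """
    n = len(samples)
    # A Hann window reduces spectral leakage caused by the finite frame.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    k_min = max(1, math.ceil(f_min * n / sample_rate))
    k_max = min(n // 2, math.floor(f_max * n / sample_rate))
    best_k, best_mag = k_min, -1.0
    for k in range(k_min, k_max + 1):
        # Direct DFT of bin k: X[k] = sum_i x[i] * exp(-2*pi*j*k*i/n).
        step = cmath.exp(-2j * math.pi * k / n)
        twiddle = 1 + 0j
        acc = 0j
        for x in windowed:
            acc += x * twiddle
            twiddle *= step
        mag = abs(acc)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sample_rate / n


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest equal-tempered note name, e.g. 'A2'."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    # Synthetic stand-in for a plucked open A string: 110 Hz plus one overtone.
    sr, n = 4000, 2000
    frame = [math.sin(2 * math.pi * 110 * i / sr)
             + 0.5 * math.sin(2 * math.pi * 220 * i / sr) for i in range(n)]
    freq = dominant_frequency(frame, sr)
    print(f"{freq:.1f} Hz -> {frequency_to_note(freq)}")
```

Picking the single strongest peak is a simplification: on real guitar recordings an overtone can be louder than the fundamental, so a practical system would likely need a more robust pitch estimator and an FFT library rather than this direct DFT.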
Related papers
- Audio-to-Score Conversion Model Based on Whisper methodology [0.0]
This thesis innovatively introduces the "Orpheus' Score", a custom notation system that converts music information into tokens.
Experiments show that compared to traditional algorithms, the model has significantly improved accuracy and performance.
arXiv Detail & Related papers (2024-10-22T17:31:37Z)
- Toward a More Complete OMR Solution [49.74172035862698]
Optical music recognition aims to convert music notation into digital formats.
One approach to tackle OMR is through a multi-stage pipeline, where the system first detects visual music notation elements in the image.
We introduce a music object detector based on YOLOv8, which improves detection performance.
Second, we introduce a supervised training pipeline that completes the notation assembly stage based on detection output.
arXiv Detail & Related papers (2024-08-31T01:09:12Z)
- MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling [6.150307957212576]
We introduce a novel deep learning solution to symbolic guitar tablature estimation.
We train an encoder-decoder Transformer model in a masked language modeling paradigm to assign notes to strings.
The model is first pre-trained on DadaGP, a dataset of over 25K tablatures, and then fine-tuned on a curated set of professionally transcribed guitar performances.
arXiv Detail & Related papers (2024-08-09T12:25:23Z)
- From MIDI to Rich Tablatures: an Automatic Generative System incorporating Lead Guitarists' Fingering and Stylistic choices [42.362388367152256]
We propose a system that can generate, from simple MIDI melodies, tablatures enriched by fingerings, articulations, and expressive techniques.
The quality of the tablatures derived and the high configurability of the proposed approach can have several impacts.
arXiv Detail & Related papers (2024-07-12T07:18:24Z)
- MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
arXiv Detail & Related papers (2024-07-05T08:08:22Z)
- Modeling Bends in Popular Music Guitar Tablatures [49.64902130083662]
Tablature notation is widely used in popular music to transcribe and share guitar musical content.
This paper focuses on bends, which make it possible to progressively shift the pitch of a note, thereby circumventing the physical limitations of the discrete fretted fingerboard.
Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree successfully predicts bend occurrences with an F1 score of 0.71 and a limited amount of false positive predictions.
arXiv Detail & Related papers (2023-08-22T07:50:58Z)
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining [52.191658157204856]
This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample.
The proposed two-stage method uses contrastive learning to pretrain the audio representation model.
Experiments show that our method outperforms the state-of-the-art methods using contrastive learning or self-supervised classification.
arXiv Detail & Related papers (2023-04-07T11:08:31Z)
- GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers [14.025337055088102]
We use the DadaGP dataset for guitar tab music generation, a corpus of over 26k songs in GuitarPro and token formats.
We introduce methods to condition a Transformer-XL deep learning model to generate guitar tabs based on desired instrumentation and genre.
Results indicate that the GTR-CTRL methods provide more flexibility and control for guitar-focused symbolic music generation than an unconditioned model.
arXiv Detail & Related papers (2023-02-10T17:43:03Z)
- Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z)
- Unaligned Supervision For Automatic Music Transcription in The Wild [1.2183405753834562]
NoteEM is a method for simultaneously training a transcriber and aligning the scores to their corresponding performances.
We report SOTA note-level accuracy on the MAPS dataset, and large favorable margins on cross-dataset evaluations.
arXiv Detail & Related papers (2022-04-28T17:31:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.