A Novel Bi-LSTM And Transformer Architecture For Generating Tabla Music
- URL: http://arxiv.org/abs/2404.05765v1
- Date: Sat, 6 Apr 2024 16:15:02 GMT
- Title: A Novel Bi-LSTM And Transformer Architecture For Generating Tabla Music
- Authors: Roopa Mayya, Vivekanand Venkataraman, Anwesh P R, Narayana Darapaneni,
- Abstract summary: Methods for generating classical Indian music, specifically tabla music, is proposed.
A novel Bi-LSTM with an Attention approach and a transformer model are trained on the extracted features and labels.
- Score: 2.062778014412626
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Introduction: Music generation is a complex task that has received significant attention in recent years, and deep learning techniques have shown promising results in this field. Objectives: While extensive work has been carried out on generating Piano and other Western music, there is limited research on generating classical Indian music due to the scarcity of Indian music in machine-encoded formats. In this technical paper, methods for generating classical Indian music, specifically tabla music, is proposed. Initially, this paper explores piano music generation using deep learning architectures. Then the fundamentals are extended to generating tabla music. Methods: Tabla music in waveform (.wav) files are pre-processed using the librosa library in Python. A novel Bi-LSTM with an Attention approach and a transformer model are trained on the extracted features and labels. Results: The models are then used to predict the next sequences of tabla music. A loss of 4.042 and MAE of 1.0814 are achieved with the Bi-LSTM model. With the transformer model, a loss of 55.9278 and MAE of 3.5173 are obtained for tabla music generation. Conclusion: The resulting music embodies a harmonious fusion of novelty and familiarity, pushing the limits of music composition to new horizons.
Related papers
- MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT [44.204383306879095]
We propose a novel symbolic music representation and Generative Adversarial Network (GAN) framework specially designed for symbolic multitrack music generation.
To build a robust multitrack music generator, we fine-tune a pre-trained MusicBERT model to serve as the discriminator, and incorporate relativistic standard loss.
arXiv Detail & Related papers (2024-09-02T03:18:56Z) - MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z) - Video2Music: Suitable Music Generation from Videos using an Affective
Multimodal Transformer model [32.801213106782335]
We develop a generative music AI framework, Video2Music, that can match a provided video.
In a thorough experiment, we show that our proposed framework can generate music that matches the video content in terms of emotion.
arXiv Detail & Related papers (2023-11-02T03:33:00Z) - From West to East: Who can understand the music of the others better? [91.78564268397139]
We leverage transfer learning methods to derive insights about similarities between different music cultures.
We use two Western music datasets, two traditional/folk datasets coming from eastern Mediterranean cultures, and two datasets belonging to Indian art music.
Three deep audio embedding models are trained and transferred across domains, including two CNN-based and a Transformer-based architecture, to perform auto-tagging for each target domain dataset.
arXiv Detail & Related papers (2023-07-19T07:29:14Z) - MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 public-available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z) - Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z) - Melody transcription via generative pre-training [86.08508957229348]
Key challenge in melody transcription is building methods which can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing $50$ hours of melody transcriptions from crowdsourced annotations of broad music.
arXiv Detail & Related papers (2022-12-04T18:09:23Z) - A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, summing in around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata information with regards to instrumentation, geography and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z) - DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models [25.15855175804765]
DadaGP is a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres.
DadaGP is released with an encoder/decoder which converts GuitarPro files to tokens and back.
We present results of a use case in which DadaGP is used to train a Transformer-based model to generate new songs in GuitarPro format.
arXiv Detail & Related papers (2021-07-30T14:21:36Z) - Using a Bi-directional LSTM Model with Attention Mechanism trained on
MIDI Data for Generating Unique Music [0.25559196081940677]
This paper proposes a bi-directional LSTM model with attention mechanism capable of generating similar type of music based on MIDI data.
The music generated by the model follows the theme/style of the music the model is trained on.
arXiv Detail & Related papers (2020-11-02T06:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.