DeepSRGM -- Sequence Classification and Ranking in Indian Classical
Music with Deep Learning
- URL: http://arxiv.org/abs/2402.10168v1
- Date: Thu, 15 Feb 2024 18:11:02 GMT
- Title: DeepSRGM -- Sequence Classification and Ranking in Indian Classical
Music with Deep Learning
- Authors: Sathwik Tejaswi Madhusudhan and Girish Chowdhary
- Abstract summary: Raga is a melodic framework for compositions and improvisations alike.
Raga Recognition is an important music information retrieval task in Indian Classical Music.
We propose a deep learning based approach to Raga recognition.
- Score: 7.140656816182373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a
melodic framework for compositions and improvisations alike. Raga Recognition
is an important music information retrieval task in ICM as it can aid numerous
downstream applications ranging from music recommendations to organizing huge
music collections. In this work, we propose a deep learning based approach to
Raga recognition. Our approach employs efficient preprocessing and learns
temporal sequences in music data using Long Short Term Memory based Recurrent
Neural Networks (LSTM-RNN). We train and test the network on smaller sequences
sampled from the original audio, while the final inference is performed on the
audio as a whole. Our method achieves accuracies of 88.1% and 97% during
inference on the CompMusic Carnatic dataset and its 10-Raga subset,
respectively, making it the state of the art for the Raga recognition task. Our
approach also enables sequence ranking, which aids in retrieving melodic
patterns closely related to a given query sequence from a music database.
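The train-on-snippets, infer-on-whole-recording scheme described in the abstract can be outlined as follows. This is a minimal illustration, not the authors' implementation: the helper names `sample_subsequences` and `infer_whole_audio` are hypothetical, and the paper's LSTM-RNN is stood in for here by an arbitrary `classify` callable returning per-Raga scores.

```python
import random

def sample_subsequences(sequence, length, count, seed=0):
    """Draw fixed-length training windows at random offsets from one recording."""
    rng = random.Random(seed)
    starts = [rng.randrange(0, len(sequence) - length + 1) for _ in range(count)]
    return [sequence[s:s + length] for s in starts]

def infer_whole_audio(sequence, length, classify):
    """Slide over the full recording, classify each window, and sum the scores."""
    totals = {}
    for s in range(0, len(sequence) - length + 1, length):
        for label, score in classify(sequence[s:s + length]).items():
            totals[label] = totals.get(label, 0.0) + score
    return max(totals, key=totals.get)
```

Aggregating per-window scores over the full recording is one common way to reconcile snippet-level training with whole-audio prediction.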
Related papers
- Carnatic Raga Identification System using Rigorous Time-Delay Neural Network [0.0]
Large-scale machine learning-based Raga identification remains a nontrivial problem in the computational analysis of Carnatic music.
In this paper, the input sound is analyzed through a combination of steps, including a Discrete Fourier Transform and triangular filtering, to create custom bins of possible notes.
The goal of this program is to effectively and efficiently label a much wider range of audio clips, across more shrutis and ragas, and with more background noise.
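The DFT-plus-triangular-filtering step can be sketched as below. This is a hedged approximation assuming 12-tone equal-tempered note centers; the paper's shruti-aware bins may differ, and the names `triangular_filterbank` and `note_energies` are illustrative, not from the paper.

```python
import numpy as np

def note_frequencies(base=220.0, n_notes=12):
    # Equal-tempered note centers; an assumption, not the paper's shruti grid.
    return base * 2 ** (np.arange(n_notes) / 12)

def triangular_filterbank(fft_freqs, centers):
    # One triangular filter per note, spanning the neighboring semitone edges.
    edges = np.concatenate(([centers[0] / 2 ** (1 / 12)], centers,
                            [centers[-1] * 2 ** (1 / 12)]))
    bank = np.zeros((len(centers), len(fft_freqs)))
    for i in range(len(centers)):
        lo, c, hi = edges[i], edges[i + 1], edges[i + 2]
        rise = (fft_freqs - lo) / (c - lo)
        fall = (hi - fft_freqs) / (hi - c)
        bank[i] = np.clip(np.minimum(rise, fall), 0, None)
    return bank

def note_energies(signal, sr):
    # Sum DFT magnitudes under each triangular filter to get per-note energy.
    mags = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1 / sr)
    return triangular_filterbank(freqs, note_frequencies()) @ mags
```

A pure tone at a note center should then dominate that note's bin.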
arXiv Detail & Related papers (2024-05-25T01:31:58Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- N-Gram Unsupervised Compoundation and Feature Injection for Better Symbolic Music Understanding [27.554853901252084]
Music sequences exhibit strong correlations between adjacent elements, making them prime candidates for N-gram techniques from Natural Language Processing (NLP)
In this paper, we propose a novel method, NG-Midiformer, for understanding symbolic music sequences that leverages the N-gram approach.
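One simple reading of unsupervised N-gram compoundation, in which frequent adjacent token pairs are merged into compound tokens before modeling, can be sketched as follows. The helper names are hypothetical and the paper's actual procedure may differ:

```python
from collections import Counter

def top_bigrams(sequences, k):
    """Count adjacent token pairs across a corpus and keep the k most frequent."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return {pair for pair, _ in counts.most_common(k)}

def compound(seq, merges):
    """Greedily merge selected adjacent pairs into single compound tokens."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) in merges:
            out.append(seq[i] + "_" + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out
```

The compounded vocabulary can then be fed to a downstream encoder in place of raw tokens.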
arXiv Detail & Related papers (2023-12-13T06:08:37Z)
- Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems [3.997809845676912]
We show that self-supervised contrastive learning can mitigate the scarcity of annotated data from real music content.
We employ the snippet embeddings in the higher-level task of cross-modal piece identification.
In this work, we observe that the retrieval quality improves from 30% up to 100% when real music data is present.
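A generic cross-modal contrastive objective of the kind such retrieval systems are trained with can be written as an InfoNCE-style loss over matched audio and sheet-music snippet embeddings. This is a standard formulation offered for illustration, not code from the paper:

```python
import numpy as np

def info_nce(audio_emb, sheet_emb, temperature=0.1):
    """Contrastive loss: matched rows are positives, all other rows negatives."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    s = sheet_emb / np.linalg.norm(sheet_emb, axis=1, keepdims=True)
    logits = a @ s.T / temperature
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diag(probs)).mean()
```

Minimizing this pulls corresponding audio and sheet snippets together in the shared embedding space while pushing mismatched pairs apart.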
arXiv Detail & Related papers (2023-09-21T14:54:48Z)
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, summing to around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata regarding instrumentation, geography, and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
- Multi-task Learning with Metadata for Music Mood Classification [0.0]
Mood recognition is an important problem in music informatics and has key applications in music discovery and recommendation.
We propose a multi-task learning approach in which a shared model is simultaneously trained for mood and metadata prediction tasks.
Applying our technique on the existing state-of-the-art convolutional neural networks for mood classification improves their performances consistently.
arXiv Detail & Related papers (2021-10-10T11:36:34Z)
- Unsupervised Learning of Deep Features for Music Segmentation [8.528384027684192]
Music segmentation is a problem of identifying boundaries between, and labeling, distinct music segments.
The performance of a range of music segmentation algorithms has been dependent on the audio features chosen to represent the audio.
In this work, unsupervised training of deep feature embeddings using convolutional neural networks (CNNs) is explored for music segmentation.
arXiv Detail & Related papers (2021-08-30T01:55:44Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z)
- Deep Learning for MIR Tutorial [68.8204255655161]
The tutorial covers a wide range of MIR relevant deep learning approaches.
Convolutional Neural Networks are currently a de-facto standard for deep learning based audio retrieval.
Siamese Networks have been shown effective in learning audio representations and distance functions specific to music similarity retrieval.
arXiv Detail & Related papers (2020-01-15T12:23:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.