Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach
- URL: http://arxiv.org/abs/2408.00473v1
- Date: Thu, 1 Aug 2024 11:23:42 GMT
- Title: Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach
- Authors: Pedro Ramoneda, Vsevolod Eremenko, Alexandre D'Hooge, Emilia Parada-Cabaleiro, Xavier Serra,
- Abstract summary: Estimating music piece difficulty is important for organizing educational music collections.
Our work employs explainable descriptors for difficulty estimation in symbolic music representations.
Our approach, evaluated in piano repertoire categorized in 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7.
- Score: 49.2787113554916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating music piece difficulty is important for organizing educational music collections. This process could be partially automatized to facilitate the educator's role. Nevertheless, the decisions performed by prevalent deep-learning models are hardly understandable, which may impair the acceptance of such a technology in music education curricula. Our work employs explainable descriptors for difficulty estimation in symbolic music representations. Furthermore, through a novel parameter-efficient white-box model, we outperform previous efforts while delivering interpretable results. These comprehensible outcomes emulate the functionality of a rubric, a tool widely used in music education. Our approach, evaluated in piano repertoire categorized in 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7, showing precise difficulty estimation. Through our baseline, we illustrate how building on top of past research can offer alternatives for music difficulty assessment which are explainable and interpretable. With this, we aim to promote a more effective communication between the Music Information Retrieval (MIR) community and the music education one.
Related papers
- Evaluation of pretrained language models on music understanding [0.0]
We demonstrate that Large Language Models (LLM) suffer from 1) prompt sensitivity, 2) inability to model negation, and 3) sensitivity towards the presence of specific words.
We quantified these properties as a triplet-based accuracy, evaluating the ability to model the relative similarity of labels in a hierarchical ontology.
Despite the relatively high accuracy reported, inconsistencies are evident in all six models, suggesting that off-the-shelf LLMs need adaptation to music before use.
arXiv Detail & Related papers (2024-09-17T14:44:49Z) - Self-Supervised Contrastive Learning for Robust Audio-Sheet Music
Retrieval Systems [3.997809845676912]
We show that self-supervised contrastive learning can mitigate the scarcity of annotated data from real music content.
We employ the snippet embeddings in the higher-level task of cross-modal piece identification.
In this work, we observe that the retrieval quality improves from 30% up to 100% when real music data is present.
arXiv Detail & Related papers (2023-09-21T14:54:48Z) - Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music
Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z) - Music Instrument Classification Reprogrammed [79.68916470119743]
"Reprogramming" is a technique that utilizes pre-trained deep and complex neural networks originally targeting a different task by modifying and mapping both the input and output of the pre-trained model.
We demonstrate that reprogramming can effectively leverage the power of the representation learned for a different task and that the resulting reprogrammed system can perform on par or even outperform state-of-the-art systems at a fraction of training parameters.
arXiv Detail & Related papers (2022-11-15T18:26:01Z) - Supervised and Unsupervised Learning of Audio Representations for Music
Understanding [9.239657838690226]
We show how the domain of pre-training datasets affects the adequacy of the resulting audio embeddings for downstream tasks.
We show that models trained via supervised learning on large-scale expert-annotated music datasets achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-10-07T20:07:35Z) - Contrastive Learning with Positive-Negative Frame Mask for Music
Representation [91.44187939465948]
This paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR.
We devise a novel contrastive learning objective to accommodate both self-augmented positives/negatives sampled from the same music.
arXiv Detail & Related papers (2022-03-17T07:11:42Z) - Learning music audio representations via weak language supervision [14.335950077921435]
We design a multimodal architecture for music and language pre-training (MuLaP) optimised via a set of proxy tasks.
weak supervision is provided in the form of noisy natural language descriptions conveying the overall musical content of the track.
We demonstrate the usefulness of our approach by comparing the performance of audio representations produced by the same audio backbone with different training strategies.
arXiv Detail & Related papers (2021-12-08T10:30:52Z) - Multi-task Learning with Metadata for Music Mood Classification [0.0]
Mood recognition is an important problem in music informatics and has key applications in music discovery and recommendation.
We propose a multi-task learning approach in which a shared model is simultaneously trained for mood and metadata prediction tasks.
Applying our technique on the existing state-of-the-art convolutional neural networks for mood classification improves their performances consistently.
arXiv Detail & Related papers (2021-10-10T11:36:34Z) - Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach [74.09517278785519]
We introduce a grading function that evaluates four-part chorales in the style of J.S. Bach along important musical features.
We show that the function is both interpretable and outperforms human experts at discriminating Bach chorales from model-generated ones.
arXiv Detail & Related papers (2020-06-23T21:02:55Z) - Audio Impairment Recognition Using a Correlation-Based Feature
Representation [85.08880949780894]
We propose a new representation of hand-crafted features that is based on the correlation of feature pairs.
We show superior performance in terms of compact feature dimensionality and improved computational speed in the test stage.
arXiv Detail & Related papers (2020-03-22T13:34:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.