A Hierarchical Deep Learning Approach for Minority Instrument Detection
- URL: http://arxiv.org/abs/2506.21167v1
- Date: Thu, 26 Jun 2025 11:56:11 GMT
- Title: A Hierarchical Deep Learning Approach for Minority Instrument Detection
- Authors: Dylan Sechet, Francesca Bugiotti, Matthieu Kowalski, Edouard d'Hérouville, Filip Langiewicz
- Abstract summary: This work presents strategies to integrate hierarchical structures into models and tests a new class of models for hierarchical music prediction. This study showcases more reliable coarse-level instrument detection by bridging the gap between detailed instrument identification and group-level recognition.
- Score: 2.0971479389679337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying instrument activities within audio excerpts is vital in music information retrieval, with significant implications for music cataloging and discovery. Prior deep learning endeavors in musical instrument recognition have predominantly emphasized instrument classes with ample data availability. Recent studies have demonstrated the applicability of hierarchical classification in detecting instrument activities in orchestral music, even with limited fine-grained annotations at the instrument level. Based on the Hornbostel-Sachs classification, such a hierarchical classification system is evaluated using the MedleyDB dataset, renowned for its diversity and richness concerning various instruments and music genres. This work presents various strategies to integrate hierarchical structures into models and tests a new class of models for hierarchical music prediction. This study showcases more reliable coarse-level instrument detection by bridging the gap between detailed instrument identification and group-level recognition, paving the way for further advancements in this domain.
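The abstract does not include code; as a minimal sketch of how such a two-level setup can be wired, the snippet below (PyTorch, with illustrative layer sizes, family/instrument counts, and loss weighting, none of which come from the paper) attaches a coarse Hornbostel-Sachs family head and a fine instrument head to a shared backbone and supervises both jointly:

```python
import torch
import torch.nn as nn

class HierarchicalTagger(nn.Module):
    """Two-level multi-label tagger: a shared backbone feeds a coarse head
    (Hornbostel-Sachs families) and a fine head (individual instruments).
    Layer sizes and class counts are illustrative, not the paper's."""

    def __init__(self, n_features=128, n_families=5, n_instruments=20):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.family_head = nn.Linear(128, n_families)         # coarse level
        self.instrument_head = nn.Linear(128, n_instruments)  # fine level

    def forward(self, x):
        h = self.backbone(x)
        return self.family_head(h), self.instrument_head(h)

def hierarchical_loss(family_logits, inst_logits, family_y, inst_y, alpha=0.5):
    """Weighted sum of two multi-label BCE terms; alpha trades off coarse-
    versus fine-level supervision (a common choice, assumed here)."""
    bce = nn.functional.binary_cross_entropy_with_logits
    return alpha * bce(family_logits, family_y) + (1 - alpha) * bce(inst_logits, inst_y)
```

Sharing the backbone lets scarce fine-grained labels and more plentiful coarse labels supervise the same representation, which is the gap-bridging idea the abstract describes.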
Related papers
- Progressive Rock Music Classification [0.0]
This study investigates the classification of progressive rock music, a genre characterized by complex compositions and diverse instrumentation.
We extracted comprehensive audio features, including spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), chromagrams, and beat positions from song snippets.
A winner-take-all voting strategy was employed to aggregate snippet-level predictions into final song classifications.
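As an illustration of the pipeline this summary describes, here is a hedged sketch of snippet-level MFCC features plus winner-take-all voting; librosa and all settings are assumptions, not details from the paper:

```python
import numpy as np
import librosa  # assumed tooling; the paper does not name its library here

def snippet_features(path, n_mfcc=13):
    """MFCC summary statistics for one snippet (settings are illustrative)."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def winner_take_all(snippet_predictions):
    """Aggregate per-snippet class predictions into one song label:
    the most frequently predicted class wins."""
    classes, counts = np.unique(snippet_predictions, return_counts=True)
    return classes[np.argmax(counts)]

# e.g. winner_take_all(np.array([1, 0, 1, 1])) -> 1
```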
arXiv Detail & Related papers (2025-04-15T02:48:52Z)
- Music Genre Classification: Ensemble Learning with Subcomponents-level Attention [2.553456266022126]
Music Genre Classification is one of the most popular topics in the fields of Music Information Retrieval (MIR) and digital signal processing.
This letter introduces a novel approach that combines ensemble learning with attention to sub-components, aiming to improve the accuracy of music genre identification.
The proposed method achieves higher accuracy than other state-of-the-art techniques trained and tested on the GTZAN dataset.
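The letter's exact architecture is not given here; one common way to realize "attention to sub-components", sketched below under assumed shapes, is to attention-pool per-sub-component embeddings before classification:

```python
import torch
import torch.nn as nn

class SubComponentAttention(nn.Module):
    """Attention-weighted pooling over per-sub-component embeddings
    (e.g., harmonic/percussive/residual streams); purely a sketch of the
    general idea, not the letter's architecture."""

    def __init__(self, dim=128, n_genres=10):
        super().__init__()
        self.score = nn.Linear(dim, 1)        # one scalar score per component
        self.classifier = nn.Linear(dim, n_genres)

    def forward(self, components):            # (batch, n_components, dim)
        w = torch.softmax(self.score(components), dim=1)
        pooled = (w * components).sum(dim=1)   # attention-weighted average
        return self.classifier(pooled)
```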
arXiv Detail & Related papers (2024-12-20T06:50:31Z)
- Low-Data Classification of Historical Music Manuscripts: A Few-Shot Learning Approach [0.0]
We develop a self-supervised learning framework for the classification of musical symbols in historical manuscripts.
To overcome the scarcity of labelled data, we train a neural-based feature extractor on unlabelled data, enabling effective classification with minimal samples.
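A minimal sketch of the few-shot stage, assuming embeddings produced by the frozen self-supervised extractor and a nearest-prototype rule (a standard few-shot baseline, not necessarily the paper's exact classifier):

```python
import numpy as np

def prototypes(support_embeddings, support_labels):
    """Mean embedding per class, computed from the few labelled examples."""
    return {c: support_embeddings[support_labels == c].mean(axis=0)
            for c in np.unique(support_labels)}

def classify(query_embedding, protos):
    """Nearest-prototype rule: assign the class whose prototype is closest.
    The frozen extractor producing the embeddings is assumed, as in the
    paper's low-data setting."""
    return min(protos, key=lambda c: np.linalg.norm(query_embedding - protos[c]))
```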
arXiv Detail & Related papers (2024-11-25T14:14:25Z)
- Improving Musical Instrument Classification with Advanced Machine Learning Techniques [0.0]
Recent advances in machine learning, specifically deep learning, have enhanced the capability to identify and classify musical instruments from audio signals.
This study applies various machine learning methods, including Naive Bayes, Support Vector Machines, Random Forests, and boosting techniques such as AdaBoost and XGBoost.
The effectiveness of these methods is evaluated on the NSynth dataset, a large repository of annotated musical sounds.
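A hedged sketch of such a comparison with scikit-learn (plus the third-party xgboost package), using default hyperparameters where the study presumably tuned its own:

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier  # third-party package, assumed available

def compare_models(X, y):
    """5-fold mean accuracy for each classifier family named in the summary.
    X: precomputed audio features, y: instrument labels (e.g., from NSynth)."""
    models = {
        "naive_bayes": GaussianNB(),
        "svm": SVC(),
        "random_forest": RandomForestClassifier(),
        "adaboost": AdaBoostClassifier(),
        "xgboost": XGBClassifier(),
    }
    return {name: cross_val_score(m, X, y, cv=5).mean()
            for name, m in models.items()}
```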
arXiv Detail & Related papers (2024-11-01T00:13:46Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair, standardized assessment of all open-source pre-trained models developed on music recordings as baselines.
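Representation benchmarks of this kind are typically run as probing: freeze the pre-trained model and train only a shallow classifier on its embeddings. A minimal sketch assuming that protocol, rather than MARBLE's exact per-task setup:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def linear_probe(train_emb, train_y, test_emb, test_y):
    """Freeze the pre-trained model upstream; fit only this shallow probe
    on its embeddings and report downstream accuracy (assumed protocol)."""
    probe = LogisticRegression(max_iter=1000)
    probe.fit(train_emb, train_y)
    return accuracy_score(test_y, probe.predict(test_emb))
```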
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance [88.0355290619761]
This work focuses on the separation of unknown musical instruments.
We propose the Separation-with-Consistency (SeCo) framework, which can accomplish the separation on unknown categories.
Our framework exhibits strong adaptation ability on the novel musical categories and outperforms the baseline methods by a significant margin.
arXiv Detail & Related papers (2022-03-25T09:42:11Z)
- Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition [9.768677073327423]
We exploit hierarchical relationships between instruments in a few-shot learning setup to enable classification of a wider set of musical instruments.
Compared to a non-hierarchical few-shot baseline, our method leads to a significant increase in classification accuracy and a significant decrease in mistake severity on instrument classes unseen in training.
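"Mistake severity" is naturally measured as distance in the instrument hierarchy; a minimal sketch using an assumed two-level toy hierarchy:

```python
# Confusing a violin with a viola (same family) is less severe than
# confusing it with a trumpet. This toy hierarchy is an illustrative
# assumption, not the paper's taxonomy.
HIERARCHY = {
    "violin": "strings", "viola": "strings",
    "trumpet": "brass", "trombone": "brass",
}

def mistake_severity(predicted, actual):
    """0 = correct, 1 = wrong instrument but right family, 2 = wrong family."""
    if predicted == actual:
        return 0
    return 1 if HIERARCHY[predicted] == HIERARCHY[actual] else 2

assert mistake_severity("violin", "viola") == 1
assert mistake_severity("violin", "trumpet") == 2
```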
arXiv Detail & Related papers (2021-07-14T22:50:24Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
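A minimal sketch of such a comparison, with the memory cell left swappable so LSTM, GRU, and vanilla RNN variants share the rest of the model (sizes and token vocabulary are illustrative assumptions):

```python
import torch
import torch.nn as nn

def build_generator(cell="lstm", vocab=128, dim=256):
    """Next-token music generator whose memory cell is swappable, mirroring
    the paper's comparison of memory mechanisms."""
    rnn_cls = {"lstm": nn.LSTM, "gru": nn.GRU, "rnn": nn.RNN}[cell]
    return nn.ModuleDict({
        "embed": nn.Embedding(vocab, dim),
        "rnn": rnn_cls(dim, dim, batch_first=True),
        "head": nn.Linear(dim, vocab),
    })

def step(model, tokens):                       # tokens: (batch, seq_len)
    h, _ = model["rnn"](model["embed"](tokens))
    return model["head"](h)                    # logits over the next token
```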
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z)
- Visual Attention for Musical Instrument Recognition [72.05116221011949]
We explore the use of an attention mechanism in a timbral-temporal sense, à la visual attention, to improve the performance of musical instrument recognition.
The first approach applies the attention mechanism to the sliding-window paradigm, where a prediction based on each 'timbral-temporal instance' is given an attention weight before aggregation into the final prediction.
The second approach is based on a recurrent model of visual attention, where the network attends only to parts of the spectrogram and decides where to attend next, given a limited number of 'glimpses'.
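A hedged sketch of the first, sliding-window approach, under assumed shapes: each window gets a prediction and an attention score, and the clip-level output is the attention-weighted aggregate:

```python
import torch
import torch.nn as nn

class WindowedAttention(nn.Module):
    """Sketch of attention over sliding-window instances: per-window
    multi-label predictions are aggregated with softmax attention weights.
    Dimensions and class counts are illustrative assumptions."""

    def __init__(self, dim=128, n_instruments=20):
        super().__init__()
        self.window_classifier = nn.Linear(dim, n_instruments)
        self.attn = nn.Linear(dim, 1)

    def forward(self, windows):                # (batch, n_windows, dim)
        preds = torch.sigmoid(self.window_classifier(windows))
        w = torch.softmax(self.attn(windows), dim=1)
        return (w * preds).sum(dim=1)          # attention-weighted aggregate
```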
arXiv Detail & Related papers (2020-06-17T03:56:44Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)