Deep Neural Network for Musical Instrument Recognition using MFCCs
- URL: http://arxiv.org/abs/2105.00933v2
- Date: Wed, 5 May 2021 13:32:28 GMT
- Title: Deep Neural Network for Musical Instrument Recognition using MFCCs
- Authors: Saranga Kingkor Mahanta, Abdullah Faiz Ur Rahman Khilji, Partha Pakray
- Abstract summary: Musical instrument recognition is the task of instrument identification by virtue of its audio.
In this paper, we use an artificial neural network (ANN) model that was trained to perform classification on twenty different classes of musical instruments.
- Score: 0.6445605125467573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of efficient automatic music classification is of vital importance
and forms the basis for various advanced applications of AI in the musical
domain. Musical instrument recognition is the task of instrument identification
by virtue of its audio. The model leverages this audio, i.e., the sound
vibrations produced by the instrument, to match it against the instrument
classes. In this paper, we use an artificial neural network (ANN) model
trained to perform classification on twenty different classes of musical
instruments, using only the mel-frequency cepstral coefficients (MFCCs) of
the audio data. Our proposed model trains on the full London Philharmonic
Orchestra dataset, which contains twenty classes of instruments belonging to
four families, viz. woodwinds, brass, percussion, and strings. Experimental
results show that our model achieves state-of-the-art accuracy on this
dataset.
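As a concrete illustration of the pipeline the abstract describes, the sketch below extracts a mean MFCC vector per clip with librosa and trains a small dense network with scikit-learn. It is a minimal sketch, not the authors' implementation: the directory layout, the 13 coefficients, and the MLPClassifier architecture are all assumptions.

```python
# Minimal sketch of an MFCC-based instrument classifier.
# Assumes a hypothetical layout of <root>/<instrument>/<clip>.wav;
# hyperparameters are illustrative, not the paper's exact settings.
from pathlib import Path

import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def mfcc_features(path, n_mfcc=13):
    """Summarise a clip as its mean MFCC vector over time."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return mfcc.mean(axis=1)                                # (n_mfcc,)

root = Path("london_phil")  # hypothetical dataset root
X, y = [], []
for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    for clip in class_dir.glob("*.wav"):
        X.append(mfcc_features(clip))
        y.append(class_dir.name)

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(y), test_size=0.2, stratify=y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```

Averaging MFCCs over time discards temporal detail; stacking frame-level coefficients or their deltas is a common refinement when a single mean vector underfits.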
Related papers
- Improving Musical Instrument Classification with Advanced Machine Learning Techniques [0.0]
Recent advances in machine learning, specifically deep learning, have enhanced the capability to identify and classify musical instruments from audio signals.
This study applies various machine learning methods, including Naive Bayes, Support Vector Machines, Random Forests, and boosting techniques such as AdaBoost and XGBoost.
The effectiveness of these methods is evaluated on the NSynth dataset, a large repository of annotated musical sounds.
arXiv Detail & Related papers (2024-11-01T00:13:46Z)
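As a hedged sketch of the bake-off described above, the snippet below cross-validates the named classical models on a feature matrix; the random `X` and `y` merely stand in for real NSynth-derived features (e.g., mean MFCCs), and none of the hyperparameters are the study's.

```python
# Hypothetical comparison of the classical models named above.
# X and y are placeholders for real audio features and labels.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from xgboost import XGBClassifier  # assumes the xgboost package is installed

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))  # placeholder feature matrix
y = LabelEncoder().fit_transform(rng.choice(["flute", "organ", "guitar"], 300))

models = {
    "naive_bayes": GaussianNB(),
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "adaboost": AdaBoostClassifier(n_estimators=100),
    "xgboost": XGBClassifier(n_estimators=200, eval_metric="mlogloss"),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```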
- Music Genre Classification: Training an AI model [0.0]
Music genre classification is an area that utilizes machine learning models and techniques for the processing of audio signals.
In this research, I explore various machine learning algorithms for music genre classification, using features extracted from audio signals.
I aim to assess the robustness of machine learning models for genre classification and to compare their results.
arXiv Detail & Related papers (2024-05-23T23:07:01Z)
- Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis [15.670399197114012]
We propose enhancing control of multi-instrument synthesis by conditioning a generative model on a specific performance and recording environment.
Performance conditioning is a tool that directs the generative model to synthesize music with the style and timbre of specific instruments taken from specific performances.
Our prototype is evaluated on uncurated performances with diverse instrumentation and achieves state-of-the-art FAD realism scores.
arXiv Detail & Related papers (2023-09-21T17:44:57Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair and standard assessment of representations from all open-source pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
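MARBLE's unified protocol reads naturally as representation probing: freeze the pre-trained model and fit a lightweight classifier on its embeddings for each task. The sketch below shows that pattern with a logistic-regression probe; the random arrays stand in for real embeddings, and this is an assumed reading of the protocol, not the benchmark's actual harness.

```python
# Sketch of frozen-representation probing: a light probe is fit per
# downstream task on embeddings from a fixed pre-trained model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def probe_score(train_emb, train_y, test_emb, test_y):
    """Fit a linear probe on frozen embeddings; report test accuracy."""
    probe = LogisticRegression(max_iter=1000)
    probe.fit(train_emb, train_y)
    return accuracy_score(test_y, probe.predict(test_emb))

rng = np.random.default_rng(0)
train_emb = rng.normal(size=(200, 768))  # placeholder frozen embeddings
test_emb = rng.normal(size=(50, 768))
train_y = rng.integers(0, 10, 200)       # placeholder task labels
test_y = rng.integers(0, 10, 50)
print(f"probe accuracy: {probe_score(train_emb, train_y, test_emb, test_y):.3f}")
```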
- MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training [74.32603591331718]
We propose an acoustic Music undERstanding model with large-scale self-supervised Training (MERT), which incorporates teacher models to provide pseudo labels for masked-language-modelling (MLM) style acoustic pre-training.
Experimental results indicate that our model can generalise and perform well on 14 music understanding tasks and attain state-of-the-art (SOTA) overall scores.
arXiv Detail & Related papers (2023-05-31T18:27:43Z)
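A toy rendering of the objective summarised above: mask a fraction of acoustic frames and train a student encoder to predict a teacher's discrete pseudo-labels at the masked positions, MLM-style. The tiny Transformer, zero-out masking, and codebook size below are illustrative assumptions, not MERT's configuration.

```python
# Toy MLM-style acoustic pre-training step with teacher pseudo-labels.
import torch
import torch.nn as nn

batch, frames, dim, codebook = 4, 100, 64, 512
features = torch.randn(batch, frames, dim)                   # acoustic frames
pseudo_labels = torch.randint(0, codebook, (batch, frames))  # from a teacher model

mask = torch.rand(batch, frames) < 0.3  # mask ~30% of frames
masked = features.clone()
masked[mask] = 0.0                      # simple zero-out masking

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(dim, codebook)

logits = head(encoder(masked))          # (batch, frames, codebook)
loss = nn.functional.cross_entropy(     # loss only on masked positions
    logits[mask], pseudo_labels[mask])
loss.backward()
print(f"masked-prediction loss: {loss.item():.3f}")
```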
- Timbre Classification of Musical Instruments with a Deep Learning Multi-Head Attention-Based Model [1.7188280334580197]
The aim of this work is to define a model that is able to identify different instrument timbres with as few parameters as possible.
The model is able to classify instruments by timbre even when they play the same note at the same intensity.
arXiv Detail & Related papers (2021-07-13T16:34:19Z)
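In the same spirit as the model above, a compact attention-based timbre classifier can be sketched as self-attention over spectrogram frames, followed by temporal pooling and a linear head. The single attention layer and the dimensions are assumptions, not the paper's architecture.

```python
# Minimal attention-based timbre classifier over spectrogram frames.
import torch
import torch.nn as nn

class AttentionTimbreClassifier(nn.Module):
    def __init__(self, n_mels=128, n_heads=4, n_classes=20):
        super().__init__()
        self.attn = nn.MultiheadAttention(n_mels, n_heads, batch_first=True)
        self.classifier = nn.Linear(n_mels, n_classes)

    def forward(self, spec):                       # spec: (batch, frames, n_mels)
        attended, _ = self.attn(spec, spec, spec)  # self-attention over frames
        pooled = attended.mean(dim=1)              # average over time
        return self.classifier(pooled)

model = AttentionTimbreClassifier()
spec = torch.randn(8, 200, 128)  # dummy batch of mel spectrograms
print(model(spec).shape)         # torch.Size([8, 20])
```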
- Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music [33.679951600368405]
We study the feasibility of automatic instrumentation -- dynamically assigning instruments to notes in solo music during performance.
In addition to the online, real-time-capable setting for performative use cases, automatic instrumentation can also find applications in assistive composing tools in an offline setting.
We frame the task of part separation as a sequential multi-class classification problem and adopt machine learning to map sequences of notes into sequences of part labels.
arXiv Detail & Related papers (2021-07-13T08:34:44Z)
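The "sequences of notes in, sequences of part labels out" framing above maps directly onto a sequence tagger. The sketch below uses a small bidirectional LSTM as a stand-in (a truly online setting would need a causal model); the vocabulary sizes and shapes are illustrative assumptions.

```python
# Toy sequence tagger: each note in a sequence receives a part label.
import torch
import torch.nn as nn

n_pitches, n_parts, emb_dim, hidden = 128, 4, 32, 64

class PartTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(n_pitches, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_parts)

    def forward(self, notes):       # notes: (batch, seq) pitch indices
        h, _ = self.lstm(self.embed(notes))
        return self.out(h)          # (batch, seq, n_parts) per-note logits

tagger = PartTagger()
notes = torch.randint(0, n_pitches, (2, 16))  # dummy note sequences
labels = torch.randint(0, n_parts, (2, 16))   # dummy part assignments
logits = tagger(notes)
loss = nn.functional.cross_entropy(logits.reshape(-1, n_parts), labels.reshape(-1))
print(f"loss: {loss.item():.3f}")
```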
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- Fast accuracy estimation of deep learning based multi-class musical source separation [79.10962538141445]
We propose a method to evaluate the separability of instruments in any dataset without training and tuning a neural network.
Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy for estimating the separation performance of state-of-the-art deep learning approaches.
arXiv Detail & Related papers (2020-10-19T13:05:08Z)
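The oracle idea above is easy to make concrete: when ground-truth stems are available, an ideal ratio mask (IRM) built from the stems' magnitude spectrograms bounds what any mask-based separator could achieve, with no training at all. The sketch uses the common magnitude-ratio form of the mask; the STFT settings are assumptions, not necessarily the paper's choices.

```python
# Oracle separation with ideal ratio masks built from ground-truth stems.
import numpy as np
import librosa

def irm_separate(stems, n_fft=2048, hop=512, eps=1e-8):
    """Apply per-source ideal ratio masks to the mixture spectrogram."""
    specs = [librosa.stft(s, n_fft=n_fft, hop_length=hop) for s in stems]
    mix = sum(specs)
    total_mag = sum(np.abs(s) for s in specs) + eps
    estimates = []
    for spec in specs:
        mask = np.abs(spec) / total_mag  # ratio mask in [0, 1]
        estimates.append(librosa.istft(mask * mix, hop_length=hop))
    return estimates

# dummy stems; in practice these are a dataset's ground-truth sources
rng = np.random.default_rng(0)
stems = [rng.standard_normal(22050).astype(np.float32) for _ in range(3)]
estimates = irm_separate(stems)
print([e.shape for e in estimates])
```

Scoring these oracle estimates against the stems (e.g., with SDR) then gives a fast, training-free estimate of how separable a dataset's instruments are.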
- Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method combines an acoustic model, trained for automatic speech recognition, with melody-derived features to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)