Audio Processing using Pattern Recognition for Music Genre Classification
- URL: http://arxiv.org/abs/2410.14990v1
- Date: Sat, 19 Oct 2024 05:44:05 GMT
- Title: Audio Processing using Pattern Recognition for Music Genre Classification
- Authors: Sivangi Chatterjee, Srishti Ganguly, Avik Bose, Hrithik Raj Prasad, Arijit Ghosal
- Abstract summary: This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset.
Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres: Blues, Classical, Jazz, Hip Hop, and Country.
The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%.
- Abstract: This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset, which contains 100 audio files per genre. Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres (Blues, Classical, Jazz, Hip Hop, and Country) using a variety of algorithms including Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and Artificial Neural Networks (ANN) implemented via Keras. The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%. We also analyzed key audio features such as spectral roll-off, spectral centroid, and MFCCs, which helped enhance the model's accuracy. Future work will expand the model to cover all ten genres, investigate advanced methods like Long Short-Term Memory (LSTM) networks and ensemble approaches, and develop a web application for real-time genre classification and playlist generation. This research aims to contribute to improving music recommendation systems and content curation.
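As a rough illustration of the pipeline the abstract describes, the sketch below extracts the named features (MFCCs, spectral centroid, spectral roll-off) with librosa and builds a small Keras ANN. The layer sizes, dropout, and training settings are assumptions for illustration, not the authors' configuration.

# Minimal sketch of the feature extraction and ANN described above.
# Assumptions: librosa for features, illustrative layer sizes.
import numpy as np
import librosa
from tensorflow import keras

def extract_features(path):
    y, sr = librosa.load(path, duration=30)  # GTZAN clips are 30 s
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
    # Summarize each time-varying feature by its mean over frames.
    return np.hstack([mfcc.mean(axis=1), centroid.mean(), rolloff.mean()])

def build_ann(n_features, n_genres=5):
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.3),  # dropout rate is an assumption
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(n_genres, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model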
Related papers
- Music Genre Classification using Large Language Models [50.750620612351284]
This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs) for music genre classification.
The proposed approach splits audio signals into 20 ms chunks and processes them through convolutional feature encoders.
During inference, predictions on individual chunks are aggregated for a final genre classification.
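A minimal sketch of that chunk-and-aggregate inference follows; classify_chunk is a hypothetical stand-in for the paper's convolutional feature encoder plus LLM head, and mean-pooling of probabilities is one plausible aggregation rule.

# Split the waveform into short chunks, classify each, average the
# per-chunk probabilities, and pick the top genre.
import numpy as np

def classify_track(waveform, sr, classify_chunk, chunk_ms=20):
    chunk_len = int(sr * chunk_ms / 1000)
    n_chunks = len(waveform) // chunk_len
    probs = [classify_chunk(waveform[i * chunk_len:(i + 1) * chunk_len])
             for i in range(n_chunks)]
    return np.mean(probs, axis=0).argmax()  # aggregate over chunks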
arXiv Detail & Related papers (2024-10-10T19:17:56Z)
- Toward a More Complete OMR Solution [49.74172035862698]
Optical music recognition (OMR) aims to convert music notation into digital formats.
One approach to tackle OMR is through a multi-stage pipeline, where the system first detects visual music notation elements in the image.
First, we introduce a music object detector based on YOLOv8, which improves detection performance.
Second, we introduce a supervised training pipeline that completes the notation assembly stage based on detection output.
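The detection stage could look like the sketch below, which uses the ultralytics YOLOv8 API; the generic checkpoint name, image path, and printed output are illustrative stand-ins for the paper's fine-tuned music-object detector and its assembly stage.

# Run a YOLOv8 detector over a score image and list detected symbols
# with their bounding boxes (inputs to a later notation-assembly step).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # stand-in for a detector trained on notation
results = model("score_page.png")  # hypothetical score image
for box in results[0].boxes:
    cls_id = int(box.cls)                   # detected symbol class
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # bounding box coordinates
    print(results[0].names[cls_id], (x1, y1, x2, y2))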
arXiv Detail & Related papers (2024-08-31T01:09:12Z)
- Music Genre Classification: Training an AI model [0.0]
Music genre classification is an area that utilizes machine learning models and techniques for the processing of audio signals.
In this research I explore various machine learning algorithms for the purpose of music genre classification, using features extracted from audio signals.
I aim to assess the robustness of machine learning models for genre classification and to compare their results.
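A sketch of this kind of comparison, assuming scikit-learn classifiers over a precomputed feature matrix X and genre labels y; the specific algorithms and hyperparameters are illustrative choices, not the paper's.

# Compare several classifiers on the same features via 5-fold CV.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

def compare_models(X, y):
    models = {
        "logreg": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(n_neighbors=5),
        "rf": RandomForestClassifier(n_estimators=300),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)  # CV accuracy
        print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")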
arXiv Detail & Related papers (2024-05-23T23:07:01Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
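As a toy illustration of the bar-alignment idea (not the paper's actual tokenizer), the snippet below interleaves ABC-notation tracks measure by measure so that corresponding bars stay adjacent in the token stream; the "<bar>" separator is an arbitrary stand-in.

# Interleave multi-track ABC strings bar by bar to keep measures aligned.
def interleave_by_measure(tracks):
    bars_per_track = [t.strip("|").split("|") for t in tracks]
    merged = []
    for measures in zip(*bars_per_track):   # walk all tracks bar by bar
        merged.append("<bar>".join(m.strip() for m in measures))
    return "|".join(merged)

print(interleave_by_measure(["C D E F|G A B c", "C,2 E,2|G,2 C2"]))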
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Music Genre Classification: A Comparative Analysis of CNN and XGBoost Approaches with Mel-frequency cepstral coefficients and Mel Spectrograms [0.0]
This study investigates the performance of three models: a proposed convolutional neural network (CNN), the VGG16 with fully connected layers (FC), and an eXtreme Gradient Boosting (XGBoost) approach on different features.
The results show that the MFCC XGBoost model outperformed the others. Furthermore, applying data segmentation in the data preprocessing phase can significantly enhance the performance of the CNNs.
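A hedged sketch of the MFCC-plus-XGBoost setup, including the segmentation step the study reports as helpful: each 30 s clip is cut into shorter segments that each become a training example. Segment counts and model parameters are assumptions.

# Segment each clip, extract MFCC summaries per segment, feed to XGBoost.
import numpy as np
import librosa
from xgboost import XGBClassifier

def segment_features(path, n_segments=10, n_mfcc=20):
    y, sr = librosa.load(path, duration=30)
    seg_len = len(y) // n_segments
    feats = []
    for i in range(n_segments):
        seg = y[i * seg_len:(i + 1) * seg_len]
        mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=n_mfcc)
        feats.append(mfcc.mean(axis=1))  # one feature vector per segment
    return np.array(feats)

clf = XGBClassifier(n_estimators=500, learning_rate=0.05)
# clf.fit(X_train, y_train) on stacked segment features and their labels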
arXiv Detail & Related papers (2024-01-09T01:50:31Z)
- Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms [4.354842354272412]
The limitations of manual genre classification have highlighted the need for a more advanced system.
Traditional machine learning techniques have shown potential in genre classification, but fail to capture the full complexity of music data.
This study proposes a novel approach using visual spectrograms as input, together with a hybrid model that combines the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU).
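One plausible reading of such a hybrid is sketched below in PyTorch: a ResNet backbone turns the spectrogram into a feature map that a bidirectional GRU reads as a temporal sequence. The use of ResNet-18 and the specific dimensions are assumptions, not the paper's architecture.

# ResNet feature extractor followed by a bidirectional GRU over time.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ResNetBiGRU(nn.Module):
    def __init__(self, n_genres=10, hidden=128):
        super().__init__()
        base = resnet18(weights=None)
        base.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)      # 1-channel spectrogram input
        self.backbone = nn.Sequential(*list(base.children())[:-2])
        self.gru = nn.GRU(512, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_genres)

    def forward(self, spec):                    # spec: (B, 1, freq, time)
        fmap = self.backbone(spec)              # (B, 512, F', T')
        seq = fmap.mean(dim=2).transpose(1, 2)  # pool freq -> (B, T', 512)
        out, _ = self.gru(seq)
        return self.head(out[:, -1])            # classify from final step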
arXiv Detail & Related papers (2023-07-20T11:10:06Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
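A common way such unified protocols evaluate frozen representations is linear probing, sketched below; whether MARBLE's protocol matches this exactly is an assumption, and encoder stands in for any open-sourced pre-trained model under evaluation.

# Freeze the encoder, extract embeddings, train a light probe per task.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def probe_task(encoder, train_audio, y_train, test_audio, y_test):
    X_train = np.stack([encoder(a) for a in train_audio])  # frozen features
    X_test = np.stack([encoder(a) for a in test_audio])
    clf = LogisticRegression(max_iter=2000).fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))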
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with arbitrary source-target track combinations.
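A toy illustration of a GETScore-like layout follows, with tracks stacked vertically and time advancing horizontally; the token values and pad convention are arbitrary stand-ins, not the paper's vocabulary.

# Build a (tracks x time) grid of note tokens with a pad token elsewhere.
import numpy as np

PAD = 0

def build_score_grid(tracks, n_steps):
    # tracks: list of {time_step: token_id} dicts, one per track
    grid = np.full((len(tracks), n_steps), PAD, dtype=np.int64)
    for row, track in enumerate(tracks):
        for t, tok in track.items():
            grid[row, t] = tok              # note token at (track, time)
    return grid

grid = build_score_grid([{0: 5, 4: 7}, {0: 12, 2: 9}], n_steps=8)
print(grid)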
arXiv Detail & Related papers (2023-05-18T09:53:23Z)
- MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification [1.8275108630751844]
Imbalanced music genre classification is a crucial task in the Music Information Retrieval (MIR) field.
Most of the existing models are designed for class-balanced music datasets.
We propose a novel mechanism named Multi-instance Attention (MATT) to boost the performance for identifying tail classes.
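A hedged sketch of multiple-instance attention pooling in the spirit of MATT: each clip yields a bag of per-segment embeddings, and learned attention weights decide which instances drive the genre prediction. Dimensions and the two-layer scoring network are assumptions.

# Attention-weighted pooling over a bag of instance embeddings.
import torch
import torch.nn as nn

class InstanceAttentionPool(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(),
                                   nn.Linear(64, 1))

    def forward(self, bag):                        # bag: (n_instances, dim)
        w = torch.softmax(self.score(bag), dim=0)  # weight per instance
        return (w * bag).sum(dim=0)                # pooled bag embedding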
arXiv Detail & Related papers (2022-09-09T03:52:44Z)
- A Study on Broadcast Networks for Music Genre Classification [0.0]
We study broadcast-based neural networks, aiming to improve localization and generalizability with a small set of parameters.
Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.
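A sketch of a broadcast-style residual block, in the spirit of BC-ResNet-like designs: frequency-wise features are collapsed, processed temporally, and broadcast back across the frequency axis. All details here are assumptions, not the paper's architecture.

# Broadcast residual block: 2D freq path, 1D temporal path, broadcast add.
import torch
import torch.nn as nn

class BroadcastBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.freq_dw = nn.Conv2d(channels, channels, kernel_size=(3, 1),
                                 padding=(1, 0), groups=channels)
        self.temp_dw = nn.Conv1d(channels, channels, kernel_size=3,
                                 padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):          # x: (batch, channels, freq, time)
        y = self.freq_dw(x)        # frequency-wise depthwise conv
        z = y.mean(dim=2)          # collapse frequency -> (B, C, T)
        z = self.temp_dw(z)        # temporal depthwise conv
        z = z.unsqueeze(2)         # (B, C, 1, T), ready to broadcast
        return self.pw(x + y + z)  # broadcast over freq via residual add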
arXiv Detail & Related papers (2022-08-25T13:36:43Z)
- Let's Play Music: Audio-driven Performance Video Generation [58.77609661515749]
We propose a new task named Audio-driven Performance Video Generation (APVG).
APVG aims to synthesize the video of a person playing a certain instrument guided by a given music audio clip.
arXiv Detail & Related papers (2020-11-05T03:13:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.