Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms
- URL: http://arxiv.org/abs/2307.10773v1
- Date: Thu, 20 Jul 2023 11:10:06 GMT
- Title: Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms
- Authors: Junfei Zhang
- Abstract summary: The limitations of manual genre classification have highlighted the need for a more advanced system.
Traditional machine learning techniques have shown potential in genre classification but fail to capture the full complexity of music data.
This study proposes a novel approach using visual spectrograms as input and a hybrid model that combines the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU).
- Score: 4.354842354272412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music recommendation systems have emerged as a vital component for enhancing user experience and satisfaction in music streaming services, which dominate music consumption. The key challenge in improving these recommender systems lies in comprehending the complexity of music data, specifically the underpinning task of music genre classification. The limitations of manual genre classification have highlighted the need for a more advanced system, namely Automatic Music Genre Classification (AMGC). While traditional machine learning techniques have shown potential in genre classification, they rely heavily on manually engineered features and feature selection, failing to capture the full complexity of music data. Deep learning architectures such as the traditional Convolutional Neural Network (CNN), on the other hand, are effective at capturing spatial hierarchies but struggle to capture the temporal dynamics inherent in music data. To address these challenges, this study proposes a novel approach that uses visual spectrograms as input and a hybrid model combining the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU). The model is designed to provide a more comprehensive analysis of music data, offering the potential to improve music recommender systems through more accurate genre classification.
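The abstract names the architecture but gives no implementation details. The following is a minimal PyTorch sketch of the described hybrid, assuming a single-channel mel-spectrogram input, a ResNet-18 backbone, and a 10-genre output; all of these choices are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the hybrid in the abstract: a ResNet extracts spatial
# features from the spectrogram "image", and a bidirectional GRU models the
# temporal sequence of those features. The ResNet-18 backbone, hidden size,
# and 10-genre head are assumptions, not the paper's reported configuration.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class ResNetBiGRU(nn.Module):
    def __init__(self, n_genres: int = 10, hidden: int = 128):
        super().__init__()
        backbone = resnet18(weights=None)
        # Accept a single-channel spectrogram instead of 3-channel RGB.
        backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                   padding=3, bias=False)
        # Drop the average pool and fc head; keep the conv feature maps.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])
        self.gru = nn.GRU(input_size=512, hidden_size=hidden,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_genres)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, n_mels, time)
        feats = self.cnn(spec)            # (batch, 512, H', W')
        feats = feats.mean(dim=2)         # pool frequency axis -> (batch, 512, W')
        feats = feats.permute(0, 2, 1)    # (batch, time_steps, 512)
        out, _ = self.gru(feats)          # (batch, time_steps, 2 * hidden)
        return self.head(out[:, -1])      # genre logits from the last step


model = ResNetBiGRU()
logits = model(torch.randn(4, 1, 128, 512))  # e.g. 128 mel bands, 512 frames
print(logits.shape)                          # torch.Size([4, 10])
```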
Related papers
- Attention-guided Spectrogram Sequence Modeling with CNNs for Music Genre Classification [0.0]
We present an innovative model for classifying music genres using attention-based temporal signature modeling.
Our approach captures the most temporally significant moments within each piece, crafting a unique "signature" for genre identification.
This work bridges the gap between technical classification tasks and the nuanced, human experience of genre.
arXiv Detail & Related papers (2024-11-18T21:57:03Z)
- Audio Processing using Pattern Recognition for Music Genre Classification [0.0]
This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset.
Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres: Blues, Classical, Jazz, Hip Hop, and Country.
The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%.
arXiv Detail & Related papers (2024-10-19T05:44:05Z)
- Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery [56.77435520571752]
We explore the intrinsic relationship between music discovery and popularity bias.
We propose a domain-aware, individual fairness-based approach that addresses popularity bias in graph neural network (GNN)-based recommender systems.
Our approach uses individual fairness to reflect a ground-truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations (a toy sketch of this idea follows the entry).
arXiv Detail & Related papers (2023-08-28T14:12:25Z)
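The entry above states the intuition (similar-sounding songs should have similar representations) without a formula. One common way to encode individual fairness is a penalty that keeps embedding distances from exceeding audio distances; the sketch below illustrates only that generic idea. The hinge form, tensor names, and the audio-distance input are assumptions, not the paper's actual objective.

```python
# Toy illustration of the individual-fairness intuition: penalize pairs of
# songs whose embedding distance exceeds their audio ("ground truth")
# distance. This is NOT the paper's objective; the hinge formulation and all
# names here are assumptions for illustration.
import torch


def individual_fairness_penalty(emb: torch.Tensor,
                                audio_dist: torch.Tensor) -> torch.Tensor:
    """emb: (n, d) song embeddings; audio_dist: (n, n) pairwise audio distances."""
    emb_dist = torch.cdist(emb, emb)  # (n, n) pairwise embedding distances
    # Embeddings may be closer than the audio says, but not farther apart.
    return torch.relu(emb_dist - audio_dist).mean()


emb = torch.randn(16, 32)             # 16 songs, 32-dim embeddings
audio_dist = torch.rand(16, 16)       # stand-in for audio similarity distances
loss = individual_fairness_penalty(emb, audio_dist)
```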
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair and standard assessment of the representations of all open-source pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, totaling around 80 hours of data.
The dataset incorporates timestamped YouTube links for retrieving audio and video, along with rich metadata on instrumentation, geography, and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
- MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification [1.8275108630751844]
Imbalanced music genre classification is a crucial task in the Music Information Retrieval (MIR) field.
Most of the existing models are designed for class-balanced music datasets.
We propose a novel mechanism named Multi-instance Attention (MATT) to boost performance on tail classes (a minimal sketch of this style of attention pooling follows the entry).
arXiv Detail & Related papers (2022-09-09T03:52:44Z)
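The MATT summary does not spell out the mechanism. Attention-based multiple-instance pooling, where a track is treated as a bag of segment embeddings and a learned attention weight decides each segment's contribution, is the generic pattern the name suggests; the sketch below follows that pattern under assumed dimensions and is not MATT's exact design.

```python
# Minimal sketch of attention-based multiple-instance pooling: a learned
# attention weight decides how much each segment of a track contributes to
# the track-level genre prediction. Generic MIL-attention pattern only; the
# dimensions and layer sizes are assumptions.
import torch
import torch.nn as nn


class AttentionMILPool(nn.Module):
    def __init__(self, dim: int = 256, n_genres: int = 10):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(),
                                  nn.Linear(64, 1))
        self.head = nn.Linear(dim, n_genres)

    def forward(self, segments: torch.Tensor) -> torch.Tensor:
        # segments: (batch, n_segments, dim) per-segment embeddings
        weights = torch.softmax(self.attn(segments), dim=1)  # (batch, n_segments, 1)
        track = (weights * segments).sum(dim=1)              # (batch, dim)
        return self.head(track)                              # genre logits


pool = AttentionMILPool()
logits = pool(torch.randn(4, 12, 256))  # 12 segments per track
```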
- A Study on Broadcast Networks for Music Genre Classification [0.0]
We study broadcast-based neural networks, aiming to improve localization and generalizability under a small parameter budget.
Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.
arXiv Detail & Related papers (2022-08-25T13:36:43Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
- Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
arXiv Detail & Related papers (2020-01-06T18:35:57Z)