A Study on Broadcast Networks for Music Genre Classification
- URL: http://arxiv.org/abs/2208.12086v1
- Date: Thu, 25 Aug 2022 13:36:43 GMT
- Title: A Study on Broadcast Networks for Music Genre Classification
- Authors: Ahmed Heakl, Abdelrahman Abdelgawad, Victor Parque
- Abstract summary: We study broadcast-based neural networks, aiming to improve localization and generalizability under a small set of parameters.
Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the increased demand for music streaming/recommender services and the
recent developments of music information retrieval frameworks, Music Genre
Classification (MGC) has attracted the community's attention. However,
convolution-based approaches are known to lack the ability to efficiently
encode and localize temporal features. In this paper, we study
broadcast-based neural networks, aiming to improve localization and
generalizability under a small set of parameters (about 180k), and investigate
twelve variants of broadcast networks discussing the effect of block
configuration, pooling method, activation function, normalization mechanism,
label smoothing, channel interdependency, LSTM block inclusion, and variants of
inception schemes. Our computational experiments using relevant datasets such
as GTZAN, Extended Ballroom, HOMBURG, and Free Music Archive (FMA) show
state-of-the-art classification accuracies in Music Genre Classification. Our
approach offers insights and the potential to enable compact and generalizable
broadcast networks for music and audio classification.
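The abstract lists label smoothing among the design choices studied. The paper does not spell out its formulation here, so the NumPy sketch below shows only the standard uniform-smoothing definition, which is an assumption rather than the authors' exact recipe:

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Standard label smoothing: move eps of the probability mass
    from the true class to a uniform distribution over all classes."""
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / k

# 4-genre one-hot target for class 2
y = np.eye(4)[2]
print(smooth_labels(y, eps=0.1))  # [0.025 0.025 0.925 0.025]
```

The smoothed target still sums to 1, but the loss no longer pushes the network toward fully saturated logits, which is why it is often paired with small, regularization-hungry models like the ~180k-parameter networks studied here.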
Related papers
- Audio Processing using Pattern Recognition for Music Genre Classification
This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset.
Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres: Blues, Classical, Jazz, Hip Hop, and Country.
The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%.
(arXiv, 2024-10-19)
- Music Genre Classification using Large Language Models
This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs) for music genre classification.
The proposed approach splits audio signals into 20 ms chunks and processes them through convolutional feature encoders.
During inference, predictions on individual chunks are aggregated for a final genre classification.
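The chunking step above can be sketched as follows. Non-overlapping framing and the 16 kHz sample rate are assumptions for illustration, since the summary does not state hop size or sampling details:

```python
import numpy as np

def chunk_audio(signal, sample_rate, chunk_ms=20):
    """Split a 1-D audio signal into non-overlapping fixed-length chunks,
    dropping any trailing samples that do not fill a whole chunk."""
    chunk_len = int(sample_rate * chunk_ms / 1000)
    n_chunks = len(signal) // chunk_len
    return signal[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)

# one second of audio at 16 kHz -> fifty 20 ms chunks of 320 samples each
audio = np.random.randn(16000)
chunks = chunk_audio(audio, sample_rate=16000)
print(chunks.shape)  # (50, 320)
```

Each row of `chunks` would then be passed through a convolutional feature encoder, and the per-chunk predictions aggregated (e.g. by majority vote or mean probability) at inference time.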
(arXiv, 2024-10-10)
- Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Dense-localization Audio-Visual Events (DAVE) aims to identify time boundaries and corresponding categories for events that can be heard and seen concurrently in an untrimmed video.
Existing methods typically encode audio and visual representation separately without any explicit cross-modal alignment constraint.
We present LOCO, a Locality-aware cross-modal Correspondence learning framework for DAVE.
(arXiv, 2024-09-12)
- Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms
The limitations of manual genre classification have highlighted the need for a more advanced system.
Traditional machine learning techniques have shown potential in genre classification, but fail to capture the full complexity of music data.
This study proposes a novel approach that uses visual spectrograms as input and a hybrid model combining the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU).
(arXiv, 2023-07-20)
- MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification
Imbalanced music genre classification is a crucial task in the Music Information Retrieval (MIR) field.
Most of the existing models are designed for class-balanced music datasets.
We propose a novel mechanism named Multi-instance Attention (MATT) to boost the performance for identifying tail classes.
(arXiv, 2022-09-09)
- Temporal Saliency Query Network for Efficient Video Recognition
Video recognition is a hot-spot research topic with the explosive growth of multimedia data on the Internet and mobile devices.
Most existing methods select the salient frames without awareness of the class-specific saliency scores.
We propose a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement.
(arXiv, 2022-07-21)
- Interpreting Class Conditional GANs with Channel Awareness
We investigate how a class conditional generator unifies the synthesis of multiple classes.
To describe such a phenomenon, we propose channel awareness, which quantitatively characterizes how a single channel contributes to the final synthesis.
Our algorithm enables several novel applications with conditional GANs.
(arXiv, 2022-03-21)
- Complex Network-Based Approach for Feature Extraction and Classification of Musical Genres
This work presents a feature extraction method for the automatic classification of musical genres.
The proposed method first converts the music into sequences of musical notes and then maps the sequences to complex networks.
Topological measurements are extracted to characterize the network topology, which composes a feature vector that applies to the classification of musical genres.
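The pipeline above (notes, then a network, then topological measurements) can be illustrated with a toy sketch. The transition-graph construction and the single degree statistic below are simplifications for illustration, not the paper's actual network model or feature set:

```python
from collections import defaultdict

def note_transition_graph(notes):
    """Build a directed graph whose nodes are notes and whose edges
    connect consecutive notes in the sequence."""
    edges = defaultdict(set)
    for a, b in zip(notes, notes[1:]):
        edges[a].add(b)
        edges.setdefault(b, set())  # keep sink nodes in the graph
    return edges

def out_degree_stats(edges):
    """A toy topological measurement: mean out-degree of the network."""
    degs = [len(v) for v in edges.values()]
    return sum(degs) / len(degs)

melody = ["C4", "E4", "G4", "E4", "C4", "G4"]
g = note_transition_graph(melody)
print(sorted(g["E4"]))  # ['C4', 'G4']
print(out_degree_stats(g))
```

In the paper's setting, many such topological measurements (degree distributions, clustering, path lengths, etc.) would be concatenated into the feature vector fed to the genre classifier.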
(arXiv, 2021-10-09)
- EAN: Event Adaptive Network for Enhanced Action Recognition
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate the spatial-temporal kernels of dynamic-scale to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer.
(arXiv, 2021-07-22)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
(arXiv, 2020-12-02)
- Modeling Musical Structure with Artificial Neural Networks
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
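The interval representation mentioned above can be illustrated with a small sketch. Using MIDI pitch numbers and plain pairwise differences is an assumption for illustration, not the GAE's learned encoding:

```python
def to_intervals(midi_pitches):
    """Interval representation: differences between consecutive MIDI
    pitches. Unlike absolute pitches, this is transposition-invariant."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

# C major triad arpeggio and its transposition up a whole tone
print(to_intervals([60, 64, 67]))  # [4, 3]
print(to_intervals([62, 66, 69]))  # [4, 3]
```

Transposition invariance is the point of such a representation: the same melodic shape maps to the same interval sequence regardless of key.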
(arXiv, 2020-01-06)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.