A Study on Broadcast Networks for Music Genre Classification
- URL: http://arxiv.org/abs/2208.12086v1
- Date: Thu, 25 Aug 2022 13:36:43 GMT
- Title: A Study on Broadcast Networks for Music Genre Classification
- Authors: Ahmed Heakl, Abdelrahman Abdelgawad, Victor Parque
- Abstract summary: We study broadcast-based neural networks, aiming to improve localization and generalizability under a small set of parameters.
Our approach offers insights and the potential to enable compact and generalizable broadcast networks for music and audio classification.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the increased demand for music streaming/recommender services and the
recent developments of music information retrieval frameworks, Music Genre
Classification (MGC) has attracted the community's attention. However,
convolution-based approaches are known to lack the ability to efficiently
encode and localize temporal features. In this paper, we study
broadcast-based neural networks, aiming to improve localization and
generalizability under a small set of parameters (about 180k), and investigate
twelve variants of broadcast networks discussing the effect of block
configuration, pooling method, activation function, normalization mechanism,
label smoothing, channel interdependency, LSTM block inclusion, and variants of
inception schemes. Our computational experiments using relevant datasets such
as GTZAN, Extended Ballroom, HOMBURG, and Free Music Archive (FMA) show
state-of-the-art classification accuracies in Music Genre Classification. Our
approach offers insights and the potential to enable compact and generalizable
broadcast networks for music and audio classification.
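The abstract lists label smoothing among the design choices studied. The paper does not spell out its formulation here, so the NumPy sketch below shows only the standard uniform-smoothing definition, which is an assumption rather than the authors' exact recipe:

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Standard label smoothing: move eps of the probability mass
    from the true class to a uniform distribution over all classes."""
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / k

# 4-genre one-hot target for class 2
y = np.eye(4)[2]
print(smooth_labels(y, eps=0.1))  # [0.025 0.025 0.925 0.025]
```

The smoothed target still sums to 1, but the loss no longer pushes the network toward fully saturated logits, which is why it is often paired with small, regularization-hungry models like the ~180k-parameter networks studied here.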
Related papers
- Audio Processing using Pattern Recognition for Music Genre Classification
This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset.
Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres: Blues, Classical, Jazz, Hip Hop, and Country.
The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%.
(arXiv, 2024-10-19)
- Music Genre Classification using Large Language Models
This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs) for music genre classification.
The proposed approach splits audio signals into 20 ms chunks and processes them through convolutional feature encoders.
During inference, predictions on individual chunks are aggregated for a final genre classification.
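The chunking step above can be sketched as follows. Non-overlapping framing and the 16 kHz sample rate are assumptions for illustration, since the summary does not state hop size or sampling details:

```python
import numpy as np

def chunk_audio(signal, sample_rate, chunk_ms=20):
    """Split a 1-D audio signal into non-overlapping fixed-length chunks,
    dropping any trailing samples that do not fill a whole chunk."""
    chunk_len = int(sample_rate * chunk_ms / 1000)
    n_chunks = len(signal) // chunk_len
    return signal[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)

# one second of audio at 16 kHz -> fifty 20 ms chunks of 320 samples each
audio = np.random.randn(16000)
chunks = chunk_audio(audio, sample_rate=16000)
print(chunks.shape)  # (50, 320)
```

Each row of `chunks` would then be passed through a convolutional feature encoder, and the per-chunk predictions aggregated (e.g. by majority vote or mean probability) at inference time.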
(arXiv, 2024-10-10)
- Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Dense-localization Audio-Visual Events (DAVE) aims to identify time boundaries and corresponding categories for events that can be heard and seen concurrently in an untrimmed video.
Existing methods typically encode audio and visual representation separately without any explicit cross-modal alignment constraint.
We present LOCO, a Locality-aware cross-modal Correspondence learning framework for DAVE.
(arXiv, 2024-09-12)
- Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms
The limitations of manual genre classification have highlighted the need for a more advanced system.
Traditional machine learning techniques have shown potential in genre classification, but fail to capture the full complexity of music data.
This study proposes a novel approach that uses visual spectrograms as input and a hybrid model combining the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU).
(arXiv, 2023-07-20)
- MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification
Imbalanced music genre classification is a crucial task in the Music Information Retrieval (MIR) field.
Most of the existing models are designed for class-balanced music datasets.
We propose a novel mechanism named Multi-instance Attention (MATT) to boost the performance for identifying tail classes.
(arXiv, 2022-09-09)
- Temporal Saliency Query Network for Efficient Video Recognition
Video recognition is a hot-spot research topic with the explosive growth of multimedia data on the Internet and mobile devices.
Most existing methods select the salient frames without awareness of the class-specific saliency scores.
We propose a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement.
(arXiv, 2022-07-21)
- Interpreting Class Conditional GANs with Channel Awareness
We investigate how a class conditional generator unifies the synthesis of multiple classes.
To describe such a phenomenon, we propose channel awareness, which quantitatively characterizes how a single channel contributes to the final synthesis.
Our algorithm enables several novel applications with conditional GANs.
(arXiv, 2022-03-21)
- Complex Network-Based Approach for Feature Extraction and Classification of Musical Genres
This work presents a feature extraction method for the automatic classification of musical genres.
The proposed method first converts the music into sequences of musical notes and then maps the sequences to complex networks.
Topological measurements are extracted to characterize the network topology, which composes a feature vector that applies to the classification of musical genres.
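The pipeline above (notes, then a network, then topological measurements) can be illustrated with a toy sketch. The transition-graph construction and the single degree statistic below are simplifications for illustration, not the paper's actual network model or feature set:

```python
from collections import defaultdict

def note_transition_graph(notes):
    """Build a directed graph whose nodes are notes and whose edges
    connect consecutive notes in the sequence."""
    edges = defaultdict(set)
    for a, b in zip(notes, notes[1:]):
        edges[a].add(b)
        edges.setdefault(b, set())  # keep sink nodes in the graph
    return edges

def out_degree_stats(edges):
    """A toy topological measurement: mean out-degree of the network."""
    degs = [len(v) for v in edges.values()]
    return sum(degs) / len(degs)

melody = ["C4", "E4", "G4", "E4", "C4", "G4"]
g = note_transition_graph(melody)
print(sorted(g["E4"]))  # ['C4', 'G4']
print(out_degree_stats(g))
```

In the paper's setting, many such topological measurements (degree distributions, clustering, path lengths, etc.) would be concatenated into the feature vector fed to the genre classifier.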
(arXiv, 2021-10-09)
- EAN: Event Adaptive Network for Enhanced Action Recognition
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate the spatial-temporal kernels of dynamic-scale to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer.
(arXiv, 2021-07-22)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
(arXiv, 2020-12-02)
- Modeling Musical Structure with Artificial Neural Networks
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
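The interval representation mentioned above can be illustrated with a small sketch. Using MIDI pitch numbers and plain pairwise differences is an assumption for illustration, not the GAE's learned encoding:

```python
def to_intervals(midi_pitches):
    """Interval representation: differences between consecutive MIDI
    pitches. Unlike absolute pitches, this is transposition-invariant."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

# C major triad arpeggio and its transposition up a whole tone
print(to_intervals([60, 64, 67]))  # [4, 3]
print(to_intervals([62, 66, 69]))  # [4, 3]
```

Transposition invariance is the point of such a representation: the same melodic shape maps to the same interval sequence regardless of key.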
(arXiv, 2020-01-06)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.