Symbolic Music Structure Analysis with Graph Representations and
Changepoint Detection Methods
- URL: http://arxiv.org/abs/2303.13881v1
- Date: Fri, 24 Mar 2023 09:45:11 GMT
- Title: Symbolic Music Structure Analysis with Graph Representations and
Changepoint Detection Methods
- Authors: Carlos Hernandez-Olivan, Sonia Rubio Llamas, Jose R. Beltran
- Abstract summary: We propose three methods to segment symbolic music by its form or structure: Norm, G-PELT and G-Window.
We have found that encoding symbolic music with graph representations and computing the novelty of Adjacency Matrices represent the structure of symbolic music pieces well.
- Score: 1.1677169430445211
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Music Structure Analysis is an open research task in Music Information
Retrieval (MIR). In the past, several works have attempted to segment music in
both the audio and symbolic domains; however, the identification and
segmentation of music structure at different levels remains an open
research problem in this area. In this work we propose three methods, two of
which are novel graph-based algorithms that aim to segment symbolic music by
its form or structure: Norm, G-PELT and G-Window. We performed an ablation
study with two public datasets that have different forms or structures in order
to compare such methods varying their parameter values and comparing the
performance across different music styles. We have found that encoding
symbolic music with graph representations and computing the novelty of the
Adjacency Matrices obtained from the graphs represents the structure of symbolic
music pieces well without the need to extract features from them. We are able to
detect the boundaries with an online unsupervised changepoint detection method
with an F_1 of 0.5640 for a 1-bar tolerance in one of the public datasets that
we used for testing our methods. We also provide the performance results of the
algorithms at three levels of structure (high, medium, and low) to show how the
parameters of the proposed methods have to be adjusted depending on the level.
We added the best-performing method, with its parameters for each structure
level, to musicaiz, an open-source Python package, to facilitate the
reproducibility and usability of this work. We hope that these methods can be
used to improve other MIR tasks such as music generation with structure, music
classification, or key change detection.
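The boundary-detection idea the abstract describes can be illustrated with a minimal, self-contained sketch: encode a piece as a bar-level adjacency (self-similarity) matrix, correlate a Foote-style checkerboard kernel along its diagonal to obtain a novelty curve, and pick peaks as candidate section boundaries. This is not the paper's Norm, G-PELT, or G-Window implementation (those ship with musicaiz); the kernel size and peak threshold here are arbitrary illustrative choices.

```python
import numpy as np

def checkerboard_kernel(half: int) -> np.ndarray:
    """Foote-style checkerboard kernel of size (2*half, 2*half)."""
    sign = np.ones(2 * half)
    sign[:half] = -1
    return np.outer(sign, sign)

def novelty_curve(adjacency: np.ndarray, half: int = 4) -> np.ndarray:
    """Correlate the checkerboard kernel along the main diagonal."""
    n = adjacency.shape[0]
    kernel = checkerboard_kernel(half)
    padded = np.pad(adjacency, half)  # zero-pad so edges are defined
    return np.array([np.sum(kernel * padded[i:i + 2 * half, i:i + 2 * half])
                     for i in range(n)])

def pick_boundaries(novelty: np.ndarray, threshold: float = 0.5) -> list:
    """Local maxima above a fraction of the global max are boundaries."""
    thr = threshold * novelty.max()
    return [i for i in range(1, len(novelty) - 1)
            if novelty[i] > thr
            and novelty[i] >= novelty[i - 1]
            and novelty[i] > novelty[i + 1]]

# Toy piece: two homogeneous 8-bar sections, so bars within a section
# are all similar to each other and dissimilar to the other section.
A = np.zeros((16, 16))
A[:8, :8] = 1.0
A[8:, 8:] = 1.0
nov = novelty_curve(A)
print(pick_boundaries(nov))  # → [8], the section change at bar 8
```

Real pieces would of course produce a noisier adjacency matrix; the paper's point is that an online changepoint detector (PELT-style) on such a novelty signal finds the boundaries without handcrafted features.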
Related papers
- Exploring Tokenization Methods for Multitrack Sheet Music Generation [48.8206920811097]
This study explores the tokenization of multitrack sheet music in ABC notation.
In terms of both computational efficiency and musicality, experimental results show that bar-stream patching performs best overall.
arXiv Detail & Related papers (2024-10-23T06:19:48Z)
- Toward a More Complete OMR Solution [49.74172035862698]
Optical music recognition aims to convert music notation into digital formats.
One approach to tackle OMR is through a multi-stage pipeline, where the system first detects visual music notation elements in the image.
First, we introduce a music object detector based on YOLOv8, which improves detection performance.
Second, we introduce a supervised training pipeline that completes the notation assembly stage based on detection output.
arXiv Detail & Related papers (2024-08-31T01:09:12Z)
- Music Genre Classification: A Comparative Analysis of CNN and XGBoost Approaches with Mel-frequency cepstral coefficients and Mel Spectrograms [0.0]
This study investigates the performances of three models: a proposed convolutional neural network (CNN), the VGG16 with fully connected layers (FC), and an eXtreme Gradient Boosting (XGBoost) approach on different features.
The results show that the MFCC XGBoost model outperformed the others. Furthermore, applying data segmentation in the data preprocessing phase can significantly enhance the performance of the CNNs.
arXiv Detail & Related papers (2024-01-09T01:50:31Z)
- GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with arbitrary source-target track combinations.
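The 2D layout GETScore describes can be pictured as a (tracks × time) grid of integer tokens, with tracks stacked vertically and time running horizontally; masking one row then corresponds to a source-target track-generation task. The token IDs and pad value below are made up for illustration and are not the paper's actual vocabulary.

```python
import numpy as np

PAD = 0  # hypothetical padding token for empty positions

# Hypothetical token IDs for two tracks over 8 time steps:
# tracks stacked vertically, time progressing horizontally.
melody = [5, 7, PAD, 9, 9, PAD, 12, 5]
bass   = [2, PAD, 2, PAD, 3, 3, PAD, 2]
score = np.array([melody, bass])
print(score.shape)  # (2, 8): (tracks, time steps)

# "Generate the bass given the melody" becomes: mask the bass row
# and let the model fill in the masked positions.
target = score.copy()
target[1, :] = PAD
```

The appeal of this layout is that any subset of rows can serve as the source and any other subset as the target, which is what enables arbitrary track combinations.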
arXiv Detail & Related papers (2023-05-18T09:53:23Z)
- Learning Hierarchical Metrical Structure Beyond Measures [3.7294116330265394]
Hierarchical structure annotations are helpful for music information retrieval and computer musicology.
We propose a data-driven approach to automatically extract hierarchical metrical structures from scores.
We show by experiments that the proposed method performs better than the rule-based approach under different orchestration settings.
arXiv Detail & Related papers (2022-09-21T11:08:52Z)
- Cadence Detection in Symbolic Classical Music using Graph Neural Networks [7.817685358710508]
We present a graph representation of symbolic scores as an intermediate means to solve the cadence detection task.
We approach cadence detection as an imbalanced node classification problem using a Graph Convolutional Network.
Our experiments suggest that graph convolution can learn non-local features that assist in cadence detection, freeing us from the need of having to devise specialized features that encode non-local context.
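A score-as-graph encoding of the kind used in this line of work can be sketched minimally: each note becomes a node, and edges connect notes that overlap in time (sounding together) or directly follow one another. The note tuples and edge rules below are illustrative assumptions, not the exact construction from the paper.

```python
# Each note: (onset, duration, midi_pitch) — an illustrative encoding.
notes = [(0, 1, 60), (0, 1, 64), (1, 1, 62), (2, 2, 65), (2, 1, 60)]

def build_edges(notes):
    """Connect notes that overlap in time (harmonic edges) or are
    directly consecutive (melodic edges)."""
    edges = set()
    for i, (on_i, dur_i, _) in enumerate(notes):
        for j, (on_j, dur_j, _) in enumerate(notes):
            if i >= j:
                continue
            overlap = on_i < on_j + dur_j and on_j < on_i + dur_i
            consecutive = on_i + dur_i == on_j or on_j + dur_j == on_i
            if overlap or consecutive:
                edges.add((i, j))
    return sorted(edges)

print(build_edges(notes))  # → [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
```

A graph convolutional network would then propagate features over these edges, which is how non-local context reaches each note without handcrafted features.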
arXiv Detail & Related papers (2022-08-31T12:39:57Z)
- Unsupervised Learning of Deep Features for Music Segmentation [8.528384027684192]
Music segmentation is a problem of identifying boundaries between, and labeling, distinct music segments.
The performance of a range of music segmentation algorithms has been dependent on the audio features chosen to represent the audio.
In this work, unsupervised training of deep feature embeddings using convolutional neural networks (CNNs) is explored for music segmentation.
arXiv Detail & Related papers (2021-08-30T01:55:44Z)
- Structure-Aware Audio-to-Score Alignment using Progressively Dilated Convolutional Neural Networks [8.669338893753885]
The identification of structural differences between a music performance and the score is a challenging yet integral step of audio-to-score alignment.
We present a novel method to detect such differences using progressively dilated convolutional neural networks.
arXiv Detail & Related papers (2021-01-31T05:14:58Z)
- A framework to compare music generative models using automatic evaluation metrics extended to rhythm [69.2737664640826]
This paper builds on a framework proposed in previous research that did not consider rhythm, making a series of design decisions and then adding rhythm support to evaluate the performance of two RNN memory cells in the creation of monophonic music.
The model handles music transposition, and the framework evaluates the quality of the generated pieces using automatic quantitative, geometry-based metrics that have also been extended with rhythm support.
arXiv Detail & Related papers (2021-01-19T15:04:46Z)
- dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.