Perception-Inspired Graph Convolution for Music Understanding Tasks
- URL: http://arxiv.org/abs/2405.09224v1
- Date: Wed, 15 May 2024 10:04:44 GMT
- Title: Perception-Inspired Graph Convolution for Music Understanding Tasks
- Authors: Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer,
- Abstract summary: We propose a new graph convolutional block, MusGConv, specifically designed for the efficient processing of musical score data.
We evaluate our approach on four different musical understanding problems.
- Score: 3.5570874721859016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new graph convolutional block, called MusGConv, specifically designed for the efficient processing of musical score data and motivated by general perceptual principles. It focuses on two fundamental dimensions of music, pitch and rhythm, and considers both relative and absolute representations of these components. We evaluate our approach on four different musical understanding problems: monophonic voice separation, harmonic analysis, cadence detection, and composer identification which, in abstract terms, translate to different graph learning problems, namely, node classification, link prediction, and graph classification. Our experiments demonstrate that MusGConv improves the performance on three of the aforementioned tasks while being conceptually very simple and efficient. We interpret this as evidence that it is beneficial to include perception-informed processing of fundamental musical concepts when developing graph network applications on musical score data.
Related papers
- Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach [49.2787113554916]
Estimating music piece difficulty is important for organizing educational music collections.
Our work employs explainable descriptors for difficulty estimation in symbolic music representations.
Our approach, evaluated in piano repertoire categorized in 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7.
arXiv Detail & Related papers (2024-08-01T11:23:42Z) - GraphMuse: A Library for Symbolic Music Graph Processing [3.997809845676912]
GraphMuse is a graph processing framework and library that facilitates efficient music graph processing.
New neighbor sampling technique specifically targeted toward meaningful behavior in musical scores is central to our contribution.
Our hope is that GraphMuse will lead to a boost in, and standardization of, symbolic music processing based on graph representations.
arXiv Detail & Related papers (2024-07-17T15:54:09Z) - Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving [5.572472212662453]
This paper approaches the problem of separating the notes from a quantized symbolic music piece (e.g., a MIDI file) into multiple voices and staves.
We propose an end-to-end system based on graph neural networks that notes that belong to the same chord and connect them with edges if they are part of a voice.
arXiv Detail & Related papers (2024-07-15T14:36:13Z) - Visual Commonsense based Heterogeneous Graph Contrastive Learning [79.22206720896664]
We propose a heterogeneous graph contrastive learning method to better finish the visual reasoning task.
Our method is designed as a plug-and-play way, so that it can be quickly and easily combined with a wide range of representative methods.
arXiv Detail & Related papers (2023-11-11T12:01:18Z) - Cadence Detection in Symbolic Classical Music using Graph Neural
Networks [7.817685358710508]
We present a graph representation of symbolic scores as an intermediate means to solve the cadence detection task.
We approach cadence detection as an imbalanced node classification problem using a Graph Convolutional Network.
Our experiments suggest that graph convolution can learn non-local features that assist in cadence detection, freeing us from the need of having to devise specialized features that encode non-local context.
arXiv Detail & Related papers (2022-08-31T12:39:57Z) - Representing Videos as Discriminative Sub-graphs for Action Recognition [165.54738402505194]
We introduce a new design of sub-graphs to represent and encode theriminative patterns of each action in the videos.
We present MUlti-scale Sub-Earn Ling (MUSLE) framework that novelly builds space-time graphs and clusters into compact sub-graphs on each scale.
arXiv Detail & Related papers (2022-01-11T16:15:25Z) - Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods.
We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z) - Tracing Back Music Emotion Predictions to Sound Sources and Intuitive
Perceptual Qualities [6.832341432995627]
Music emotion recognition is an important task in MIR (Music Information Retrieval) research.
One important step towards better models would be to understand what a model is actually learning from the data.
We show how to derive explanations of model predictions in terms of spectrogram image segments that connect to the high-level emotion prediction.
arXiv Detail & Related papers (2021-06-14T22:49:19Z) - Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z) - Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with
Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks Artist Identification, Music Genre Classification and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.