Deriving Representative Structure from Music Corpora
- URL: http://arxiv.org/abs/2502.15849v2
- Date: Sun, 30 Mar 2025 22:09:45 GMT
- Title: Deriving Representative Structure from Music Corpora
- Authors: Ilana Shapiro, Ruanqianqian Huang, Zachary Novack, Cheng-i Wang, Hao-Wen Dong, Taylor Berg-Kirkpatrick, Shlomo Dubnov, Sorin Lerner,
- Abstract summary: We propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG)<n>For a single piece, the STG is a data structure that defines a hierarchy of progressively finer structural musical features and the temporal relationships between them.
- Score: 32.18458296933001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Western music is an innately hierarchical system of interacting levels of structure, from fine-grained melody to high-level form. In order to analyze music compositions holistically and at multiple granularities, we propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG). For a single piece, the STG is a data structure that defines a hierarchy of progressively finer structural musical features and the temporal relationships between them. We use the STG to enable a novel approach for deriving a representative structural summary of a music corpus, which we formalize as a dually NP-hard combinatorial optimization problem extending the Generalized Median Graph problem. Our approach first applies simulated annealing to develop a measure of structural distance between two music pieces rooted in graph isomorphism. Our approach then combines the formal guarantees of SMT solvers with nested simulated annealing over structural distances to produce a structurally sound, representative centroid STG for an entire corpus of STGs from individual pieces. To evaluate our approach, we conduct experiments verifying that structural distance accurately differentiates between music pieces, and that derived centroids accurately structurally characterize their corpora.
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z) - Learning Correlation Structures for Vision Transformers [93.22434535223587]
We introduce a new attention mechanism, dubbed structural self-attention (StructSA)
We generate attention maps by recognizing space-time structures of key-query correlations via convolution.
This effectively leverages rich structural patterns in images and videos such as scene layouts, object motion, and inter-object relations.
arXiv Detail & Related papers (2024-04-05T07:13:28Z) - Structure-informed Positional Encoding for Music Generation [0.0]
We propose a structure-informed positional encoding framework for music generation with Transformers.
We test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation.
Our methods improve the melodic and structural consistency of the generated pieces.
arXiv Detail & Related papers (2024-02-20T13:41:35Z) - Structured Multi-Track Accompaniment Arrangement via Style Prior Modelling [9.489311894706765]
In this paper, we introduce a novel system that leverages prior modelling over disentangled style factors to address these challenges.
Our key design is the use of vector quantization and a unique multi-stream Transformer to model the long-term flow of the orchestration style.
We show that our system achieves superior coherence, structure, and overall arrangement quality compared to existing baselines.
arXiv Detail & Related papers (2023-10-25T03:30:37Z) - DiffCloth: Diffusion Based Garment Synthesis and Manipulation via
Structural Cross-modal Semantic Alignment [124.57488600605822]
Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments.
We introduce DiffCloth, a diffusion-based pipeline for cross-modal garment synthesis and manipulation.
Experiments on the CM-Fashion benchmark demonstrate that DiffCloth both yields state-of-the-art garment synthesis results.
arXiv Detail & Related papers (2023-08-22T05:43:33Z) - Self-Supervised Temporal Graph learning with Temporal and Structural Intensity Alignment [53.72873672076391]
Temporal graph learning aims to generate high-quality representations for graph-based tasks with dynamic information.
We propose a self-supervised method called S2T for temporal graph learning, which extracts both temporal and structural information.
S2T achieves at most 10.13% performance improvement compared with the state-of-the-art competitors on several datasets.
arXiv Detail & Related papers (2023-02-15T06:36:04Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Music Generation with Temporal Structure Augmentation [0.0]
The proposed method augments a connectionist generation model with count-down to song conclusion and meter markers as extra input features.
An RNN architecture with LSTM cells is trained on the Nottingham folk music dataset in a supervised sequence learning setup.
Experiments show an improved prediction performance for both types of annotation.
arXiv Detail & Related papers (2020-04-21T19:19:58Z) - Continuous Melody Generation via Disentangled Short-Term Representations
and Structural Conditions [14.786601824794369]
We present a model for composing melodies given a user specified symbolic scenario combined with a previous music context.
Our model is capable of generating long melodies by regarding 8-beat note sequences as basic units, and shares consistent rhythm pattern structure with another specific song.
Results show that the music generated by our model tends to have salient repetition structures, rich motives, and stable rhythm patterns.
arXiv Detail & Related papers (2020-02-05T06:23:44Z) - Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.