Graph-based Polyphonic Multitrack Music Generation
- URL: http://arxiv.org/abs/2307.14928v1
- Date: Thu, 27 Jul 2023 15:18:50 GMT
- Title: Graph-based Polyphonic Multitrack Music Generation
- Authors: Emanuele Cosenza, Andrea Valenti, Davide Bacciu
- Abstract summary: This paper introduces a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately.
By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times.
- Score: 9.701208207491879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graphs can be leveraged to model polyphonic multitrack symbolic music, where
notes, chords and entire sections may be linked at different levels of the
musical hierarchy by tonal and rhythmic relationships. Nonetheless, there is a
lack of works that consider graph representations in the context of deep
learning systems for music generation. This paper bridges this gap by
introducing a novel graph representation for music and a deep Variational
Autoencoder that generates the structure and the content of musical graphs
separately, one after the other, with a hierarchical architecture that matches
the structural priors of music. By separating the structure and content of
musical graphs, it is possible to condition generation by specifying which
instruments are played at certain times. This opens the door to a new form of
human-computer interaction in the context of music co-creation. After training
the model on existing MIDI datasets, the experiments show that the model is
able to generate appealing short and long musical sequences and to
realistically interpolate between them, producing music that is tonally and
rhythmically consistent. Finally, the visualization of the embeddings shows
that the model is able to organize its latent space in accordance with known
musical concepts.
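To make the separation of structure and content concrete, here is a minimal, hypothetical sketch (not the authors' code) of a musical graph whose structure (which tracks sound at which timesteps) is generated first and whose content (the actual pitches) is filled in afterwards:

```python
from dataclasses import dataclass, field
from random import choice, random
from typing import List, Optional, Tuple


@dataclass
class NoteNode:
    track: int             # instrument / track index
    timestep: int          # position on the rhythmic grid
    pitch: Optional[int]   # MIDI pitch; None until content is generated


@dataclass
class MusicGraph:
    nodes: List[NoteNode] = field(default_factory=list)
    edges: List[Tuple[int, int]] = field(default_factory=list)  # node index pairs


def generate_structure(n_tracks: int, n_steps: int, density: float = 0.5) -> MusicGraph:
    """Stage 1: decide which tracks sound at which timesteps (no pitches yet)."""
    g = MusicGraph()
    for track in range(n_tracks):
        for step in range(n_steps):
            if random() < density:                 # placeholder for the structure decoder
                g.nodes.append(NoteNode(track, step, pitch=None))
    # link temporally adjacent notes within the same track
    for i, a in enumerate(g.nodes):
        for j, b in enumerate(g.nodes):
            if a.track == b.track and b.timestep == a.timestep + 1:
                g.edges.append((i, j))
    return g


def fill_content(g: MusicGraph, scale=(60, 62, 64, 65, 67, 69, 71)) -> MusicGraph:
    """Stage 2: assign pitches to the previously generated structure."""
    for node in g.nodes:
        node.pitch = choice(scale)                 # placeholder for the content decoder
    return g


graph = fill_content(generate_structure(n_tracks=3, n_steps=8))
print(len(graph.nodes), "notes,", len(graph.edges), "edges")
```

In the paper both stages are learned decoders of the hierarchical VAE; the random draws above are only stand-ins. The conditioning described in the abstract corresponds to fixing the output of the structure stage by hand and letting the model generate only the content.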
Related papers
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Combinatorial music generation model with song structure graph analysis [18.71152526968065]
We construct a graph that uses information such as note sequence and instrument as node features, while the correlation between note sequences acts as the edge feature.
We trained a Graph Neural Network to obtain node representation in the graph, then we use node representation as input of Unet to generate CONLON pianoroll image latent.
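As a rough illustration of this pipeline, the following hand-rolled sketch (assumed details, not the paper's implementation) builds segment-level node features, a weighted adjacency matrix encoding correlations between note sequences, and one message-passing step to obtain node representations for a downstream pianoroll decoder:

```python
import numpy as np

rng = np.random.default_rng(0)

n_segments, feat_dim = 6, 16
# node features: one row per note-sequence segment (e.g. note statistics + instrument id)
node_feats = rng.normal(size=(n_segments, feat_dim))

# weighted adjacency: self-loops plus correlations between note sequences (edge features)
adj = np.eye(n_segments)
adj[0, 1] = adj[1, 0] = 0.8
adj[2, 3] = adj[3, 2] = 0.6

# one graph-convolution-like step: row-normalise, aggregate neighbours, project, squash
deg = adj.sum(axis=1, keepdims=True)
aggregated = (adj / deg) @ node_feats
weights = rng.normal(scale=0.1, size=(feat_dim, feat_dim))
node_repr = np.tanh(aggregated @ weights)

print(node_repr.shape)   # (6, 16): node representations for the downstream decoder
```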
arXiv Detail & Related papers (2023-12-24T04:09:30Z)
- GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
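A toy illustration of this kind of 2D layout (the token names and padding symbol are assumptions, not the official GETScore vocabulary):

```python
PAD = "<pad>"   # assumed padding token

def to_score_grid(events, n_tracks, n_steps):
    """events: (track, timestep, note_token) triples -> track-by-time token grid."""
    grid = [[PAD] * n_steps for _ in range(n_tracks)]
    for track, step, token in events:
        grid[track][step] = token
    return grid

events = [(0, 0, "C4"), (0, 2, "E4"), (1, 0, "C2"), (1, 2, "G2")]
for row in to_score_grid(events, n_tracks=2, n_steps=4):
    print(" ".join(f"{token:>6}" for token in row))
```

Each row is one track and each column one time step, so selecting arbitrary source and target tracks amounts to masking rows of the grid.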
arXiv Detail & Related papers (2023-05-18T09:53:23Z)
- Structure-Enhanced Pop Music Generation via Harmony-Aware Learning [20.06867705303102]
We propose to leverage harmony-aware learning for structure-enhanced pop music generation.
Results of subjective and objective evaluations demonstrate that Harmony-Aware Hierarchical Music Transformer (HAT) significantly improves the quality of generated music.
arXiv Detail & Related papers (2021-09-14T05:04:13Z)
- Controllable deep melody generation via hierarchical music structure representation [14.891975420982511]
MusicFrameworks is a hierarchical music structure representation and a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- Music Generation with Temporal Structure Augmentation [0.0]
The proposed method augments a connectionist generation model with count-down to song conclusion and meter markers as extra input features.
An RNN architecture with LSTM cells is trained on the Nottingham folk music dataset in a supervised sequence learning setup.
Experiments show an improved prediction performance for both types of annotation.
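A minimal sketch, assuming a per-timestep feature matrix, of how such count-down and meter-marker annotations might be appended to the model input (illustrative only, not the paper's code):

```python
import numpy as np

def augment_with_structure(note_feats: np.ndarray, beats_per_bar: int = 4) -> np.ndarray:
    """Append a normalised count-down to the song's end and a within-bar position marker."""
    n_steps = note_feats.shape[0]
    countdown = np.arange(n_steps - 1, -1, -1, dtype=float) / max(n_steps - 1, 1)
    meter = (np.arange(n_steps) % beats_per_bar) / beats_per_bar
    return np.column_stack([note_feats, countdown, meter])

notes = np.zeros((8, 3))                       # toy per-timestep note features
print(augment_with_structure(notes).shape)     # (8, 5): original features + 2 extra inputs
```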
arXiv Detail & Related papers (2020-04-21T19:19:58Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
- Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
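The interval-based view can be illustrated in a few lines of Python (a generic sketch, not the GAE training code): representing a melody by the differences between consecutive pitches makes the representation invariant to transposition.

```python
def to_intervals(pitches):
    """Replace absolute MIDI pitches with the interval to the previous note."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

melody = [60, 62, 64, 67, 65, 64]                          # C4 D4 E4 G4 F4 E4
print(to_intervals(melody))                                # [2, 2, 3, -2, -1]
print(to_intervals([p + 5 for p in melody]) == to_intervals(melody))  # True: transposition-invariant
```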
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.