Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings
- URL: http://arxiv.org/abs/2303.15306v1
- Date: Fri, 24 Mar 2023 10:23:15 GMT
- Title: Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings
- Authors: Nicolas Lazzari, Andrea Poltronieri, Valentina Presutti
- Abstract summary: We present a novel music segmentation method, pitchclass2vec, based on symbolic chord annotations.
Our algorithm is based on a long short-term memory (LSTM) neural network and outperforms state-of-the-art techniques based on symbolic chord annotations in the field.
- Score: 0.8701566919381222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structure perception is a fundamental aspect of music cognition in humans.
Historically, the hierarchical organization of music into structures has served
as a narrative device for conveying meaning, creating expectancy, and evoking
emotions in the listener. Musical structures therefore play an essential role
in music composition, as they shape the musical discourse through which the
composer organises their ideas. In this paper, we present a novel music
segmentation method, pitchclass2vec, based on symbolic chord annotations, which
are embedded into continuous vector representations using both natural language
processing techniques and custom-made encodings. Our algorithm is based on a
long short-term memory (LSTM) neural network and outperforms state-of-the-art
techniques based on symbolic chord annotations in the field.
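A minimal sketch of the general pipeline the abstract describes, not the authors' implementation: chord labels are treated as tokens, embedded with a word2vec-style skip-gram model, and fed to a bidirectional LSTM that tags each chord with a structural segment label. The toy corpus, dimensions, and label count are illustrative assumptions.

```python
# Minimal sketch: chord labels -> word2vec-style embeddings -> LSTM tagger.
# Hypothetical toy data and hyperparameters; not the paper's implementation.
import numpy as np
import torch
import torch.nn as nn
from gensim.models import Word2Vec

# Toy corpus of symbolic chord annotations, one "sentence" per song.
chord_corpus = [
    ["C:maj", "G:maj", "A:min", "F:maj"],
    ["C:maj", "F:maj", "G:maj", "C:maj"],
]

# Learn continuous chord embeddings with skip-gram, as in NLP pipelines.
w2v = Word2Vec(sentences=chord_corpus, vector_size=32, window=2,
               min_count=1, sg=1, epochs=50)

class SegmentTagger(nn.Module):
    """Bi-LSTM labelling each chord with a segment class (e.g. verse/chorus),
    framing structure segmentation as sequence labelling."""
    def __init__(self, emb_dim=32, hidden=64, n_labels=4):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_labels)

    def forward(self, x):                 # x: (batch, seq, emb_dim)
        out, _ = self.lstm(x)
        return self.head(out)             # (batch, seq, n_labels)

seq = chord_corpus[0]
emb = torch.from_numpy(np.stack([w2v.wv[c] for c in seq])).unsqueeze(0)
logits = SegmentTagger()(emb)             # per-chord segment logits
print(logits.shape)                       # torch.Size([1, 4, 4])
```

Framing segmentation as per-chord sequence labelling is one natural way to pair NLP-style embeddings with an LSTM; the paper's actual encodings and training setup may differ.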
Related papers
- MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT [44.204383306879095]
We propose a novel symbolic music representation and Generative Adversarial Network (GAN) framework specially designed for symbolic multitrack music generation.
To build a robust multitrack music generator, we fine-tune a pre-trained MusicBERT model to serve as the discriminator and incorporate a relativistic standard loss.
arXiv Detail & Related papers (2024-09-02T03:18:56Z)
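The MMT-BERT entry above mentions a relativistic standard loss. Below is a brief sketch of that loss in its common RSGAN form, with placeholder discriminator logits rather than the paper's code: the discriminator is trained to judge real data as more realistic than generated data on average, instead of scoring each in isolation.

```python
# Relativistic standard GAN loss (a common formulation); placeholder logits,
# not the MMT-BERT implementation.
import torch
import torch.nn.functional as F

def rsgan_d_loss(real_logits: torch.Tensor, fake_logits: torch.Tensor):
    # Discriminator wants real logits to exceed fake logits.
    return F.binary_cross_entropy_with_logits(
        real_logits - fake_logits, torch.ones_like(real_logits))

def rsgan_g_loss(real_logits: torch.Tensor, fake_logits: torch.Tensor):
    # Generator wants fake logits to exceed real logits.
    return F.binary_cross_entropy_with_logits(
        fake_logits - real_logits, torch.ones_like(fake_logits))

real, fake = torch.randn(8, 1), torch.randn(8, 1)  # dummy discriminator logits
print(rsgan_d_loss(real, fake).item(), rsgan_g_loss(real, fake).item())
```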
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires the ability to understand and generate information with long-range dependencies and harmony constraints.
Current LLMs easily fail at this task, generating ill-written music even when equipped with modern techniques like In-Context Learning and Chain-of-Thought.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z)
- Structuring Concept Space with the Musical Circle of Fifths by Utilizing Music Grammar Based Activations [0.0]
We explore the intriguing similarities between the structure of a discrete neural network, such as a spiking network, and the composition of a piano piece.
We propose a novel approach that leverages musical grammar to regulate activations in a spiking neural network.
We show that the map of concepts in our model is structured by the musical circle of fifths, highlighting the potential for leveraging music theory principles in deep learning algorithms.
arXiv Detail & Related papers (2024-02-22T03:28:25Z)
- Structure-informed Positional Encoding for Music Generation [0.0]
We propose a structure-informed positional encoding framework for music generation with Transformers.
We test these encodings on two symbolic music generation tasks: next-timestep prediction and accompaniment generation.
Our methods improve the melodic and structural consistency of the generated pieces.
arXiv Detail & Related papers (2024-02-20T13:41:35Z)
- Graph-based Polyphonic Multitrack Music Generation [9.701208207491879]
This paper introduces a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately.
By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times.
arXiv Detail & Related papers (2023-07-27T15:18:50Z)
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, summing to around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata on instrumentation, geography, and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- MusCaps: Generating Captions for Music Audio [14.335950077921435]
We present the first music audio captioning model, MusCaps, consisting of an encoder-decoder with temporal attention.
Our method combines convolutional and recurrent neural network architectures to jointly process audio-text inputs.
Our model represents a shift away from classification-based music description and combines tasks requiring both auditory and linguistic understanding.
arXiv Detail & Related papers (2021-04-24T16:34:47Z)
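As a rough companion to the MusCaps entry above, here is a skeleton of a CNN encoder plus RNN decoder with temporal attention over audio features. All dimensions, the mel-spectrogram input, and the attention wiring are illustrative guesses, not the MusCaps architecture.

```python
# Skeleton of a CNN + RNN audio captioner with temporal attention;
# made-up shapes and vocabulary, not MusCaps' actual code.
import torch
import torch.nn as nn

class AudioCaptioner(nn.Module):
    def __init__(self, n_mels=64, hidden=128, vocab=1000):
        super().__init__()
        self.cnn = nn.Sequential(              # encode a mel-spectrogram
            nn.Conv1d(n_mels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.attn = nn.MultiheadAttention(hidden, num_heads=4,
                                          batch_first=True)
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, spec, tokens):
        # spec: (batch, n_mels, frames); tokens: (batch, seq)
        feats = self.cnn(spec).transpose(1, 2)   # (batch, frames, hidden)
        h, _ = self.rnn(self.embed(tokens))      # decode caption tokens
        ctx, _ = self.attn(h, feats, feats)      # temporal attention over audio
        return self.out(h + ctx)                 # next-token logits

model = AudioCaptioner()
logits = model(torch.randn(2, 64, 100), torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 1000])
```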
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
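For the memory-cell comparison described in the entry above, a toy harness along these lines can swap vanilla RNN, GRU, and LSTM cells behind a common interface; the vocabulary of note events and all dimensions are made up for illustration.

```python
# Toy harness for comparing memory-cell types on next-event prediction;
# hypothetical note-event vocabulary and sizes, not the paper's setup.
import torch
import torch.nn as nn

def run_cell(cell: str, vocab=128, emb_dim=32, hidden=64):
    rnn_cls = {"rnn": nn.RNN, "gru": nn.GRU, "lstm": nn.LSTM}[cell]
    embed = nn.Embedding(vocab, emb_dim)
    rnn = rnn_cls(emb_dim, hidden, batch_first=True)
    head = nn.Linear(hidden, vocab)
    tokens = torch.randint(0, vocab, (4, 16))    # dummy note-event ids
    out, _ = rnn(embed(tokens))                  # hidden state per timestep
    return head(out)                             # next-event logits

for cell in ["rnn", "gru", "lstm"]:
    print(cell, run_cell(cell).shape)            # torch.Size([4, 16, 128])
```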
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)