Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework
- URL: http://arxiv.org/abs/2104.13056v1
- Date: Tue, 27 Apr 2021 09:04:21 GMT
- Title: Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework
- Authors: Dimos Makris, Kat R. Agres, Dorien Herremans
- Abstract summary: We present a novel approach for calculating the valence (the positivity or negativity of the perceived emotion) of a chord progression within a lead sheet.
Our approach is similar to that of Neural Machine Translation (NMT), as we include high-level conditions in the encoder part of the sequence-to-sequence architectures.
The proposed strategy is able to generate lead sheets in a controllable manner, resulting in distributions of musical attributes similar to those of the training dataset.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The field of automatic music composition has seen great progress in the last
few years, much of which can be attributed to advances in deep neural networks.
There are numerous studies that present different strategies for generating
sheet music from scratch. The inclusion of high-level musical characteristics
(e.g., perceived emotional qualities), however, as conditions for controlling
the generation output remains a challenge. In this paper, we present a novel
approach for calculating the valence (the positivity or negativity of the
perceived emotion) of a chord progression within a lead sheet, using
pre-defined mood tags proposed by music experts. Based on this approach, we
propose a novel strategy for conditional lead sheet generation that allows us
to steer the music generation in terms of valence, phrasing, and time
signature. Our approach is similar to that of Neural Machine Translation (NMT),
as we include high-level conditions in the encoder part of the
sequence-to-sequence architectures used (i.e., long short-term memory networks
and a Transformer network). We conducted experiments to thoroughly analyze
these two architectures. The results show that the proposed strategy is able to
generate lead sheets in a controllable manner, resulting in distributions of
musical attributes similar to those of the training dataset. We also verified
through a subjective listening test that our approach is effective in
controlling the valence of a generated chord progression.
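As a concrete illustration of the conditioning mechanism described in the abstract, here is a minimal PyTorch sketch: condition tokens (e.g., valence class, phrasing, time signature) are embedded and prepended to the encoder input of an LSTM encoder-decoder. Vocabulary sizes, dimensions, and the token layout are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConditionalSeq2Seq(nn.Module):
    """LSTM encoder-decoder with high-level condition tokens prepended
    to the encoder input, in the NMT style described above."""

    def __init__(self, src_vocab=512, tgt_vocab=512, cond_vocab=16, d=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d)
        self.cond_emb = nn.Embedding(cond_vocab, d)  # valence / phrasing / time-sig
        self.tgt_emb = nn.Embedding(tgt_vocab, d)
        self.encoder = nn.LSTM(d, d, batch_first=True)
        self.decoder = nn.LSTM(d, d, batch_first=True)
        self.out = nn.Linear(d, tgt_vocab)

    def forward(self, src, conds, tgt_in):
        # Prepend embedded condition tokens, so the encoder summarizes
        # both the control signals and the musical events.
        enc_in = torch.cat([self.cond_emb(conds), self.src_emb(src)], dim=1)
        _, state = self.encoder(enc_in)
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.out(dec_out)  # logits over target events

model = ConditionalSeq2Seq()
src = torch.randint(0, 512, (2, 32))       # source events
conds = torch.randint(0, 16, (2, 3))       # [valence, phrase, time-sig] tokens
tgt_in = torch.randint(0, 512, (2, 32))    # shifted target events
logits = model(src, conds, tgt_in)         # shape (2, 32, 512)
```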
Related papers
- MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
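One plausible reading of the counterfactual loss described above (an illustrative interpretation, not the paper's published objective): the target music should be better explained under its true control prompt than under a randomly substituted one, by a margin. The hypothetical `model(tokens, control)` below is assumed to return per-token logits.

```python
import torch
import torch.nn.functional as F

def counterfactual_loss(model, tokens, control, margin=1.0):
    """Hedged sketch: the true control should explain the music better
    than a shuffled (counterfactual) control, by at least `margin`."""
    def nll(logits):
        return F.cross_entropy(logits[:, :-1].reshape(-1, logits.size(-1)),
                               tokens[:, 1:].reshape(-1))

    nll_true = nll(model(tokens, control))
    # Counterfactual: pair each sequence with another sequence's control.
    nll_cf = nll(model(tokens, control[torch.randperm(control.size(0))]))
    # Standard LM loss, plus a hinge pushing the wrong control to fit worse.
    return nll_true + F.relu(margin - (nll_cf - nll_true))
```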
arXiv Detail & Related papers (2024-07-05T08:08:22Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen comprises a single-stage transformer LM together with efficient token interleaving patterns.
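The interleaving idea can be illustrated with MusicGen's "delay" pattern, in which codebook k is offset by k steps so that a single-stage LM can predict all K codec streams in parallel at each step; the padding convention below is an assumption.

```python
import torch

def delay_interleave(codes: torch.Tensor, pad_id: int = 0) -> torch.Tensor:
    """Offset codebook k by k steps (the "delay" pattern), so one LM step
    can emit a token for every codec stream.
    codes: (K, T) tokens from K codebooks; returns (K, T + K - 1)."""
    K, T = codes.shape
    out = torch.full((K, T + K - 1), pad_id, dtype=codes.dtype)
    for k in range(K):
        out[k, k:k + T] = codes[k]
    return out

codes = torch.arange(8).reshape(2, 4)   # 2 codebooks, 4 frames
print(delay_interleave(codes, pad_id=-1))
# tensor([[ 0,  1,  2,  3, -1],
#         [-1,  4,  5,  6,  7]])
```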
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance [88.0355290619761]
This work focuses on the separation of unknown musical instruments.
We propose the Separation-with-Consistency (SeCo) framework, which can accomplish the separation on unknown categories.
Our framework exhibits strong adaptation ability to novel musical categories and outperforms the baseline methods by a significant margin.
arXiv Detail & Related papers (2022-03-25T09:42:11Z)
- FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control [25.95359681751144]
We propose the self-supervised description-to-sequence task, which allows for fine-grained controllable generation on a global level.
We do so by extracting high-level features about the target sequence and learning the conditional distribution of sequences given the corresponding high-level description in a sequence-to-sequence modelling setup.
By combining learned high-level features with domain knowledge, which acts as a strong inductive bias, the model achieves state-of-the-art results in controllable symbolic music generation and generalizes well beyond the training distribution.
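A toy sketch of the description side of this setup: coarse, human-interpretable features are computed per bar from the target itself (which makes the task self-supervised), and any encoder-decoder can then be trained on (description, sequence) pairs. The two features below are illustrative stand-ins for FIGARO's much richer description.

```python
import statistics

def describe_bar(notes):
    """Illustrative bar-level description: (note-density bucket, pitch-register
    bucket). `notes` is a list of (pitch, onset, duration) tuples; the real
    FIGARO description (instruments, chords, meta) is richer than this."""
    if not notes:
        return ("density_0", "pitch_na")
    density = min(len(notes) // 4, 8)                 # coarse note-density bucket
    mean_pitch = statistics.mean(p for p, _, _ in notes)
    return (f"density_{density}", f"pitch_{int(mean_pitch) // 12}")

bars = [[(60, 0.0, 1.0), (64, 1.0, 1.0), (67, 2.0, 2.0)], []]
description = [tok for bar in bars for tok in describe_bar(bar)]
# Train p(sequence | description) with any encoder-decoder on pairs like this.
print(description)  # ['density_0', 'pitch_5', 'density_0', 'pitch_na']
```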
arXiv Detail & Related papers (2022-01-26T13:51:19Z)
- Hierarchical Recurrent Neural Networks for Conditional Melody Generation with Long-term Structure [0.0]
We propose CM-HRNN, a conditional melody generation model based on a hierarchical recurrent neural network.
This model generates melodies with long-term structures based on given chord accompaniments.
Results from our listening test indicate that CM-HRNN outperforms AttentionRNN in terms of long-term structure and overall rating.
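A minimal sketch of the hierarchical idea: a bar-level LSTM reads the chord accompaniment and its per-bar states condition a note-level LSTM, giving the melody a longer-horizon plan than a flat RNN. Sizes and the conditioning path are assumptions, not the published CM-HRNN architecture.

```python
import torch
import torch.nn as nn

class HierarchicalMelodyRNN(nn.Module):
    """Bar-level LSTM over chord conditions feeds a note-level LSTM."""

    def __init__(self, chord_vocab=64, note_vocab=130, d=128, notes_per_bar=16):
        super().__init__()
        self.notes_per_bar = notes_per_bar
        self.chord_emb = nn.Embedding(chord_vocab, d)
        self.note_emb = nn.Embedding(note_vocab, d)
        self.bar_rnn = nn.LSTM(d, d, batch_first=True)       # one step per bar
        self.note_rnn = nn.LSTM(2 * d, d, batch_first=True)  # note + bar context
        self.out = nn.Linear(d, note_vocab)

    def forward(self, chords, notes):
        # chords: (B, n_bars); notes: (B, n_bars * notes_per_bar)
        bar_ctx, _ = self.bar_rnn(self.chord_emb(chords))
        # Broadcast each bar's context over the notes inside that bar.
        bar_ctx = bar_ctx.repeat_interleave(self.notes_per_bar, dim=1)
        x = torch.cat([self.note_emb(notes), bar_ctx], dim=-1)
        h, _ = self.note_rnn(x)
        return self.out(h)   # next-note logits

model = HierarchicalMelodyRNN()
logits = model(torch.randint(0, 64, (2, 4)), torch.randint(0, 130, (2, 64)))
```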
arXiv Detail & Related papers (2021-02-19T08:22:26Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
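As an example of the kind of automatic metric used in such evaluations (an illustrative choice, not necessarily one of the paper's metrics), pitch-class histogram entropy quantifies how tonally focused an excerpt is:

```python
import math
from collections import Counter

def pitch_class_entropy(pitches):
    """Shannon entropy of the 12-bin pitch-class histogram.
    Lower values suggest a more tonally focused excerpt."""
    hist = Counter(p % 12 for p in pitches)
    total = sum(hist.values())
    return -sum((n / total) * math.log2(n / total) for n in hist.values())

print(pitch_class_entropy([60, 62, 64, 65, 67, 69, 71, 72]))  # C major scale: 2.75
```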
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
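A compressed sketch of the melody-conditioned SeqGAN idea: the generator's state is initialized from a melody encoding, a lyric sequence is sampled token by token, and a REINFORCE update uses the discriminator's realness score as reward. The sizes, toy discriminator, and single-sample rollout are simplifications.

```python
import torch
import torch.nn as nn

d, V_lyr, V_mel, T = 64, 1000, 130, 12
mel_emb, lyr_emb = nn.Embedding(V_mel, d), nn.Embedding(V_lyr, d)
mel_rnn = nn.LSTM(d, d, batch_first=True)            # encodes the melody condition
gen_rnn = nn.LSTMCell(d, d)                          # autoregressive lyrics generator
head = nn.Linear(d, V_lyr)
disc = nn.Sequential(nn.Linear(d, 1), nn.Sigmoid())  # toy discriminator

melody = torch.randint(0, V_mel, (1, 16))
_, (h, c) = mel_rnn(mel_emb(melody))
h, c = h[0], c[0]                       # init generator state from the melody

tok = torch.zeros(1, dtype=torch.long)  # assumed <BOS> id 0
log_probs, toks = [], []
for _ in range(T):                      # sample one lyrics rollout
    h, c = gen_rnn(lyr_emb(tok), (h, c))
    dist = torch.distributions.Categorical(logits=head(h))
    tok = dist.sample()
    log_probs.append(dist.log_prob(tok))
    toks.append(tok)

# REINFORCE: the discriminator's "realness" score is the sequence reward.
reward = disc(lyr_emb(torch.stack(toks, 1)).mean(dim=1)).squeeze()
gen_loss = -(reward.detach() * torch.stack(log_probs).sum())
gen_loss.backward()
```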
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
- On Long-Tailed Phenomena in Neural Machine Translation [50.65273145888896]
State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens.
We propose a new loss function, the Anti-Focal loss, to better adapt model training to the structural dependencies of conditional text generation.
We show the efficacy of the proposed technique on a number of Machine Translation (MT) datasets, demonstrating that it leads to significant gains over cross-entropy.
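For context, focal loss scales cross-entropy by a factor (1 - p)^gamma that emphasizes hard examples; the anti-focal loss reverses that emphasis. The (1 + p)^gamma modulation below follows this sign-flipped reading and should be taken as a sketch, not a verbatim reproduction of the paper's equation.

```python
import torch
import torch.nn.functional as F

def anti_focal_loss(logits, targets, gamma=0.5):
    """Sketch: cross-entropy with a (1 + p)^gamma modulating factor.
    Unlike focal loss's (1 - p)^gamma, which concentrates on hard examples,
    this factor avoids over-emphasizing low-confidence (often rare) tokens."""
    logp = F.log_softmax(logits, dim=-1)
    logp_t = logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    p_t = logp_t.exp()
    return -((1 + p_t) ** gamma * logp_t).mean()

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))
anti_focal_loss(logits, targets).backward()
```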
arXiv Detail & Related papers (2020-10-10T07:00:57Z)
- Learning Interpretable Representation for Controllable Polyphonic Music Generation [5.01266258109807]
We design a novel architecture that effectively learns two interpretable latent factors of polyphonic music: chord and texture.
We show that such chord-texture disentanglement provides a controllable generation pathway leading to a wide spectrum of applications.
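A minimal sketch of the two-factor pathway: separate encoders produce chord and texture latents, a shared decoder reconstructs the music, and swapping one latent between two pieces yields controllable generation (e.g., re-harmonization). The linear stand-in encoders and latent sizes are assumptions; the KL and reconstruction losses are omitted.

```python
import torch
import torch.nn as nn

class ChordTextureVAE(nn.Module):
    """Two encoders -> (z_chord, z_texture) -> one decoder. Swapping z_chord
    between pieces re-harmonizes one with the other's chords, keeping texture."""

    def __init__(self, in_dim=256, z=32):
        super().__init__()
        self.enc_chord = nn.Linear(in_dim, 2 * z)    # stand-ins for real encoders
        self.enc_texture = nn.Linear(in_dim, 2 * z)
        self.dec = nn.Linear(2 * z, in_dim)

    @staticmethod
    def sample(stats):
        mu, logvar = stats.chunk(2, dim=-1)          # reparameterization trick
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def forward(self, x, z_chord=None):
        zc = z_chord if z_chord is not None else self.sample(self.enc_chord(x))
        zt = self.sample(self.enc_texture(x))
        return self.dec(torch.cat([zc, zt], dim=-1)), zc, zt

model = ChordTextureVAE()
a, b = torch.randn(1, 256), torch.randn(1, 256)
_, zc_a, _ = model(a)
reharmonized_b, _, _ = model(b, z_chord=zc_a)   # b's texture, a's chords
```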
arXiv Detail & Related papers (2020-08-17T07:11:16Z)
- Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling [5.88864611435337]
We present a framework that can learn high-level feature representations with a limited amount of data.
We refer to our proposed framework as Music FaderNets, which is inspired by the fact that low-level attributes can be continuously manipulated.
We demonstrate that the model successfully learns the intrinsic relationship between arousal and its corresponding low-level attributes.
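The "fader" behaviour can be sketched with a generic attribute-regularization penalty that ties one latent dimension to a low-level attribute such as note density, so sliding that dimension at inference time moves the attribute. The specific penalty below is an assumption, not necessarily the paper's exact formulation.

```python
import torch

def fader_regularizer(z, attr, dim=0):
    """Encourage latent dimension `dim` to order examples the same way the
    low-level attribute does. z: (B, Z) latent codes, attr: (B,) values."""
    dz = z[:, dim].unsqueeze(0) - z[:, dim].unsqueeze(1)   # pairwise latent diffs
    da = attr.unsqueeze(0) - attr.unsqueeze(1)             # pairwise attribute diffs
    # tanh(dz) should match the sign of the attribute difference.
    return torch.mean(torch.abs(torch.tanh(dz) - torch.sign(da)))

z = torch.randn(8, 16, requires_grad=True)
note_density = torch.rand(8)        # a low-level attribute per example
fader_regularizer(z, note_density).backward()
# At inference, sliding z[:, 0] then acts as a "fader" for note density.
```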
arXiv Detail & Related papers (2020-07-29T16:01:45Z)
- Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
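A minimal gated autoencoder sketch: mapping units encode the relation between two fragments x and y through multiplicative (gating) interactions, and y is reconstructed by applying the mapping to x; layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class GatedAutoencoder(nn.Module):
    """Mapping units m encode the transformation between fragments x and y
    via multiplicative factor interactions (Memisevic-style gating)."""

    def __init__(self, n_in=88, n_factors=128, n_maps=64):
        super().__init__()
        self.U = nn.Linear(n_in, n_factors, bias=False)   # factors for x
        self.V = nn.Linear(n_in, n_factors, bias=False)   # factors for y
        self.W = nn.Linear(n_factors, n_maps)             # mapping units
        self.Wt = nn.Linear(n_maps, n_factors, bias=False)

    def mapping(self, x, y):
        # Products of factor activations make m depend on the x-y relation.
        return torch.sigmoid(self.W(self.U(x) * self.V(y)))

    def reconstruct_y(self, x, m):
        # Apply the learned transformation m to x to predict y.
        return (self.Wt(m) * self.U(x)) @ self.V.weight

gae = GatedAutoencoder()
x, y = torch.rand(2, 88), torch.rand(2, 88)   # e.g., piano-roll frames
m = gae.mapping(x, y)                          # interval-like relation code
loss = nn.functional.mse_loss(gae.reconstruct_y(x, m), y)
```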
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and accepts no responsibility for any consequences of its use.