CoCoFormer: A controllable feature-rich polyphonic music generation method
- URL: http://arxiv.org/abs/2310.09843v2
- Date: Tue, 28 Nov 2023 03:30:44 GMT
- Title: CoCoFormer: A controllable feature-rich polyphonic music generation method
- Authors: Jiuyang Zhou, Tengfei Niu, Hong Zhu, Xingping Wang
- Abstract summary: This paper proposes Condition Choir Transformer (CoCoFormer), which controls the output of the model by conditioning on chord and rhythm inputs at a fine-grained level.
Experiments in this paper show that CoCoFormer outperforms current models.
- Score: 2.501600004190393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores methods for modeling polyphonic music sequences. Given the great potential of Transformer models for music generation, controllable music generation is receiving increasing attention. For polyphonic music, current controllable generation research focuses on controlling chord generation but lacks precise control over the generation of choral music textures. This paper proposes Condition Choir Transformer (CoCoFormer), which controls the output of the model through fine-grained chord and rhythm inputs. A self-supervised method improves the loss function and performs joint training with conditional control inputs and unconditional inputs. To alleviate the lack of diversity in generated samples caused by teacher-forcing training, an adversarial training method is added. CoCoFormer enhances model performance with explicit and implicit inputs of chords and rhythms. Experiments show that CoCoFormer reaches a better level than current models: when the polyphonic music texture is specified, the same melody can still be generated in a variety of ways.
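The abstract names two training ideas without detailing them: joint training with conditional (chord/rhythm) and unconditional inputs, and an adversarial term to offset the diversity loss caused by teacher forcing. Below is a minimal PyTorch sketch of how such a generator-side objective could look. It is not the authors' code; all class names, embedding-addition conditioning, shapes, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code) of the training ideas described in
# the abstract: a causal Transformer conditioned on chord and rhythm tokens, trained
# jointly with and without those conditions, plus a simple adversarial term to
# counteract the diversity loss from teacher forcing. Names and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NOTE_VOCAB, CHORD_VOCAB, RHYTHM_VOCAB, D_MODEL = 512, 64, 32, 256


class ToyChoirTransformer(nn.Module):
    """Hypothetical stand-in for CoCoFormer: a causal Transformer with chord/rhythm conditioning."""

    def __init__(self):
        super().__init__()
        self.note_emb = nn.Embedding(NOTE_VOCAB, D_MODEL)
        self.chord_emb = nn.Embedding(CHORD_VOCAB, D_MODEL)
        self.rhythm_emb = nn.Embedding(RHYTHM_VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, NOTE_VOCAB)

    def forward(self, notes, chords=None, rhythms=None):
        x = self.note_emb(notes)
        if chords is not None:             # explicit chord condition, added to the token embedding
            x = x + self.chord_emb(chords)
        if rhythms is not None:            # explicit rhythm condition
            x = x + self.rhythm_emb(rhythms)
        seq_len = notes.size(1)
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        return self.head(self.backbone(x, mask=causal))


class ToyDiscriminator(nn.Module):
    """Hypothetical sequence critic scoring how realistic the predicted token distributions look."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(NOTE_VOCAB, D_MODEL)
        self.score = nn.Linear(D_MODEL, 1)

    def forward(self, token_probs):        # token_probs: (batch, time, NOTE_VOCAB)
        return self.score(torch.tanh(self.proj(token_probs)).mean(dim=1))


def generator_loss(model, disc, notes, chords, rhythms, p_uncond=0.2, adv_weight=0.1):
    """Teacher-forced NLL with randomly dropped conditions, plus a simple adversarial term."""
    inp, tgt = notes[:, :-1], notes[:, 1:]
    drop = torch.rand(()).item() < p_uncond    # joint conditional / unconditional training
    logits = model(inp,
                   None if drop else chords[:, :-1],
                   None if drop else rhythms[:, :-1])
    nll = F.cross_entropy(logits.reshape(-1, NOTE_VOCAB), tgt.reshape(-1))
    adv = -disc(F.softmax(logits, dim=-1)).mean()  # generator tries to raise the critic's score
    return nll + adv_weight * adv


# Toy usage with random data, just to show the shapes involved.
model, disc = ToyChoirTransformer(), ToyDiscriminator()
notes = torch.randint(0, NOTE_VOCAB, (2, 17))
chords = torch.randint(0, CHORD_VOCAB, (2, 17))
rhythms = torch.randint(0, RHYTHM_VOCAB, (2, 17))
print(generator_loss(model, disc, notes, chords, rhythms).item())
```

A full implementation would also train the discriminator on real versus generated sequences and add the implicit chord/rhythm conditioning mentioned in the abstract; this sketch only illustrates the generator-side objective under the stated assumptions.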
Related papers
- An End-to-End Approach for Chord-Conditioned Song Generation [14.951089833579063]
Song Generation task aims to synthesize music composed of vocals and accompaniment from given lyrics.
To mitigate this issue, we introduce an important concept from music composition, namely chords, into song generation networks.
We propose a novel model termed Chord-Conditioned Song Generator (CSG) based on it.
arXiv Detail & Related papers (2024-09-10T08:07:43Z) - MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
arXiv Detail & Related papers (2024-07-05T08:08:22Z) - Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls [5.597394612661976]
Polyffusion is a diffusion model that generates polyphonic music scores by regarding music as image-like piano roll representations.
We show that by using internal and external controls, Polyffusion unifies a wide range of music creation tasks.
arXiv Detail & Related papers (2023-07-19T06:36:31Z) - Anticipatory Music Transformer [60.15347393822849]
We introduce anticipation: a method for constructing a controllable generative model of a temporal point process.
We focus on infilling control tasks, whereby the controls are a subset of the events themselves.
We train anticipatory infilling models using the large and diverse Lakh MIDI music dataset.
arXiv Detail & Related papers (2023-06-14T16:27:53Z) - Is Disentanglement enough? On Latent Representations for Controllable Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z) - Learning Interpretable Representation for Controllable Polyphonic Music Generation [5.01266258109807]
We design a novel architecture, that effectively learns two interpretable latent factors of polyphonic music: chord and texture.
We show that such chord-texture disentanglement provides a controllable generation pathway leading to a wide spectrum of applications.
arXiv Detail & Related papers (2020-08-17T07:11:16Z) - Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method utilizes both an acoustic model, trained for the task of automatic speech recognition, together with melody extracted features to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z) - Incorporating Music Knowledge in Continual Dataset Augmentation for Music Generation [69.06413031969674]
Aug-Gen is a method of dataset augmentation for any music generation system trained on a resource-constrained domain.
We apply Aug-Gen to Transformer-based chorale generation in the style of J.S. Bach, and show that this allows for longer training and results in better generative output.
arXiv Detail & Related papers (2020-06-23T21:06:15Z) - RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)