Choir Transformer: Generating Polyphonic Music with Relative Attention on Transformer
- URL: http://arxiv.org/abs/2308.02531v1
- Date: Tue, 1 Aug 2023 06:44:15 GMT
- Title: Choir Transformer: Generating Polyphonic Music with Relative Attention on Transformer
- Authors: Jiuyang Zhou, Hong Zhu, Xingping Wang
- Abstract summary: We propose a polyphonic music generation neural network named Choir Transformer.
The performance of Choir Transformer surpasses the previous state of the art by 4.06% in accuracy.
In practical application, the generated melody and rhythm can be adjusted according to the specified input.
- Score: 4.866650264773479
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polyphonic music generation remains a challenging direction
because of the coordination required between melody and harmony. Most
previous studies used RNN-based models; however, RNN-based models struggle
to establish relationships between long-distance notes. In this paper, we
propose a polyphonic music generation neural network named Choir Transformer
[https://github.com/Zjy0401/choir-transformer], with relative positional
attention to better model the structure of music. We also propose a music
representation suitable for polyphonic music generation. The performance of
Choir Transformer surpasses the previous state of the art by 4.06% in
accuracy. We also measure the harmony metrics of the generated polyphonic
music; experiments show that these metrics are close to those of Bach's
music. In practical applications, the generated melody and rhythm can be
adjusted according to a specified input, in different styles of music such
as folk or pop.
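The linked repository holds the authors' implementation; as rough orientation only, the sketch below shows a minimal single-head form of relative positional self-attention in the general style of Shaw et al. (2018). The layer sizes, clipping distance, and names are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of self-attention with learned relative position embeddings
# (in the spirit of Shaw et al., 2018); sizes and clipping are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, d_model: int, max_dist: int = 16):
        super().__init__()
        self.max_dist = max_dist
        self.qkv = nn.Linear(d_model, 3 * d_model)
        # One embedding per clipped relative distance in [-max_dist, max_dist].
        self.rel_emb = nn.Embedding(2 * max_dist + 1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Content-based scores: (b, t, t)
        content = torch.einsum("bid,bjd->bij", q, k)
        # Relative distances j - i, clipped to the embedding range.
        pos = torch.arange(t, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
        r = self.rel_emb(rel + self.max_dist)            # (t, t, d)
        # Position-based scores: q_i dot r_ij.
        position = torch.einsum("bid,ijd->bij", q, r)
        attn = F.softmax((content + position) / d ** 0.5, dim=-1)
        return attn @ v

x = torch.randn(2, 32, 64)
print(RelativeSelfAttention(64)(x).shape)  # torch.Size([2, 32, 64])
```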
Related papers
- Music102: An $D_{12}$-equivariant transformer for chord progression accompaniment [0.0]
Music102 enhances chord progression accompaniment through a D12-equivariant transformer.
By encoding prior music knowledge, the model maintains equivariance across both melody and chord sequences.
This work showcases the adaptability of self-attention mechanisms and layer normalization to the discrete musical domain.
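As a hedged illustration of what D12 equivariance means musically: the group acts on the 12 pitch classes by transposition (rotation) and inversion (reflection). The sketch below only checks transposition equivariance for a circular convolution; Music102's actual architecture is not reproduced here, and full D12 equivariance would also have to respect inversion.

```python
# Hedged sketch: D12 acts on the 12 pitch classes by transposition and
# inversion. Circular convolution is equivariant to transposition only.
import numpy as np

def transpose(v: np.ndarray, k: int) -> np.ndarray:
    """Rotate a 12-dim pitch-class vector by k semitones."""
    return np.roll(v, k)

def invert(v: np.ndarray) -> np.ndarray:
    """Reflect pitch classes: pc -> -pc mod 12."""
    return v[(-np.arange(12)) % 12]

def circ_conv(v: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Circular convolution: a transposition-equivariant linear map."""
    return np.array([sum(v[(i - j) % 12] * w[j] for j in range(12))
                     for i in range(12)])

rng = np.random.default_rng(0)
v, w = rng.normal(size=12), rng.normal(size=12)
# Equivariance check: transposing the input transposes the output.
assert np.allclose(circ_conv(transpose(v, 3), w), transpose(circ_conv(v, w), 3))
```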
arXiv Detail & Related papers (2024-10-23T03:11:01Z)
- Do we need more complex representations for structure? A comparison of note duration representation for Music Transformers [0.0]
In this work, we inquire if the off-the-shelf Music Transformer models perform just as well on structural similarity metrics using only unannotated MIDI information.
We show that a slight tweak to the most common representation yields small but significant improvements.
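The abstract does not name the exact tweak, so the following sketch only contrasts the two standard duration encodings such comparisons revolve around: note-off events versus explicit duration tokens. The token vocabulary is invented for illustration.

```python
# Hedged sketch of two common note-duration encodings; tokens are invented.
notes = [(60, 0, 4), (64, 4, 2)]  # (MIDI pitch, onset step, duration steps)

def to_noteoff_events(notes):
    """Encode durations implicitly via paired NOTE_ON / NOTE_OFF events."""
    events = []
    for pitch, onset, dur in notes:
        events.append((onset, f"NOTE_ON_{pitch}"))
        events.append((onset + dur, f"NOTE_OFF_{pitch}"))
    return [tok for _, tok in sorted(events)]

def to_duration_tokens(notes):
    """Encode durations explicitly with a DUR token after each NOTE_ON."""
    events = []
    for pitch, onset, dur in notes:
        events.append((onset, [f"NOTE_ON_{pitch}", f"DUR_{dur}"]))
    return [tok for _, toks in sorted(events) for tok in toks]

print(to_noteoff_events(notes))   # ['NOTE_ON_60', 'NOTE_OFF_60', ...]
print(to_duration_tokens(notes))  # ['NOTE_ON_60', 'DUR_4', 'NOTE_ON_64', 'DUR_2']
```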
arXiv Detail & Related papers (2024-10-14T13:53:11Z)
- MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
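The abstract does not define the counterfactual loss, so the sketch below is only one plausible reading: a margin term that makes the target music more likely under its true control signal than under a shuffled, counterfactual one. The model interface, toy model, and margin are all assumptions.

```python
# Hedged sketch of one possible counterfactual loss; not MuseBarControl's
# actual formulation, which the abstract does not specify.
import torch
import torch.nn.functional as F

def counterfactual_loss(model, tokens, control, margin: float = 1.0):
    """tokens: (B, T) target ids; control: (B, C) control features."""
    fake = control[torch.randperm(control.size(0))]  # counterfactual controls
    def nll(ctrl):
        logits = model(tokens, ctrl)                 # (B, T, vocab)
        return F.cross_entropy(
            logits.transpose(1, 2), tokens, reduction="none").mean(dim=1)
    nll_true, nll_fake = nll(control), nll(fake)
    # Standard generation loss plus a counterfactual margin term.
    return nll_true.mean() + F.relu(margin + nll_true - nll_fake).mean()

class Toy(torch.nn.Module):  # stand-in conditional model for the demo
    def __init__(self, vocab=16, c_dim=4, d=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, d)
        self.ctrl = torch.nn.Linear(c_dim, d)
        self.out = torch.nn.Linear(d, vocab)
    def forward(self, tokens, ctrl):
        return self.out(self.emb(tokens) + self.ctrl(ctrl)[:, None, :])

tok, ctl = torch.randint(0, 16, (8, 12)), torch.randn(8, 4)
print(counterfactual_loss(Toy(), tok, ctl))
```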
arXiv Detail & Related papers (2024-07-05T08:08:22Z)
- Melody transcription via generative pre-training [86.08508957229348]
A key challenge in melody transcription is building methods that can handle broad audio containing any number of instrument ensembles and musical styles.
To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio.
We derive a new dataset containing $50$ hours of melody transcriptions from crowdsourced annotations of broad music.
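As a hedged sketch of the general recipe (not the paper's pipeline): activations from a pretrained generative audio model are treated as frame-level features, and a light classifier is trained on top. Random arrays stand in for real Jukebox activations here.

```python
# Hedged sketch: probe pretrained-model features with a linear classifier.
# Random arrays stand in for real Jukebox activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feats = rng.normal(size=(2000, 64))      # (frames, feature_dim) stand-in
labels = rng.integers(0, 13, size=2000)  # 12 pitch classes + "no melody"

probe = LogisticRegression(max_iter=200).fit(feats[:1500], labels[:1500])
print("held-out accuracy:", probe.score(feats[1500:], labels[1500:]))
```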
arXiv Detail & Related papers (2022-12-04T18:09:23Z)
- Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation [138.74751744348274]
We propose Museformer, a Transformer with a novel fine- and coarse-grained attention for music generation.
Specifically, with the fine-grained attention, a token in a given bar directly attends to all tokens of the bars most relevant to the music's structure.
With the coarse-grained attention, a token attends only to a summarization of the other bars, rather than to each of their tokens, which reduces the computational cost.
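A hedged sketch of the masking idea follows: real tokens attend token-to-token only within structure-relevant bars (here, arbitrarily, the current and previous bar), and to one summary token per other bar. Museformer's actual bar-selection scheme differs.

```python
# Hedged sketch of a fine/coarse attention mask over bars; the choice of
# "relevant" bars here is invented, not Museformer's actual selection.
import numpy as np

def fine_coarse_mask(bar_of_token, n_bars):
    """bar_of_token: bar index per real token; summary tokens are appended,
    one per bar, after the real tokens. Returns a boolean attention mask."""
    t = len(bar_of_token)
    n = t + n_bars                                # real tokens + summaries
    mask = np.zeros((n, n), dtype=bool)
    for i in range(t):
        relevant = (bar_of_token[i], bar_of_token[i] - 1)
        for j in range(t):                        # fine-grained attention
            mask[i, j] = bar_of_token[j] in relevant
        for b in range(n_bars):                   # coarse-grained attention
            mask[i, t + b] = b not in relevant
    for b in range(n_bars):                       # summaries see their own bar
        for j in range(t):
            mask[t + b, j] = bar_of_token[j] == b
    return mask

print(fine_coarse_mask([0, 0, 1, 1, 2, 2], 3).astype(int))
```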
arXiv Detail & Related papers (2022-10-19T07:31:56Z)
- Chord-Conditioned Melody Choralization with Controllable Harmonicity and Polyphonicity [75.02344976811062]
Melody choralization, i.e. generating a four-part chorale based on a user-given melody, has long been closely associated with J.S. Bach chorales.
Previous neural network-based systems rarely focus on chorale generation conditioned on a chord progression.
We propose DeepChoir, a melody choralization system, which can generate a four-part chorale for a given melody conditioned on a chord progression.
arXiv Detail & Related papers (2022-02-17T02:59:36Z)
- Calliope -- A Polyphonic Music Transformer [9.558051115598657]
We present Calliope, a novel autoencoder model based on Transformers for the efficient modelling of multi-track sequences of polyphonic music.
Experiments show that our model is able to improve the state of the art on musical sequence reconstruction and generation.
arXiv Detail & Related papers (2021-07-08T08:18:57Z)
- MuseMorphose: Full-Song and Fine-Grained Music Style Transfer with Just One Transformer VAE [36.9033909878202]
Transformers and variational autoencoders (VAEs) have been extensively employed for music generation in the symbolic (e.g., MIDI) domain.
In this paper, we are interested in bringing the two together to construct a single model that exhibits both strengths.
Experiments show that MuseMorphose outperforms recurrent neural network (RNN) based prior art on numerous widely-used metrics for style transfer tasks.
arXiv Detail & Related papers (2021-05-10T03:44:03Z)
- PopMAG: Pop Music Accompaniment Generation [190.09996798215738]
We propose a novel MUlti-track MIDI representation (MuMIDI) which enables simultaneous multi-track generation in a single sequence.
MuMIDI enlarges the sequence length and brings the new challenge of long-term music modeling.
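As a hedged illustration of the flattening idea (token names invented, not MuMIDI's exact vocabulary): notes from all tracks are merged into a single sequence ordered by bar and position, with explicit bar, position, and track tokens, which is what lengthens the sequence.

```python
# Hedged sketch of a MuMIDI-like multi-track flattening; tokens are invented.
notes = [  # (bar, position, track, pitch)
    (0, 0, "MELODY", 72), (0, 0, "BASS", 48), (0, 8, "MELODY", 74),
    (1, 0, "BASS", 43),
]

def to_single_sequence(notes):
    """Merge all tracks into one token sequence ordered by (bar, position)."""
    seq, last_bar, last_pos = [], None, None
    for bar, pos, track, pitch in sorted(notes):
        if bar != last_bar:
            seq.append(f"BAR_{bar}"); last_bar, last_pos = bar, None
        if pos != last_pos:
            seq.append(f"POS_{pos}"); last_pos = pos
        seq += [f"TRACK_{track}", f"PITCH_{pitch}"]
    return seq

print(to_single_sequence(notes))
# ['BAR_0', 'POS_0', 'TRACK_BASS', 'PITCH_48', 'TRACK_MELODY', 'PITCH_72', ...]
```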
We call our system for pop music accompaniment generation PopMAG.
arXiv Detail & Related papers (2020-08-18T02:28:36Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
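A hedged toy of the online setup, with an invented consonance reward and a stand-in policy rather than RL-Duet's actual agent: the policy picks a machine note in response to each human note and is updated by REINFORCE.

```python
# Hedged toy: REINFORCE with an invented interval-consonance reward; the
# policy, reward, and setup are stand-ins, not RL-Duet's method.
import torch

CONSONANT = {0, 3, 4, 7, 8, 9}                   # crude consonant intervals mod 12
policy = torch.nn.Sequential(torch.nn.Linear(12, 32), torch.nn.ReLU(),
                             torch.nn.Linear(32, 12))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(200):
    human = torch.randint(0, 12, (16,))          # human pitch classes
    logits = policy(torch.nn.functional.one_hot(human, 12).float())
    dist = torch.distributions.Categorical(logits=logits)
    machine = dist.sample()                      # machine responses
    reward = torch.tensor([(1.0 if (m - h) % 12 in CONSONANT else -1.0)
                           for m, h in zip(machine.tolist(), human.tolist())])
    loss = -(dist.log_prob(machine) * reward).mean()   # REINFORCE objective
    opt.zero_grad(); loss.backward(); opt.step()

print("avg reward:", reward.mean().item())
```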
arXiv Detail & Related papers (2020-02-08T03:53:52Z)