Structure-Enhanced Pop Music Generation via Harmony-Aware Learning
- URL: http://arxiv.org/abs/2109.06441v1
- Date: Tue, 14 Sep 2021 05:04:13 GMT
- Title: Structure-Enhanced Pop Music Generation via Harmony-Aware Learning
- Authors: Xueyao Zhang, Jinchao Zhang, Yao Qiu, Li Wang, Jie Zhou
- Abstract summary: We propose to leverage harmony-aware learning for structure-enhanced pop music generation.
Results of subjective and objective evaluations demonstrate that Harmony-Aware Hierarchical Music Transformer (HAT) significantly improves the quality of generated music.
- Score: 20.06867705303102
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automatically composing pop music with a satisfactory structure is an
attractive but challenging topic. Although musical structure is easily
perceived by humans, it is difficult to describe clearly and define
accurately, and how structure should be modeled in pop music generation
remains an open problem. In this paper, we propose to leverage
harmony-aware learning for structure-enhanced pop music generation. On the one
hand, one component of harmony, the chord, represents a harmonic set of
multiple notes and is closely integrated with the spatial structure of
music, its texture. On the other hand, the other component, the chord
progression, typically accompanies the development of the music and promotes
its temporal structure, the form. Moreover, when chords evolve into chord
progressions, harmony naturally bridges texture and form, which contributes
to the joint learning of the two structures.
Furthermore, we propose the Harmony-Aware Hierarchical Music Transformer (HAT),
which adaptively exploits structure from the music and interacts with music
tokens at multiple levels to strengthen structural signals in
various musical elements. Results of subjective and objective evaluations
demonstrate that HAT significantly improves the quality of generated music,
especially in terms of its structure.
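The chord/texture and progression/form pairing described in the abstract can be illustrated with a minimal sketch. The data structures and token names below are hypothetical, invented for illustration; they are not the paper's actual HAT token scheme. Note tokens are grouped under chord tokens (texture), and chords under form-level phrase tokens.

```python
# Minimal sketch of a harmony-aware hierarchy (illustrative only; not the
# actual HAT representation from the paper).
# Level 1 (texture): a chord groups the notes sounding under it.
# Level 2 (form):    a phrase groups a chord progression.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Chord:
    symbol: str                                      # e.g. "Cmaj", "Am"
    notes: List[int] = field(default_factory=list)   # MIDI pitches (texture)

@dataclass
class Phrase:
    label: str                                       # e.g. "verse" (form)
    progression: List[Chord] = field(default_factory=list)

def flatten(phrases: List[Phrase]) -> List[str]:
    """Serialize the hierarchy into a flat token stream that a
    Transformer-style model could consume."""
    tokens = []
    for p in phrases:
        tokens.append(f"<phrase:{p.label}>")
        for c in p.progression:
            tokens.append(f"<chord:{c.symbol}>")
            tokens.extend(f"<note:{n}>" for n in c.notes)
    return tokens

verse = Phrase("verse", [Chord("Cmaj", [60, 64, 67]),
                         Chord("Am", [57, 60, 64])])
print(flatten([verse])[:3])
# → ['<phrase:verse>', '<chord:Cmaj>', '<note:60>']
```

The point of the sketch is the nesting: chord tokens sit between the note level and the phrase level, which is what lets harmony bridge texture and form in a single sequence.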
Related papers
- A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding.
We investigated, analyzed, and tested recent large-scale music foundation models with respect to their music comprehension abilities.
arXiv Detail & Related papers (2024-09-15T03:34:14Z)
- ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z)
- Graph-based Polyphonic Multitrack Music Generation [9.701208207491879]
This paper introduces a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately.
By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times.
arXiv Detail & Related papers (2023-07-27T15:18:50Z)
- WuYun: Exploring hierarchical skeleton-guided melody generation using knowledge-enhanced deep learning [26.515527387450636]
WuYun is a knowledge-enhanced deep learning architecture for improving the structure of generated melodies.
We use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them.
We demonstrate that WuYun can generate melodies with better long-term structure and musicality, outperforming other state-of-the-art methods by 0.51 on average.
arXiv Detail & Related papers (2023-01-11T14:33:42Z)
- MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks [146.59245563763065]
MeloForm is a system that generates melody with musical form using expert systems and neural networks.
It can support various kinds of forms, such as verse and chorus form, rondo form, variational form, sonata form, etc.
arXiv Detail & Related papers (2022-08-30T15:44:15Z)
- A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody [91.22679787578438]
We present a method for the generation of Midi files of piano music.
The method models the right and left hands using two networks, where the left hand is conditioned on the right hand.
The Midi is represented in a way that is invariant to the musical scale, and the melody representation is used to condition the harmony.
arXiv Detail & Related papers (2021-11-25T09:45:53Z)
- Controllable deep melody generation via hierarchical music structure representation [14.891975420982511]
MusicFrameworks is a hierarchical music structure representation and a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to understanding music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Music Harmony Generation, through Deep Learning and Using a Multi-Objective Evolutionary Algorithm [0.0]
This paper introduces a genetic multi-objective evolutionary optimization algorithm for the generation of polyphonic music.
One objective encodes the rules of music theory; the other two, ratings from music experts and from ordinary listeners, together drive the evolutionary cycle toward the optimal result.
The results show that the proposed method can generate difficult yet pleasant pieces of the desired styles and lengths, with harmonies that follow the grammar of music while remaining attractive to the listener.
arXiv Detail & Related papers (2021-02-16T05:05:54Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- Structural characterization of musical harmonies [4.416484585765029]
We use a hybrid method in which an evidence-gathering numerical method detects modulation and then, based on the detected tonalities, a non-ambiguous grammar can be used for analyzing the structure of each tonal component.
Experiments with music from the 17th and 18th centuries show that we can detect the precise point of modulation with an error of at most two chords in almost 97% of cases.
arXiv Detail & Related papers (2019-12-27T23:15:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.