WuYun: Exploring hierarchical skeleton-guided melody generation using
knowledge-enhanced deep learning
- URL: http://arxiv.org/abs/2301.04488v1
- Date: Wed, 11 Jan 2023 14:33:42 GMT
- Title: WuYun: Exploring hierarchical skeleton-guided melody generation using
knowledge-enhanced deep learning
- Authors: Kejun Zhang, Xinda Wu, Tieyao Zhang, Zhijie Huang, Xu Tan, Qihao
Liang, Songruoyao Wu, and Lingyun Sun
- Abstract summary: WuYun is a knowledge-enhanced deep learning architecture for improving structure of generated melodies.
We use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them.
We demonstrate that WuYun can generate melodies with better long-term structure and musicality and outperforms other state-of-the-art methods by 0.51 on average.
- Score: 26.515527387450636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep learning has revolutionized music generation, existing methods
for structured melody generation follow an end-to-end left-to-right
note-by-note generative paradigm and treat each note equally. Here, we present
WuYun, a knowledge-enhanced deep learning architecture for improving the
structure of generated melodies, which first generates the most structurally
important notes to construct a melodic skeleton and subsequently infills it
with decorative notes to form a full-fledged melody. Specifically, we
use music domain knowledge to extract melodic skeletons and employ sequence
learning to reconstruct them, which serve as additional knowledge to provide
auxiliary guidance for the melody generation process. We demonstrate that WuYun
can generate melodies with better long-term structure and musicality and
outperforms other state-of-the-art methods by 0.51 on average on all subjective
evaluation metrics. Our study provides a multidisciplinary lens to design
melodic hierarchical structures and bridge the gap between data-driven and
knowledge-based approaches for numerous music generation tasks.
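The two-stage idea described in the abstract, extracting a melodic skeleton with domain-knowledge rules and then infilling it into a full melody, can be sketched roughly as follows. The `Note` type, the strong-beat heuristic, and the pass-through `infill` stub are illustrative assumptions for this sketch, not the paper's actual extraction rules or sequence model:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Note:
    pitch: int       # MIDI pitch number
    onset: float     # onset position in beats
    duration: float  # length in beats

def extract_skeleton(melody: List[Note], beats_per_bar: int = 4) -> List[Note]:
    """Rule-based skeleton extraction: keep notes falling on metrically
    strong positions (downbeat and mid-bar). A stand-in for the paper's
    music-domain-knowledge rules."""
    strong_positions = {0.0, beats_per_bar / 2}
    return [n for n in melody if (n.onset % beats_per_bar) in strong_positions]

def infill(skeleton: List[Note]) -> List[Note]:
    """Second stage placeholder: in WuYun, a trained sequence model would
    insert decorative notes between skeleton notes; here we simply return
    the skeleton in temporal order."""
    return sorted(skeleton, key=lambda n: n.onset)

# Toy one-bar melody: only the notes on beats 0 and 2 survive as skeleton.
melody = [Note(60, 0.0, 1.0), Note(62, 1.0, 0.5),
          Note(64, 2.0, 1.0), Note(65, 3.5, 0.5)]
skeleton = extract_skeleton(melody)
full = infill(skeleton)
```

The point of the sketch is the pipeline shape: structurally important notes are committed first, and the harder note-by-note generation is conditioned on that fixed scaffold rather than proceeding purely left to right.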
Related papers
- Structure-informed Positional Encoding for Music Generation [0.0]
We propose a structure-informed positional encoding framework for music generation with Transformers.
We test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation.
Our methods improve the melodic and structural consistency of the generated pieces.
arXiv Detail & Related papers (2024-02-20T13:41:35Z)
- MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation [39.892059799407434]
MelodyGLM is a multi-task pre-training framework for generating melodies with long-term structure.
We have constructed a large-scale symbolic melody dataset, MelodyNet, containing more than 0.4 million melody pieces.
arXiv Detail & Related papers (2023-09-19T16:34:24Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks [146.59245563763065]
MeloForm is a system that generates melody with musical form using expert systems and neural networks.
It can support various kinds of forms, such as verse and chorus form, rondo form, variational form, sonata form, etc.
arXiv Detail & Related papers (2022-08-30T15:44:15Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- Structure-Enhanced Pop Music Generation via Harmony-Aware Learning [20.06867705303102]
We propose to leverage harmony-aware learning for structure-enhanced pop music generation.
Results of subjective and objective evaluations demonstrate that Harmony-Aware Hierarchical Music Transformer (HAT) significantly improves the quality of generated music.
arXiv Detail & Related papers (2021-09-14T05:04:13Z)
- Controllable deep melody generation via hierarchical music structure representation [14.891975420982511]
MusicFrameworks is a hierarchical music structure representation and a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z)
- Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
- Music Generation with Temporal Structure Augmentation [0.0]
The proposed method augments a connectionist generation model with count-down to song conclusion and meter markers as extra input features.
An RNN architecture with LSTM cells is trained on the Nottingham folk music dataset in a supervised sequence learning setup.
Experiments show an improved prediction performance for both types of annotation.
arXiv Detail & Related papers (2020-04-21T19:19:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.