Personalized Popular Music Generation Using Imitation and Structure
- URL: http://arxiv.org/abs/2105.04709v1
- Date: Mon, 10 May 2021 23:43:00 GMT
- Title: Personalized Popular Music Generation Using Imitation and Structure
- Authors: Shuqi Dai, Xichu Ma, Ye Wang, Roger B. Dannenberg
- Abstract summary: We propose a statistical machine learning model that is able to capture and imitate the structure, melody, chord, and bass style from a given example seed song.
An evaluation using 10 pop songs shows that our new representations and methods are able to create high-quality stylistic music.
- Score: 1.971709238332434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many approaches to music generation have been presented recently. While
stylistic music generation using deep learning techniques has become
mainstream, these models still struggle to generate music with high musicality,
different levels of music structure, and controllability. In addition, more
application scenarios such as music therapy require imitating more specific
musical styles from a few given music examples, rather than capturing the
overall genre style of a large data corpus. To address requirements that
challenge current deep learning methods, we propose a statistical machine
learning model that is able to capture and imitate the structure, melody,
chord, and bass style from a given example seed song. An evaluation using 10
pop songs shows that our new representations and methods are able to create
high-quality stylistic music that is similar to a given input song. We also
discuss potential uses of our approach in music evaluation and music therapy.
Related papers
- MusicFlow: Cascaded Flow Matching for Text Guided Music Generation [53.63948108922333]
MusicFlow is a cascaded text-to-music generation model based on flow matching.
We leverage masked prediction as the training objective, enabling the model to generalize to other tasks such as music infilling and continuation.
arXiv Detail & Related papers (2024-10-27T15:35:41Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
- Evaluating Deep Music Generation Methods Using Data Augmentation [13.72212417973239]
We focus on a homogeneous, objective framework for evaluating samples of algorithmically generated music.
We do not seek to assess the musical merit of generated music, but instead explore whether generated samples contain meaningful information pertaining to emotion or mood/theme.
arXiv Detail & Related papers (2021-12-31T20:35:46Z)
- Multi-task Learning with Metadata for Music Mood Classification [0.0]
Mood recognition is an important problem in music informatics and has key applications in music discovery and recommendation.
We propose a multi-task learning approach in which a shared model is simultaneously trained for mood and metadata prediction tasks.
Applying our technique to existing state-of-the-art convolutional neural networks for mood classification consistently improves their performance.
arXiv Detail & Related papers (2021-10-10T11:36:34Z)
- Controllable deep melody generation via hierarchical music structure representation [14.891975420982511]
MusicFrameworks is a hierarchical music structure representation and a multi-step generative process to create a full-length melody.
To generate melody in each phrase, we generate rhythm and basic melody using two separate transformer-based networks.
To customize or add variety, one can alter chords, basic melody, and rhythm structure in the music frameworks, letting our networks generate the melody accordingly.
arXiv Detail & Related papers (2021-09-02T01:31:14Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to the understanding of music from the symbolic data.
MusicBERT is a large-scale pre-trained model for music understanding.
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions [10.179835761549471]
This paper attempts to provide an overview of various composition tasks under different music generation levels using deep learning.
In addition, we summarize datasets suitable for diverse tasks, discuss the music representations, the evaluation methods as well as the challenges under different levels, and finally point out several future directions.
arXiv Detail & Related papers (2020-11-13T08:01:20Z)
- Incorporating Music Knowledge in Continual Dataset Augmentation for Music Generation [69.06413031969674]
Aug-Gen is a method of dataset augmentation for any music generation system trained on a resource-constrained domain.
We apply Aug-Gen to Transformer-based chorale generation in the style of J.S. Bach, and show that this allows for longer training and results in better generative output.
arXiv Detail & Related papers (2020-06-23T21:06:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.