Multi-Genre Music Transformer -- Composing Full Length Musical Piece
- URL: http://arxiv.org/abs/2301.02385v1
- Date: Fri, 6 Jan 2023 05:27:55 GMT
- Title: Multi-Genre Music Transformer -- Composing Full Length Musical Piece
- Authors: Abhinav Kaushal Keshari
- Abstract summary: The objective of the project is to implement a Multi-Genre Transformer that learns to produce music pieces through a more adaptive learning process.
We built a multi-genre compound-word dataset and trained a linear transformer on it.
We call this the Multi-Genre Transformer; it was able to generate full-length new musical pieces that are diverse and comparable to the original tracks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the task of generating music, the artistic factor plays a big role and
poses a great challenge for AI. Previous work involving adversarial training to
produce new music pieces and modeling the compatibility of varied musical
elements (beats, tempo, musical stems) demonstrated strong examples of learning
this task, though it was limited to generating mashups or to learning features
from tempo and key distributions to produce similar patterns. The Compound Word
Transformer was able to cast music generation as a sequence generation
challenge involving musical events defined by compound words. These musical
events give a more accurate description of note progression, chord changes,
harmony, and the artistic factor. The objective of this project is to implement
a Multi-Genre Transformer that learns to produce music pieces through a more
adaptive learning process involving the more challenging task in which the
genre or form of the composition is also considered. We built a multi-genre
compound-word dataset and trained a linear transformer on it. We call this the
Multi-Genre Transformer; it was able to generate full-length new musical pieces
that are diverse and comparable to the original tracks. The model trains 2-5
times faster than the other models discussed.
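The abstract rests on two technical ingredients: compound-word tokens, where each musical event is a tuple of attributes embedded jointly, and a linear transformer, whose O(N) attention plausibly accounts for the reported 2-5x training speedup. The sketch below illustrates both under stated assumptions: the field names, vocabulary sizes, and the added genre field are illustrative rather than the paper's exact configuration, and the attention shown is the non-causal linear form of Katharopoulos et al. (2020), not the authors' code.

```python
# A minimal sketch, assuming PyTorch. Field names, vocabulary sizes, and
# the extra "genre" field are illustrative assumptions, not the paper's
# exact compound-word configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

FIELDS = {  # one vocabulary per compound-word field (sizes assumed)
    "type": 4, "beat": 17, "tempo": 64, "chord": 133,
    "pitch": 128, "duration": 64, "velocity": 32, "genre": 5,
}

class CompoundWordEmbedding(nn.Module):
    """Embed each field of a compound word, concatenate, then project,
    so one token vector describes a whole musical event."""
    def __init__(self, d_model=512, d_field=64):
        super().__init__()
        self.embs = nn.ModuleDict(
            {name: nn.Embedding(size, d_field) for name, size in FIELDS.items()})
        self.proj = nn.Linear(d_field * len(FIELDS), d_model)

    def forward(self, tokens):  # tokens: dict of (batch, seq) index tensors
        parts = [self.embs[name](tokens[name]) for name in FIELDS]
        return self.proj(torch.cat(parts, dim=-1))  # (batch, seq, d_model)

def linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention via the feature map phi(x) = elu(x) + 1
    (Katharopoulos et al., 2020): O(seq_len) instead of O(seq_len^2).
    Autoregressive decoding would use the causal, prefix-sum variant."""
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", k, v)                    # sum_n phi(k_n) v_n^T
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(1)) + eps)  # normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

emb = CompoundWordEmbedding()
tokens = {name: torch.randint(0, size, (2, 16)) for name, size in FIELDS.items()}
x = emb(tokens)                 # (2, 16, 512)
y = linear_attention(x, x, x)   # same shape, linear-time mixing
```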
Related papers
- SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation [75.86473375730392]
SongGen is a fully open-source, single-stage auto-regressive transformer for controllable song generation.
It supports two output modes: mixed mode, which generates a mixture of vocals and accompaniment directly, and dual-track mode, which synthesizes them separately.
To foster community engagement and future research, we will release our model weights, training code, annotated data, and preprocessing pipeline.
arXiv Detail & Related papers (2025-02-18T18:52:21Z)
- ImprovNet: Generating Controllable Musical Improvisations with Iterative Corruption Refinement [6.873190001575463]
ImprovNet is a transformer-based architecture that generates expressive and controllable musical improvisations.
It can perform cross-genre and intra-genre improvisations, harmonize melodies with genre-specific styles, and execute short prompt continuation and infilling tasks.
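The corruption-refinement loop lends itself to a compact sketch. Below is one hedged reading of the idea in Python: a fragment of the piece is repeatedly corrupted and re-predicted under a target genre. The span-masking `corrupt` function and the `model.refine` API are hypothetical placeholders, not ImprovNet's actual interface.

```python
# A hedged sketch of iterative corruption refinement; `model.refine`
# and the span-masking corruption are illustrative placeholders, not
# ImprovNet's actual corruption functions or schedule.
import random

def corrupt(tokens, frac=0.3):
    """Mask one random contiguous span (a single corruption choice)."""
    n = len(tokens)
    span = max(1, int(frac * n))
    start = random.randrange(0, n - span + 1)
    return tokens[:start] + ["<MASK>"] * span + tokens[start + span:]

def improvise(model, tokens, target_genre, steps=10):
    """Gradually steer a piece toward `target_genre` by repeatedly
    corrupting it and letting the model repair the damage."""
    for _ in range(steps):
        tokens = model.refine(corrupt(tokens), genre=target_genre)  # assumed API
    return tokens
```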
arXiv Detail & Related papers (2025-02-06T21:45:38Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
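The measure-alignment problem SMT-ABC addresses can be pictured with a small sketch: instead of serializing whole tracks one after another, corresponding bars of every voice are emitted together, so the model never has to match measure i of one track with measure i of another across a long distance. The function below is an assumed illustration of that interleaving; the actual SMT-ABC markup differs.

```python
# Illustrative only: bar-wise interleaving of multiple voices so that
# simultaneous measures sit next to each other in the token stream.
# The '&' voice-overlay join is borrowed from plain ABC notation; the
# real SMT-ABC format is not reproduced here.
def synchronize_tracks(tracks):
    """tracks: list of voices, each a list of ABC bar strings."""
    assert len({len(t) for t in tracks}) == 1, "voices need equal bar counts"
    lines = []
    for bars in zip(*tracks):               # bar i from every voice at once
        lines.append(" & ".join(bars) + " |")
    return "\n".join(lines)

print(synchronize_tracks([["C D E F", "G A B c"],
                          ["C,2 E,2", "G,2 C2"]]))
# C D E F & C,2 E,2 |
# G A B c & G,2 C2 |
```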
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen comprises a single-stage transformer LM together with efficient token interleaving patterns.
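The single-stage design hinges on how the K parallel codebook streams from the audio tokenizer are interleaved so that one LM pass can predict them all. The sketch below shows a generic "delay" pattern in that spirit; the padding value and layout are assumptions, not MusicGen's exact scheme.

```python
# A sketch of a "delay" interleaving pattern over K parallel codebook
# streams: stream k is shifted right by k steps so that all K codebooks
# can be predicted jointly, one frame at a time. PAD and the layout are
# illustrative, not MusicGen's exact scheme.
PAD = -1

def delay_interleave(codes):
    """codes: K lists of T token ids -> list of T+K-1 frames of K ids."""
    K, T = len(codes), len(codes[0])
    frames = []
    for t in range(T + K - 1):
        frames.append([codes[k][t - k] if 0 <= t - k < T else PAD
                       for k in range(K)])
    return frames

print(delay_interleave([[1, 2, 3], [4, 5, 6]]))
# [[1, -1], [2, 4], [3, 5], [-1, 6]]
```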
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- Generating music with sentiment using Transformer-GANs [0.0]
We propose a generative model of symbolic music conditioned by data retrieved from human sentiment.
We tackle both problems by employing an efficient linear version of Attention and using a Discriminator.
arXiv Detail & Related papers (2022-12-21T15:59:35Z)
- Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
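For readers unfamiliar with these distances, the sketch below shows a generic entropic optimal transport (Sinkhorn) computation in NumPy. It illustrates the kind of distribution-matching cost involved; it is not MDOT-Net's implementation, and the Gromov-Wasserstein term used for music-dance correspondence is omitted.

```python
# A minimal entropic-OT (Sinkhorn) sketch in NumPy; a generic routine,
# not the paper's code, and without the Gromov-Wasserstein term.
import numpy as np

def sinkhorn_distance(a, b, M, reg=0.1, n_iters=200):
    """a, b: histograms; M: pairwise cost matrix; returns <P, M>."""
    K = np.exp(-M / reg)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):              # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]       # transport plan
    return float((P * M).sum())

a = np.ones(4) / 4
b = np.ones(4) / 4
M = np.abs(np.subtract.outer(np.arange(4), np.arange(4))).astype(float)
print(sinkhorn_distance(a, b, M))  # ~0 for identical uniform histograms
```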
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
- Calliope -- A Polyphonic Music Transformer [9.558051115598657]
We present Calliope, a novel autoencoder model based on Transformers for the efficient modelling of multi-track sequences of polyphonic music.
Experiments show that our model is able to improve the state of the art on musical sequence reconstruction and generation.
arXiv Detail & Related papers (2021-07-08T08:18:57Z)
- MuseMorphose: Full-Song and Fine-Grained Music Style Transfer with Just One Transformer VAE [36.9033909878202]
Transformers and variational autoencoders (VAEs) have been extensively employed for symbolic (e.g., MIDI) domain music generation.
In this paper, we are interested in bringing the two together to construct a single model that exhibits both strengths.
Experiments show that MuseMorphose outperforms recurrent neural network (RNN) based prior art on numerous widely-used metrics for style transfer tasks.
arXiv Detail & Related papers (2021-05-10T03:44:03Z)
- A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions [10.179835761549471]
This paper attempts to provide an overview of various composition tasks under different music generation levels using deep learning.
In addition, we summarize datasets suitable for diverse tasks, discuss the music representations, the evaluation methods as well as the challenges under different levels, and finally point out several future directions.
arXiv Detail & Related papers (2020-11-13T08:01:20Z)
- Incorporating Music Knowledge in Continual Dataset Augmentation for Music Generation [69.06413031969674]
Aug-Gen is a method of dataset augmentation for any music generation system trained on a resource-constrained domain.
We apply Aug-Gen to Transformer-based chorale generation in the style of J.S. Bach, and show that this allows for longer training and results in better generative output.
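Read as an algorithm, Aug-Gen is a generate-filter-augment loop around ordinary training. The sketch below is one hedged rendering: `model`, `dataset`, and `passes_music_checks` are hypothetical placeholders, and the paper's actual selection criterion for Bach-style chorales is more specific.

```python
# A hedged sketch of the Aug-Gen loop; every name here is a placeholder
# standing in for the system's real components.
def passes_music_checks(piece):
    """Stand-in quality filter (e.g., voice-leading or range rules)."""
    return len(piece) > 0  # placeholder criterion

def aug_gen_training(model, dataset, epochs=100, samples_per_epoch=8):
    for _ in range(epochs):
        model.train_one_epoch(dataset)          # assumed API
        for _ in range(samples_per_epoch):
            piece = model.generate()            # assumed API
            if passes_music_checks(piece):      # keep only plausible output
                dataset.append(piece)           # grow the training corpus
    return model
```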
arXiv Detail & Related papers (2020-06-23T21:06:15Z)