Using a Bi-directional LSTM Model with Attention Mechanism trained on MIDI Data for Generating Unique Music
- URL: http://arxiv.org/abs/2011.00773v1
- Date: Mon, 2 Nov 2020 06:43:28 GMT
- Title: Using a Bi-directional LSTM Model with Attention Mechanism trained on MIDI Data for Generating Unique Music
- Authors: Ashish Ranjan, Varun Nagesh Jolly Behera, Motahar Reza
- Abstract summary: This paper proposes a bi-directional LSTM model with an attention mechanism capable of generating music of a similar type from MIDI data.
The music generated by the model follows the theme/style of the music the model is trained on.
- Score: 0.25559196081940677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating music is an interesting and challenging problem in the
field of machine learning. Mimicking human creativity has become popular in
recent years, especially in the fields of computer vision and image
processing. With the advent of GANs, it is possible to generate new images
that resemble the training data, but the same approach does not carry over
directly to music, which has an additional temporal dimension. It is
therefore necessary to understand how music is represented in digital form.
Models that perform this generative task typically learn and generate in a
high-level representation such as MIDI (Musical Instrument Digital Interface)
or musical scores. This paper proposes a bi-directional LSTM (Long Short-Term
Memory) model with an attention mechanism capable of generating music of a
similar type from MIDI data. The music generated by the model follows the
theme/style of the music it is trained on. Moreover, owing to the nature of
MIDI, parameters such as tempo and instrument can be defined and changed
after generation.
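The abstract does not include an implementation, but the architecture it names (a bi-directional LSTM with an attention mechanism over note sequences extracted from MIDI files) can be sketched roughly as below. This is a minimal, hypothetical Keras sketch under assumed settings: a 128-token pitch vocabulary, a 64-note input window, and a next-note prediction objective; none of these choices are confirmed by the paper.

```python
# Minimal sketch (assumptions, not the authors' code): a bi-directional LSTM
# with dot-product attention over integer-encoded MIDI note sequences,
# trained to predict the next note token.
import numpy as np
from tensorflow.keras import layers, Model

VOCAB_SIZE = 128   # assumed: one token per MIDI pitch (0-127)
SEQ_LEN = 64       # assumed: length of the input note window

def build_model(vocab_size=VOCAB_SIZE, seq_len=SEQ_LEN,
                embed_dim=96, lstm_units=256):
    notes = layers.Input(shape=(seq_len,), dtype="int32", name="note_ids")
    x = layers.Embedding(vocab_size, embed_dim)(notes)

    # Bi-directional LSTM reads the note window forwards and backwards.
    h = layers.Bidirectional(layers.LSTM(lstm_units, return_sequences=True))(x)

    # Luong-style dot-product attention over the LSTM outputs
    # (the sequence attends to itself), then pool to a single context vector.
    context = layers.Attention()([h, h])
    pooled = layers.GlobalAveragePooling1D()(context)

    # Predict the next note token given the attended context.
    out = layers.Dense(vocab_size, activation="softmax", name="next_note")(pooled)
    model = Model(notes, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

if __name__ == "__main__":
    model = build_model()
    # Toy arrays standing in for note windows extracted from MIDI files.
    X = np.random.randint(0, VOCAB_SIZE, size=(512, SEQ_LEN))
    y = np.random.randint(0, VOCAB_SIZE, size=(512,))
    model.fit(X, y, epochs=1, batch_size=64)
```

Generation would then proceed by repeatedly sampling the predicted next note, appending it to the input window, and writing the sampled notes back to a MIDI file, where, as the abstract notes, tempo, instrument, and other parameters can still be set after generation.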
Related papers
- Do Music Generation Models Encode Music Theory? [10.987131058422742]
We introduce SynTheory, a synthetic MIDI and audio music theory dataset consisting of tempo, time signature, note, interval, scale, chord, and chord progression concepts.
We then propose a framework to probe for these music theory concepts in music foundation models and assess how strongly they encode these concepts within their internal representations.
Our findings suggest that music theory concepts are discernible within foundation models and that the degree to which they are detectable varies by model size and layer.
arXiv Detail & Related papers (2024-10-01T17:06:30Z)
- Expressive MIDI-format Piano Performance Generation [4.549093083765949]
This work presents a generative neural network that's able to generate expressive piano performance in MIDI format.
The musical expressivity is reflected by vivid micro-timing, rich polyphonic texture, varied dynamics, and the sustain pedal effects.
arXiv Detail & Related papers (2024-08-01T20:36:37Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model [32.801213106782335]
We develop a generative music AI framework, Video2Music, that can generate music to match a provided video.
In a thorough experiment, we show that our proposed framework can generate music that matches the video content in terms of emotion.
arXiv Detail & Related papers (2023-11-02T03:33:00Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen consists of a single-stage transformer LM together with efficient token interleaving patterns; a sketch of one such interleaving pattern appears after this list.
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- Learning to Generate Music With Sentiment [1.8275108630751844]
This paper presents a generative Deep Learning model that can be directed to compose music with a given sentiment.
Besides music generation, the same model can be used for sentiment analysis of symbolic music.
arXiv Detail & Related papers (2021-03-09T03:16:52Z)
- PopMAG: Pop Music Accompaniment Generation [190.09996798215738]
We propose a novel MUlti-track MIDI representation (MuMIDI) which enables simultaneous multi-track generation in a single sequence.
MuMIDI enlarges the sequence length and brings the new challenge of long-term music modeling.
We call our system for pop music accompaniment generation PopMAG.
arXiv Detail & Related papers (2020-08-18T02:28:36Z)
- Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip of people playing musical instruments.
We first identify two key intermediate representations for a successful video-to-music generator: body keypoints from videos and MIDI events from audio recordings.
We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
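The MusicGen entry above mentions efficient token interleaving patterns over several streams of compressed discrete tokens. As a hedged illustration only, the sketch below shows one possible "delay" interleaving scheme, where codebook stream k is shifted right by k steps so all streams can be emitted in a single autoregressive pass; the number of codebooks, the padding token, and the array layout are assumptions for the example, not MusicGen's released implementation.

```python
# Illustrative sketch of a "delay" interleaving pattern for K parallel
# codebook token streams; constants and layout are assumptions.
import numpy as np

PAD = -1  # assumed placeholder id for positions with no token

def delay_interleave(codes: np.ndarray) -> np.ndarray:
    """codes: (K, T) int array of codebook tokens -> (K, T + K - 1) array."""
    K, T = codes.shape
    out = np.full((K, T + K - 1), PAD, dtype=codes.dtype)
    for k in range(K):
        out[k, k:k + T] = codes[k]  # shift stream k right by k steps
    return out

def delay_deinterleave(interleaved: np.ndarray, T: int) -> np.ndarray:
    """Inverse of delay_interleave: recover the original (K, T) streams."""
    K = interleaved.shape[0]
    return np.stack([interleaved[k, k:k + T] for k in range(K)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codes = rng.integers(0, 1024, size=(4, 8))  # 4 codebooks, 8 frames
    inter = delay_interleave(codes)
    assert np.array_equal(delay_deinterleave(inter, 8), codes)
```

Roughly, the shift lets later codebooks of a given frame be generated after the earlier codebooks of that frame, while still producing one token per stream at every step.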
This list is automatically generated from the titles and abstracts of the papers on this site.