Notochord: a Flexible Probabilistic Model for Real-Time MIDI Performance
- URL: http://arxiv.org/abs/2403.12000v1
- Date: Mon, 18 Mar 2024 17:35:02 GMT
- Title: Notochord: a Flexible Probabilistic Model for Real-Time MIDI Performance
- Authors: Victor Shepardson, Jack Armitage, Thor Magnusson,
- Abstract summary: Notochord is a deep probabilistic model for sequences of structured events.
It can generate polyphonic and multi-track MIDI, and respond to inputs with latency below ten milliseconds.
- Score: 0.8192907805418583
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep learning-based probabilistic models of musical data are producing increasingly realistic results and promise to enter creative workflows of many kinds. Yet they have been little-studied in a performance setting, where the results of user actions typically ought to feel instantaneous. To enable such study, we designed Notochord, a deep probabilistic model for sequences of structured events, and trained an instance of it on the Lakh MIDI dataset. Our probabilistic formulation allows interpretable interventions at a sub-event level, which enables one model to act as a backbone for diverse interactive musical functions including steerable generation, harmonization, machine improvisation, and likelihood-based interfaces. Notochord can generate polyphonic and multi-track MIDI, and respond to inputs with latency below ten milliseconds. Training code, model checkpoints and interactive examples are provided as open source software.
Related papers
- Expressive MIDI-format Piano Performance Generation [4.549093083765949]
This work presents a generative neural network that's able to generate expressive piano performance in MIDI format.
The musical expressivity is reflected by vivid micro-timing, rich polyphonic texture, varied dynamics, and the sustain pedal effects.
arXiv Detail & Related papers (2024-08-01T20:36:37Z) - Generative Pre-training for Speech with Flow Matching [81.59952572752248]
We pre-trained a generative model, named SpeechFlow, on 60k hours of untranscribed speech with Flow Matching and masked conditions.
Experiment results show the pre-trained generative model can be fine-tuned with task-specific data to match or surpass existing expert models on speech enhancement, separation, and synthesis.
arXiv Detail & Related papers (2023-10-25T03:40:50Z) - Anticipatory Music Transformer [60.15347393822849]
We introduce anticipation: a method for constructing a controllable generative model of a temporal point process.
We focus on infilling control tasks, whereby the controls are a subset of the events themselves.
We train anticipatory infilling models using the large and diverse Lakh MIDI music dataset.
arXiv Detail & Related papers (2023-06-14T16:27:53Z) - Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music
Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z) - Comparision Of Adversarial And Non-Adversarial LSTM Music Generative
Models [2.569647910019739]
This work implements and compares adversarial and non-adversarial training of recurrent neural network music composers on MIDI data.
The evaluation indicates that adversarial training produces more aesthetically pleasing music.
arXiv Detail & Related papers (2022-11-01T20:23:49Z) - Multi-instrument Music Synthesis with Spectrogram Diffusion [19.81982315173444]
We focus on a middle ground of neural synthesizers that can generate audio from MIDI sequences with arbitrary combinations of instruments in realtime.
We use a simple two-stage process: MIDI to spectrograms with an encoder-decoder Transformer, then spectrograms to audio with a generative adversarial network (GAN) spectrogram inverter.
We find this to be a promising first step towards interactive and expressive neural synthesis for arbitrary combinations of instruments and notes.
arXiv Detail & Related papers (2022-06-11T03:26:15Z) - MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical
Modeling [6.256118777336895]
Musical expression requires control of both what notes are played, and how they are performed.
We introduce MIDI-DDSP, a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control.
We demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and as a complete system, generate realistic audio from a novel note sequence.
arXiv Detail & Related papers (2021-12-17T04:15:42Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Score-informed Networks for Music Performance Assessment [64.12728872707446]
Deep neural network-based methods incorporating score information into MPA models have not yet been investigated.
We introduce three different models capable of score-informed performance assessment.
arXiv Detail & Related papers (2020-08-01T07:46:24Z) - Generative Modelling for Controllable Audio Synthesis of Expressive
Piano Performance [6.531546527140474]
controllable neural audio synthesizer based on Gaussian Mixture Variational Autoencoders (GM-VAE)
We demonstrate how the model is able to apply fine-grained style morphing over the course of the audio.
arXiv Detail & Related papers (2020-06-16T12:54:41Z) - RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement
Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.