The Piano Inpainting Application
- URL: http://arxiv.org/abs/2107.05944v1
- Date: Tue, 13 Jul 2021 09:33:11 GMT
- Title: The Piano Inpainting Application
- Authors: Gaëtan Hadjeres and Léopold Crestel
- Abstract summary: Generative algorithms are still not widely used by artists due to the limited control they offer, prohibitive inference times, or the lack of integration within musicians' workflows.
In this work, we present the Piano Inpainting Application (PIA), a generative model focused on inpainting piano performances.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoregressive models are now capable of generating high-quality minute-long
expressive MIDI piano performances. Even though this progress suggests new
tools to assist music composition, we observe that generative algorithms are
still not widely used by artists due to the limited control they offer,
prohibitive inference times or the lack of integration within musicians'
workflows. In this work, we present the Piano Inpainting Application (PIA), a
generative model focused on inpainting piano performances, as we believe that
this elementary operation (restoring missing parts of a piano performance)
encourages human-machine interaction and opens up new ways to approach music
composition. Our approach relies on an encoder-decoder Linear Transformer
architecture trained on a novel representation for MIDI piano performances
termed Structured MIDI Encoding. By uncovering an interesting synergy between
Linear Transformers and our inpainting task, we are able to efficiently inpaint
contiguous regions of a piano performance, which makes our model suitable for
interactive and responsive A.I.-assisted composition. Finally, we introduce our
freely-available Ableton Live PIA plugin, which allows musicians to smoothly
generate or modify any MIDI clip using PIA within a widely-used professional
Digital Audio Workstation.
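As a rough illustration of the token-based representation and masking setup the abstract describes, the sketch below encodes a few piano notes as tokens and blanks out a contiguous span for a model to restore. The field names, token layout, and quantization are illustrative assumptions, not the paper's actual Structured MIDI Encoding.

```python
# Illustrative sketch only: PIA's actual Structured MIDI Encoding is defined in
# the paper; the token layout and quantization below are simplifying assumptions.
from dataclasses import dataclass
from typing import List

@dataclass
class Note:
    pitch: int         # MIDI pitch (0-127)
    velocity: int      # MIDI velocity (0-127)
    duration: float    # seconds (assumed quantization)
    time_shift: float  # seconds since the previous note's onset (assumed)

def encode(notes: List[Note]) -> List[str]:
    """Flatten each note into a fixed-size group of tokens (assumed layout)."""
    tokens = []
    for n in notes:
        tokens += [f"PITCH_{n.pitch}", f"VEL_{n.velocity}",
                   f"DUR_{n.duration:.2f}", f"SHIFT_{n.time_shift:.2f}"]
    return tokens

def mask_span(tokens: List[str], start: int, end: int) -> List[str]:
    """Blank out a contiguous span; an inpainting model would restore it from
    the surrounding past and future context."""
    return tokens[:start] + ["<MASK>"] * (end - start) + tokens[end:]

performance = [Note(60, 80, 0.50, 0.00), Note(64, 75, 0.50, 0.25), Note(67, 70, 1.00, 0.25)]
tokens = encode(performance)
print(mask_span(tokens, 4, 8))  # hide the second note and ask the model to infill it
```

In the system described in the abstract, an encoder-decoder Linear Transformer consumes the unmasked context and generates the tokens of the masked region.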
Related papers
- Music Proofreading with RefinPaint: Where and How to Modify Compositions given Context [1.0650780147044159]
RefinPaint is an iterative technique that improves the sampling process by identifying weaker musical elements with a feedback model.
Experimental results suggest RefinPaint's effectiveness in inpainting and proofreading tasks.
arXiv Detail & Related papers (2024-07-12T08:52:27Z)
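The RefinPaint entry above describes an iterative loop in which a feedback model flags weak elements and the sampler revises them. A minimal, hypothetical sketch of such a loop follows; the stand-in models below are random placeholders, not the paper's actual components.

```python
# Hypothetical sketch of a "flag weak spots, then resample" loop in the spirit
# of RefinPaint; the feedback and resampling models below are random stand-ins.
import random
from typing import Callable, List

def refine(tokens: List[int],
           feedback_model: Callable[[List[int]], List[float]],
           resample: Callable[[List[int], List[int]], List[int]],
           rounds: int = 3,
           threshold: float = 0.5) -> List[int]:
    """Repeatedly flag low-confidence positions and let the generator redo them."""
    for _ in range(rounds):
        scores = feedback_model(tokens)                      # per-token confidence
        weak = [i for i, s in enumerate(scores) if s < threshold]
        if not weak:                                         # nothing left to fix
            break
        tokens = resample(tokens, weak)                      # regenerate flagged positions
    return tokens

def dummy_feedback(tokens: List[int]) -> List[float]:
    return [random.random() for _ in tokens]

def dummy_resample(tokens: List[int], positions: List[int]) -> List[int]:
    return [random.randint(0, 127) if i in positions else t for i, t in enumerate(tokens)]

print(refine([60, 62, 64, 65], dummy_feedback, dummy_resample))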
- MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
arXiv Detail & Related papers (2024-07-05T08:08:22Z)
- Pictures Of MIDI: Controlled Music Generation via Graphical Prompts for Image-Based Diffusion Inpainting [0.0]
This study explores a user-friendly graphical interface enabling the drawing of masked regions for inpainting by an Hourglass Diffusion Transformer (HDiT) model trained on MIDI piano roll images.
We demonstrate that, in addition to inpainting melodies, accompaniment, and continuations, repainting can help increase note density, yielding musical structures that closely match user specifications.
arXiv Detail & Related papers (2024-07-01T17:43:45Z)
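The core idea of the Pictures Of MIDI entry above is to rasterize MIDI as a piano-roll image and mask a region for an image diffusion model to inpaint. The sketch below shows only that rasterize-and-mask step, with assumed shapes and note tuples; the HDiT model and the graphical interface are not reproduced.

```python
# Minimal sketch, assuming a binary piano-roll raster of shape (pitch, time);
# the paper's HDiT pipeline defines its own image format and masking interface.
import numpy as np

def piano_roll(notes, n_pitches: int = 128, n_steps: int = 64) -> np.ndarray:
    """notes: list of (pitch, onset_step, length_steps) tuples (illustrative)."""
    roll = np.zeros((n_pitches, n_steps), dtype=np.float32)
    for pitch, onset, length in notes:
        roll[pitch, onset:onset + length] = 1.0
    return roll

def mask_region(roll: np.ndarray, pitch_range, time_range):
    """Zero out a rectangle and return (masked_roll, mask); a diffusion model
    would be asked to fill in the masked rectangle."""
    mask = np.zeros_like(roll)
    mask[pitch_range[0]:pitch_range[1], time_range[0]:time_range[1]] = 1.0
    return roll * (1.0 - mask), mask

roll = piano_roll([(60, 0, 8), (64, 8, 8), (67, 16, 16)])
masked, mask = mask_region(roll, pitch_range=(55, 70), time_range=(8, 24))
print(masked.shape, int(mask.sum()))  # (128, 64) and the size of the masked area
```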
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing via Content-based Controls [6.176747724853209]
Large Language Models (LLMs) have shown promise in generating high-quality music, but their focus on autoregressive generation limits their utility in music editing tasks.
We propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme.
Our method integrates frame-level content-based controls, facilitating track-conditioned music refinement and score-conditioned music arrangement.
arXiv Detail & Related papers (2024-02-14T19:00:01Z)
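The adapter-based approach summarized above (Arrange, Inpaint, and Refine) attaches small trainable modules to a frozen backbone and feeds in frame-level controls. The bottleneck adapter below is one generic way to wire that up in PyTorch; the layer sizes and the additive injection of controls are assumptions, not the paper's heterogeneous adapter design or its masking training scheme.

```python
# Generic bottleneck-adapter sketch in PyTorch; an illustration only, not the
# paper's parameter-efficient heterogeneous adapter.
import torch
import torch.nn as nn

class ControlAdapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)   # small trainable down-projection
        self.up = nn.Linear(bottleneck, dim)     # project back to the backbone width

    def forward(self, hidden: torch.Tensor, control: torch.Tensor) -> torch.Tensor:
        # Inject frame-level control features additively before the bottleneck
        # (one plausible wiring, assumed for illustration), then add a residual.
        return hidden + self.up(torch.relu(self.down(hidden + control)))

hidden = torch.randn(1, 100, 512)   # frozen-backbone activations: (batch, frames, dim)
control = torch.randn(1, 100, 512)  # frame-level content-based controls, same shape
out = ControlAdapter(512)(hidden, control)
print(out.shape)  # torch.Size([1, 100, 512])
```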
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations from all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
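The MARBLE entry above organizes MIR tasks into four hierarchy levels. A plain-data sketch of such a taxonomy is shown below; only the four level names come from the abstract, while the example task names are hypothetical placeholders.

```python
# Hypothetical sketch of a four-level task taxonomy in the spirit of MARBLE;
# the level names come from the abstract, the example tasks are placeholders.
TAXONOMY = {
    "acoustic": ["example_task_a"],                # signal-level properties
    "performance": ["example_task_b"],             # how a piece is played
    "score": ["example_task_c"],                   # symbolic/score content
    "high_level_description": ["example_task_d"],  # semantic descriptions
}

def tasks_for(level: str):
    """Look up the benchmark tasks registered under a taxonomy level."""
    return TAXONOMY.get(level, [])

print(tasks_for("score"))
```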
- RMSSinger: Realistic-Music-Score based Singing Voice Synthesis [56.51475521778443]
RMS-SVS aims to generate high-quality singing voices given realistic music scores with different note types.
We propose RMSSinger, the first RMS-SVS method, which takes realistic music scores as input.
In RMSSinger, we introduce word-level modeling to avoid the time-consuming phoneme duration annotation and the complicated phoneme-level mel-note alignment.
arXiv Detail & Related papers (2023-05-18T03:57:51Z)
- Flat latent manifolds for music improvisation between human and machine [9.571383193449648]
We consider a music-generating algorithm as a counterpart to a human musician, in a setting where reciprocal improvisation is to lead to new experiences.
In the learned model, we generate novel musical sequences by quantification in latent space.
We provide empirical evidence for our method via a set of experiments on music and we deploy our model for an interactive jam session with a professional drummer.
arXiv Detail & Related papers (2022-02-23T09:00:17Z)
- Let's Play Music: Audio-driven Performance Video Generation [58.77609661515749]
We propose a new task named Audio-driven Performance Video Generation (APVG).
APVG aims to synthesize the video of a person playing a certain instrument guided by a given music audio clip.
arXiv Detail & Related papers (2020-11-05T03:13:46Z)
- Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip about people playing musical instruments.
We first identify two key intermediate representations for a successful video to music generator: body keypoints from videos and MIDI events from audio recordings.
We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)
- Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance [6.531546527140474]
The authors propose a controllable neural audio synthesizer based on Gaussian Mixture Variational Autoencoders (GM-VAE).
We demonstrate how the model is able to apply fine-grained style morphing over the course of the audio.
arXiv Detail & Related papers (2020-06-16T12:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.