Setting the rhythm scene: deep learning-based drum loop generation from
  arbitrary language cues
- URL: http://arxiv.org/abs/2209.10016v1
- Date: Tue, 20 Sep 2022 21:53:35 GMT
- Title: Setting the rhythm scene: deep learning-based drum loop generation from
  arbitrary language cues
- Authors: Ignacio J. Tripodi
- Abstract summary: We present a novel method that generates 2 bars of a 4-piece drum pattern that embodies the "mood" of a language cue.
We envision this tool as a composition aid for electronic music and audiovisual soundtrack production, or an improvisation tool for live performance.
In order to produce the training samples for this model, besides manual annotation of the "scene" or "mood" terms, we have designed a novel method to extract the consensus drum track of any song.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract:   Generative artificial intelligence models can be a valuable aid to music
composition and live performance, both to aid the professional musician and to
help democratize the music creation process for hobbyists. Here we present a
novel method that, given an English word or phrase, generates 2 bars of a
4-piece drum pattern that embodies the "mood" of the given language cue, or
that could be used for an audiovisual scene described by the language cue. We
envision this tool as a composition aid for electronic music and audiovisual
soundtrack production, or an improvisation tool for live performance. In order
to produce the training samples for this model, besides manual annotation of
the "scene" or "mood" terms, we have designed a novel method to extract the
consensus drum track of any song. This consists of a 2-bar, 4-piece drum
pattern that represents the main percussive motif of a song, which could be
imported into any music loop device or live looping software. These two key
components (drum pattern generation from a generalizable input, and consensus
percussion extraction) present a novel approach to computer-aided composition
and provide a stepping stone for more comprehensive rhythm generation.

Related papers
- Apollo: An Interactive Environment for Generating Symbolic Musical Phrases using Corpus-based Style Imitation [5.649205001069577]
 We introduce Apollo, an interactive music application for generating symbolic phrases of conventional western music.
The system makes it possible for music artists and researchers to generate new musical phrases in the style of the proposed corpus.
The generated symbolic music materials, encoded in the MIDI format, can be exported or streamed for various purposes.
 arXiv  Detail & Related papers  (2025-04-18T19:53:51Z)
- Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew's Treatise [4.9485163144728235]
 This work presents a novel method for composing and improvising music inspired by Cornelius Cardew's Treatise.
By leveraging OpenAI's ChatGPT to interpret the abstract visual elements of Treatise, we convert these graphical images into descriptive textual prompts.
These prompts are then input into MusicLDM, a pre-trained latent diffusion model designed for music generation.
 arXiv  Detail & Related papers  (2024-12-12T05:08:36Z)
- MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization [52.498942604622165]
 This paper presents MuVi, a framework to generate music that aligns with video content.
MuVi analyzes video content through a specially designed visual adaptor to extract contextually and temporally relevant features.
We show that MuVi demonstrates superior performance in both audio quality and temporal synchronization.
 arXiv  Detail & Related papers  (2024-10-16T18:44:56Z)
- SongCreator: Lyrics-based Universal Song Generation [53.248473603201916]
 SongCreator is a song-generation system designed to tackle the challenge of generating songs with both vocals and accompaniment given lyrics.
The model features two novel designs: a meticulously designed dual-sequence language model (DSLM) to capture the information of vocals and accompaniment for song generation, and a series of attention mask strategies for DSLM.
Experiments demonstrate the effectiveness of SongCreator by achieving state-of-the-art or competitive performances on all eight tasks.
 arXiv  Detail & Related papers  (2024-09-09T19:37:07Z)
- Subtractive Training for Music Stem Insertion using Latent Diffusion Models [35.91945598575059]
 We present Subtractive Training, a method for synthesizing individual musical instrument stems given other instruments as context.
Our results demonstrate Subtractive Training's efficacy in creating authentic drum stems that seamlessly blend with the existing tracks.
We extend this technique to MIDI formats, successfully generating compatible bass, drum, and guitar parts for incomplete arrangements.
 arXiv  Detail & Related papers  (2024-06-27T16:59:14Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
 We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
 arXiv  Detail & Related papers  (2024-06-07T06:38:59Z)
- SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition [82.38021790213752]
 SongComposer is a music-specialized large language model (LLM).
It integrates the capability of simultaneously composing melodies into LLMs by leveraging three key innovations.
It outperforms advanced LLMs in tasks such as lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation.
We will release SongCompose, a large-scale dataset for training, containing paired lyrics and melodies in Chinese and English.
 arXiv  Detail & Related papers  (2024-02-27T16:15:28Z)
- Language-Guided Music Recommendation for Video via Prompt Analogies [35.48998901411509]
 We propose a method to recommend music for an input video while allowing a user to guide music selection with free-form natural language.
Existing music video datasets provide the needed (video, music) training pairs, but lack text descriptions of the music.
 arXiv  Detail & Related papers  (2023-06-15T17:58:01Z)
- Noise2Music: Text-conditioned Music Generation with Diffusion Models [73.74580231353684]
 We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts.
We find that the generated audio is able to faithfully reflect key elements of the text prompt such as genre, tempo, instruments, mood, and era.
Pretrained large language models play a key role in this story -- they are used to generate paired text for the audio of the training set and to extract embeddings of the text prompts ingested by the diffusion models.
 arXiv  Detail & Related papers  (2023-02-08T07:27:27Z)
- Generating Coherent Drum Accompaniment With Fills And Improvisations [8.334918207379172]
 We tackle the task of drum pattern generation conditioned on the accompanying music played by four melodic instruments.
We propose a novelty function to capture the extent of improvisation in a bar relative to its neighbors.
We train a model to predict improvisation locations from the melodic accompaniment tracks.
 arXiv  Detail & Related papers  (2022-09-01T08:31:26Z)
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
 Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
 arXiv  Detail & Related papers  (2022-08-11T08:44:47Z)
- Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music [33.679951600368405]
 We study the feasibility of automatic instrumentation -- dynamically assigning instruments to notes in solo music during performance.
In addition to the online, real-time-capable setting for performative use cases, automatic instrumentation can also find applications in assistive composing tools in an offline setting.
We frame the task of part separation as a sequential multi-class classification problem and adopt machine learning to map sequences of notes into sequences of part labels.
 arXiv  Detail & Related papers  (2021-07-13T08:34:44Z)
- Artificial Neural Networks Jamming on the Beat [20.737171876839238]
 The paper presents a large dataset of drum patterns along with corresponding melodies.
By exploring a latent space of drum patterns, one could generate new drum patterns with a given music style.
A simple artificial neural network could be trained to generate melodies corresponding to these drum patterns used as inputs.
 arXiv  Detail & Related papers  (2020-07-13T10:09:20Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
 "Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
 arXiv  Detail & Related papers  (2020-04-20T17:53:46Z)