SingSong: Generating musical accompaniments from singing
- URL: http://arxiv.org/abs/2301.12662v1
- Date: Mon, 30 Jan 2023 04:53:23 GMT
- Title: SingSong: Generating musical accompaniments from singing
- Authors: Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe
Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil
Zeghidour, Jesse Engel
- Abstract summary: We present SingSong, a system that generates instrumental music to accompany input vocals.
In a pairwise comparison with the same vocal inputs, listeners expressed a significant preference for instrumentals generated by SingSong.
- Score: 35.819589427197464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present SingSong, a system that generates instrumental music to accompany
input vocals, potentially offering musicians and non-musicians alike an
intuitive new way to create music featuring their own voice. To accomplish
this, we build on recent developments in musical source separation and audio
generation. Specifically, we apply a state-of-the-art source separation
algorithm to a large corpus of music audio to produce aligned pairs of vocals
and instrumental sources. Then, we adapt AudioLM (Borsos et al., 2022) -- a
state-of-the-art approach for unconditional audio generation -- to be suitable
for conditional "audio-to-audio" generation tasks, and train it on the
source-separated (vocal, instrumental) pairs. In a pairwise comparison with the
same vocal inputs, listeners expressed a significant preference for
instrumentals generated by SingSong compared to those from a strong retrieval
baseline.
Sound examples at https://g.co/magenta/singsong
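The data-preparation step described in the abstract — running source separation over a music corpus to produce aligned (vocal, instrumental) pairs for conditional training — can be sketched roughly as below. This is a minimal illustration under assumptions, not the paper's implementation: `separate_sources` is a placeholder for a real separation model (the paper applies a state-of-the-art separator), and the 16 kHz sample rate and 10-second window length are assumed for the example.

```python
import numpy as np

def separate_sources(mix: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    # Placeholder for a real vocals/instrumental source-separation model.
    # Here we just split the mixture in half so the sketch is runnable;
    # a real pipeline would call a trained separator instead.
    vocals = mix * 0.5
    instrumental = mix - vocals
    return vocals, instrumental

def make_training_pairs(tracks, window=16000 * 10):
    """Chunk each separated track into aligned fixed-length
    (vocal, instrumental) windows for conditional training."""
    pairs = []
    for mix in tracks:
        vocals, instrumental = separate_sources(mix)
        for start in range(0, len(mix) - window + 1, window):
            pairs.append((vocals[start:start + window],
                          instrumental[start:start + window]))
    return pairs

# Example: two synthetic 30-second mono "tracks" at 16 kHz.
rng = np.random.default_rng(0)
tracks = [rng.standard_normal(16000 * 30) for _ in range(2)]
pairs = make_training_pairs(tracks)
print(len(pairs))  # 3 windows per 30 s track -> 6 aligned pairs
```

Because the vocal and instrumental windows come from the same offsets of the same track, each pair stays sample-aligned, which is what lets the model learn "audio-to-audio" conditioning.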
Related papers
- Constructing a Singing Style Caption Dataset [12.515874333424929]
We introduce S2Cap, an audio-text pair dataset with a diverse set of attributes.
S2Cap consists of pairs of textual prompts and music audio samples with a wide range of vocal and musical attributes.
We present a novel mechanism called CRESCENDO, which utilizes positive-pair similarity learning to synchronize the embedding spaces of a pretrained audio encoder.
arXiv Detail & Related papers (2024-09-15T21:19:24Z)
- SongCreator: Lyrics-based Universal Song Generation [53.248473603201916]
SongCreator is a song-generation system designed to tackle the challenge of generating songs with both vocals and accompaniment given lyrics.
The model features two novel designs: a meticulously designed dual-sequence language model (DSLM) to capture the information of vocals and accompaniment for song generation, and a series of attention mask strategies for DSLM.
Experiments demonstrate the effectiveness of SongCreator by achieving state-of-the-art or competitive performances on all eight tasks.
arXiv Detail & Related papers (2024-09-09T19:37:07Z)
- Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment [56.019288564115136]
We propose a novel task called text-to-song synthesis, which incorporates both vocal and accompaniment generation.
We develop Melodist, a two-stage text-to-song method that consists of singing voice synthesis (SVS) and vocal-to-accompaniment (V2A) synthesis.
Evaluation results on our dataset demonstrate that Melodist can synthesize songs with comparable quality and style consistency.
arXiv Detail & Related papers (2024-04-14T18:00:05Z)
- Singer Identity Representation Learning using Self-Supervised Techniques [0.0]
We propose a framework for training singer identity encoders to extract representations suitable for various singing-related tasks.
We explore different self-supervised learning techniques on a large collection of isolated vocal tracks.
We evaluate the quality of the resulting representations on singer similarity and identification tasks.
arXiv Detail & Related papers (2024-01-10T10:41:38Z)
- AudioLM: a Language Modeling Approach to Audio Generation [59.19364975706805]
We introduce AudioLM, a framework for high-quality audio generation with long-term consistency.
We show how existing audio tokenizers provide different trade-offs between reconstruction quality and long-term structure.
We demonstrate how our approach extends beyond speech by generating coherent piano music continuations.
arXiv Detail & Related papers (2022-09-07T13:40:08Z)
- Learning the Beauty in Songs: Neural Singing Voice Beautifier [69.21263011242907]
We are interested in a novel task, singing voice beautifying (SVB).
Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre.
We introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task.
arXiv Detail & Related papers (2022-02-27T03:10:12Z)
- A cappella: Audio-visual Singing Voice Separation [4.6453787256723365]
We explore the single-channel singing voice separation problem from a multimodal perspective.
We present Acappella, a dataset spanning around 46 hours of a cappella solo singing videos sourced from YouTube.
We propose Y-Net, an audio-visual convolutional neural network which achieves state-of-the-art singing voice separation results.
arXiv Detail & Related papers (2021-04-20T13:17:06Z)
- Unsupervised Cross-Domain Singing Voice Conversion [105.1021715879586]
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Our method utilizes an acoustic model trained for the task of automatic speech recognition, together with features extracted from the melody, to drive a waveform-based generator.
arXiv Detail & Related papers (2020-08-06T18:29:11Z)
- Addressing the confounds of accompaniments in singer identification [29.949390919663596]
We employ open-unmix, an open source tool with state-of-the-art performance in source separation, to separate the vocal and instrumental tracks of music.
We then investigate two means to train a singer identification model: by learning from the separated vocal only, or from an augmented set of data.
arXiv Detail & Related papers (2020-02-17T07:49:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.