Related papers: Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics

URL: http://arxiv.org/abs/2303.15705v1
Date: Tue, 28 Mar 2023 03:17:59 GMT
Title: Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics
Authors: Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu
Abstract summary: We propose Lyrics-Melody Translation with Adaptive Grouping (LTAG) as a holistic solution to automatic song translation. It is a novel encoder-decoder framework that can simultaneously translate the source lyrics and determine the number of aligned notes at each decoding step. Experiments conducted on an English-Chinese song translation data set show the effectiveness of our model in both automatic and human evaluation.
Score: 38.35809268026605
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Song translation requires both translation of lyrics and alignment of music notes so that the resulting verse can be sung to the accompanying melody. This is a challenging problem, and different aspects of the translation process have attracted some interest. In this paper, we propose Lyrics-Melody Translation with Adaptive Grouping (LTAG), a holistic solution to automatic song translation that jointly models lyrics translation and lyrics-melody alignment. It is a novel encoder-decoder framework that can simultaneously translate the source lyrics and determine the number of aligned notes at each decoding step through an adaptive note grouping module. To address data scarcity, we commissioned a small amount of training data annotated specifically for this task and used large amounts of augmented data obtained through back-translation. Experiments conducted on an English-Chinese song translation data set show the effectiveness of our model in both automatic and human evaluation.

Related papers

MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation [21.45108062752738]
We introduce Multilingual Audio-Video Lyrics Benchmark for Animated Song Translation (MAVL), the first multilingual, multimodal benchmark for singable lyrics translation. We propose Syllable-Constrained Audio-Video LLM with Chain-of-Thought (SylAVL-CoT), which leverages audio-video cues and enforces syllabic constraints to produce natural-sounding lyrics. Experimental results demonstrate that SylAVL-CoT significantly outperforms text-based models in singability and contextual accuracy.
arXiv Detail & Related papers (2025-05-24T09:28:09Z)
Sing it, Narrate it: Quality Musical Lyrics Translation [0.5735035463793009]
Existing song translation approaches often prioritize singability constraints at the expense of translation quality. This paper aims to enhance translation quality while maintaining key singability features.
arXiv Detail & Related papers (2024-10-29T14:23:56Z)
REFFLY: Melody-Constrained Lyrics Editing Model [50.03960548399128]
We introduce REFFLY, the first revision framework designed to edit arbitrary forms of plain text draft into high-quality, full-fledged song lyrics. Our approach ensures that the generated lyrics retain the original meaning of the draft, align with the melody, and adhere to the desired song structures.
arXiv Detail & Related papers (2024-08-30T23:22:34Z)
Towards Estimating Personal Values in Song Lyrics [5.170818712089796]
Most music widely consumed in Western countries contains song lyrics, with U.S. samples reporting that almost all of their song libraries contain lyrics. In this project, we take a perspectivist approach, guided by social science theory, to gathering annotations, estimating their quality, and aggregating them. We then compare aggregated ratings to estimates based on pre-trained sentence/word embedding models by employing a validated value dictionary.
arXiv Detail & Related papers (2024-08-22T19:22:55Z)
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition [82.38021790213752]
SongComposer is a music-specialized large language model (LLM). It integrates the capability of simultaneously composing melodies into LLMs by leveraging three key innovations. It outperforms advanced LLMs in tasks such as lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation. We will release SongCompose, a large-scale dataset for training, containing paired lyrics and melodies in Chinese and English.
arXiv Detail & Related papers (2024-02-27T16:15:28Z)
A Computational Evaluation Framework for Singable Lyric Translation [17.492053233802135]
We present a computational framework for the quantitative evaluation of singable lyric translation. We measure syllable count distance, phoneme repetition similarity, musical structure distance, and semantic similarity. Our framework seamlessly integrates musical, linguistic, and cultural dimensions of lyrics.
arXiv Detail & Related papers (2023-08-26T00:27:08Z)
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT [48.28624219567131]
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method. We use Whisper, a weakly supervised robust speech recognition model, and GPT-4, today's most performant chat-based large language model. Our experiments show that LyricWhiz significantly reduces Word Error Rate compared to existing methods in English.
arXiv Detail & Related papers (2023-06-29T17:01:51Z)
Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints. Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data. We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z)
SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation. It leverages masked sequence to sequence (MASS) pre-training and attention based alignment modeling. We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.