Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
- URL: http://arxiv.org/abs/2311.13987v1
- Date: Thu, 23 Nov 2023 13:13:48 GMT
- Title: Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
- Authors: Ond\v{r}ej C\'ifka, Constantinos Dimitriou, Cheng-i Wang, Hendrik
Schreiber, Luke Miner, Fabian-Robert St\"oter
- Abstract summary: We introduce Jam-ALT, a new lyrics transcription benchmark based on the JamendoLyrics dataset.
First, a complete revision of the transcripts, geared specifically towards ALT evaluation.
Second, a suite of evaluation metrics designed, unlike the traditional word error rate, to capture such phenomena.
- Score: 2.6297569393407416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current automatic lyrics transcription (ALT) benchmarks focus exclusively on
word content and ignore the finer nuances of written lyrics including
formatting and punctuation, which leads to a potential misalignment with the
creative products of musicians and songwriters as well as listeners'
experiences. For example, line breaks are important in conveying information
about rhythm, emotional emphasis, rhyme, and high-level structure. To address
this issue, we introduce Jam-ALT, a new lyrics transcription benchmark based on
the JamendoLyrics dataset. Our contribution is twofold. Firstly, a complete
revision of the transcripts, geared specifically towards ALT evaluation by
following a newly created annotation guide that unifies the music industry's
guidelines, covering aspects such as punctuation, line breaks, spelling,
background vocals, and non-word sounds. Secondly, a suite of evaluation metrics
designed, unlike the traditional word error rate, to capture such phenomena. We
hope that the proposed benchmark contributes to the ALT task, enabling more
precise and reliable assessments of transcription systems and enhancing the
user experience in lyrics applications such as subtitle renderings for live
captioning or karaoke.
Related papers
- REFFLY: Melody-Constrained Lyrics Editing Model [50.03960548399128]
We introduce REFFLY, the first revision framework designed to edit arbitrary forms of plain text draft into high-quality, full-fledged song lyrics.
Our approach ensures that the generated lyrics retain the original meaning of the draft, align with the melody, and adhere to the desired song structures.
arXiv Detail & Related papers (2024-08-30T23:22:34Z) - Lyrics Transcription for Humans: A Readability-Aware Benchmark [1.2499537119440243]
We introduce Jam-ALT, a comprehensive lyrics transcription benchmark.
The benchmark features a complete revision of the JamendoLyrics dataset, along with evaluation metrics designed to capture and assess the lyric-specific nuances.
We apply the benchmark to recent transcription systems and present additional error analysis, as well as an experimental comparison with a classical music dataset.
arXiv Detail & Related papers (2024-07-30T14:20:09Z) - A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features
For Classical Vocal Performance [7.488651253072641]
The goal of real-time lyrics alignment is to take live singing audio as input and pinpoint the exact position within given lyrics on the fly.
The task can benefit real-world applications such as the automatic subtitling of live concerts or operas.
This paper presents a real-time lyrics alignment system for classical vocal performances with two contributions.
arXiv Detail & Related papers (2024-01-17T13:25:32Z) - LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT [48.28624219567131]
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method.
We use Whisper, a weakly supervised robust speech recognition model, and GPT-4, today's most performant chat-based large language model.
Our experiments show that LyricWhiz significantly reduces Word Error Rate compared to existing methods in English.
arXiv Detail & Related papers (2023-06-29T17:01:51Z) - Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z) - Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z) - Bridging Music and Text with Crowdsourced Music Comments: A
Sequence-to-Sequence Framework for Thematic Music Comments Generation [18.2750732408488]
We exploit the crowd-sourced music comments to construct a new dataset and propose a sequence-to-sequence model to generate text descriptions of music.
To enhance the authenticity and thematicity of generated texts, we propose a discriminator and a novel topic evaluator.
arXiv Detail & Related papers (2022-09-05T14:51:51Z) - SongMASS: Automatic Song Writing with Pre-training and Alignment
Constraint [54.012194728496155]
SongMASS is proposed to overcome the challenges of lyric-to-melody generation and melody-to-lyric generation.
It leverages masked sequence to sequence (MASS) pre-training and attention based alignment modeling.
We show that SongMASS generates lyric and melody with significantly better quality than the baseline method.
arXiv Detail & Related papers (2020-12-09T16:56:59Z) - Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN)
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.