Tollywood Emotions: Annotation of Valence-Arousal in Telugu Song Lyrics
- URL: http://arxiv.org/abs/2303.09364v1
- Date: Thu, 16 Mar 2023 14:47:52 GMT
- Title: Tollywood Emotions: Annotation of Valence-Arousal in Telugu Song Lyrics
- Authors: R Guru Ravi Shanker, B Manikanta Gupta, BV Koushik, Vinoo Alluri
- Abstract summary: We present a new manually annotated dataset of Telugu songs' lyrics collected from Spotify.
We create two music emotion recognition models by using two classification techniques.
We make the dataset publicly available with lyrics, annotations and Spotify IDs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Emotion recognition from a given music track has relied heavily on acoustic
features, social tags, and metadata, but has seldom focused on lyrics. No existing
dataset of Indian-language songs contains manual valence and arousal
ratings of lyrics. We present a new manually annotated dataset of Telugu
songs' lyrics collected from Spotify with valence and arousal annotated on a
discrete scale. A fairly high inter-annotator agreement was observed for both
valence and arousal. Subsequently, we create two music emotion recognition
models, using two classification techniques to identify valence, arousal, and
the corresponding emotion quadrant from lyrics: a support vector machine (SVM)
with term frequency-inverse document frequency (TF-IDF) features, and
fine-tuning of the pre-trained XLM-RoBERTa (XLM-R) model. The fine-tuned XLM-R
outperforms the SVM, improving macro-averaged F1-scores from 54.69%, 67.61%,
and 34.13% to 77.90%, 80.71%, and 58.33% for valence, arousal, and quadrant
classification, respectively, under 10-fold cross-validation. In addition, we
compare our lyrics annotations with Spotify's annotations of valence and
energy (analogous to arousal),
which are based on entire music tracks. The implications of our findings are
discussed. Finally, we make the dataset publicly available with lyrics,
annotations and Spotify IDs.
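To make the two baselines concrete, below is a minimal sketch of the TF-IDF + SVM pipeline with 10-fold cross-validated macro-F1, written with scikit-learn. The placeholder corpus, the binary label coding, and all hyperparameters are illustrative assumptions, not the authors' released data or settings.

```python
# Minimal sketch of the TF-IDF + linear SVM baseline.
# The corpus and labels below are hypothetical placeholders; the real
# dataset ships Telugu lyrics, annotations, and Spotify IDs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

lyrics = ["santosham happy placeholder lyric"] * 20 + \
         ["vishadam sad placeholder lyric"] * 20    # stand-in lyrics
valence = [1] * 20 + [0] * 20                       # assumed binary high/low valence coding

pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # TF-IDF over uni- and bigrams
    LinearSVC(C=1.0),                               # linear-kernel SVM
)

# Macro-averaged F1 under 10-fold cross-validation, as reported in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, lyrics, valence, cv=cv, scoring="f1_macro")
print(f"macro-F1: {scores.mean():.4f} +/- {scores.std():.4f}")
```

The second baseline fine-tunes the pre-trained XLM-R checkpoint for sequence classification. A hedged sketch using Hugging Face Transformers follows; the checkpoint name, sequence length, and training hyperparameters are assumptions rather than the paper's reported configuration. Quadrant classification would be the same setup with four labels obtained by crossing the valence and arousal ratings.

```python
# Hedged sketch of fine-tuning XLM-RoBERTa for valence classification.
# Texts and labels are hypothetical placeholders, as above.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)  # 2 labels for valence/arousal; 4 for quadrants

class LyricsDataset(torch.utils.data.Dataset):
    """Wraps tokenized lyrics and integer labels for the Trainer."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=256)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

texts = ["placeholder happy lyric", "placeholder sad lyric"] * 8
labels = [1, 0] * 8
train_ds = LyricsDataset(texts, labels)

args = TrainingArguments(output_dir="xlmr-valence", num_train_epochs=3,
                         per_device_train_batch_size=8, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```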
Related papers
- Song Emotion Classification of Lyrics with Out-of-Domain Data under Label Scarcity
There is a scarcity of large, high quality in-domain datasets for lyrics-based song emotion classification.
CNN models trained on a large Reddit comments dataset achieve satisfactory performance and generalizability to lyrical emotion classification.
arXiv Detail & Related papers (2024-10-08T07:58:15Z)
- Towards Estimating Personal Values in Song Lyrics
Most music widely consumed in Western countries contains song lyrics, with U.S. samples reporting that almost all of the songs in their libraries contain lyrics.
In this project, we take a perspectivist approach, guided by social science theory, to gathering annotations, estimating their quality, and aggregating them.
We then compare aggregated ratings to estimates based on pre-trained sentence/word embedding models by employing a validated value dictionary.
arXiv Detail & Related papers (2024-08-22T19:22:55Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- Modeling Emotional Trajectories in Written Stories Utilizing Transformers and Weakly-Supervised Learning
We introduce continuous valence and arousal labels for an existing dataset of children's stories originally annotated with discrete emotion categories.
To predict the resulting emotionality signals, we fine-tune a DeBERTa model and improve upon this baseline via a weakly supervised learning approach.
A detailed analysis shows the extent to which the results vary depending on factors such as the author, the individual story, or the section within the story.
arXiv Detail & Related papers (2024-06-04T12:17:16Z)
- Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
We introduce Jam-ALT, a new lyrics transcription benchmark based on the JamendoLyrics dataset.
It contributes, first, a complete revision of the transcripts, geared specifically towards automatic lyrics transcription (ALT) evaluation, and second, a suite of evaluation metrics designed, unlike the traditional word error rate, to capture lyric-specific formatting phenomena.
arXiv Detail & Related papers (2023-11-23T13:13:48Z)
- LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method.
We use Whisper, a weakly supervised robust speech recognition model, and GPT-4, today's most performant chat-based large language model.
Our experiments show that LyricWhiz significantly reduces Word Error Rate compared to existing methods in English.
arXiv Detail & Related papers (2023-06-29T17:01:51Z)
- Unsupervised Melody-to-Lyric Generation
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- Multi-Modality in Music: Predicting Emotion in Music from High-Level Audio Features and Lyrics
This paper aims to test whether a multi-modal approach for music emotion recognition (MER) performs better than a uni-modal one on high-level song features and lyrics.
We use 11 song features retrieved from the Spotify API, combined with lyrics features including sentiment, TF-IDF, and ANEW, to predict valence and arousal.
arXiv Detail & Related papers (2023-02-26T13:38:42Z)
- The Contribution of Lyrics and Acoustics to Collaborative Understanding of Mood
We study the association between song lyrics and mood through a data-driven analysis.
Our data set consists of nearly one million songs, with song-mood associations derived from user playlists on the Spotify streaming platform.
We take advantage of state-of-the-art natural language processing models based on transformers to learn the association between the lyrics and moods.
arXiv Detail & Related papers (2022-05-31T19:58:41Z)
- Melody-Conditioned Lyrics Generation with SeqGANs
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
- Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space
Matching images and music with similar emotions might help to make emotion perceptions more vivid and stronger.
Existing emotion-based image and music matching methods either employ limited categorical emotion states or train the matching model using an impractical multi-stage pipeline.
In this paper, we study end-to-end matching between image and music based on emotions in the continuous valence-arousal (VA) space.
arXiv Detail & Related papers (2020-08-22T20:12:23Z)
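As a toy illustration of the continuous valence-arousal (VA) space used in the last entry above (and underlying the quadrant labels of the main paper), the sketch below places items in the VA plane, derives Russell-style quadrants, and matches a query to its nearest neighbour by Euclidean distance. All coordinates and the quadrant numbering are illustrative assumptions, not values from any of the papers.

```python
import math

def quadrant(valence: float, arousal: float) -> int:
    """Map a (valence, arousal) point in [-1, 1]^2 to a Russell-style quadrant:
    1 = happy/excited (+V, +A), 2 = angry/tense (-V, +A),
    3 = sad/depressed (-V, -A), 4 = calm/content (+V, -A)."""
    if arousal >= 0:
        return 1 if valence >= 0 else 2
    return 4 if valence >= 0 else 3

def nearest(query, candidates):
    """Match a query VA point to the emotionally closest candidate."""
    return min(candidates, key=lambda c: math.dist(query, c))

song = (0.7, 0.4)                                 # illustrative VA coordinates
images = [(-0.5, 0.6), (0.6, 0.3), (-0.2, -0.8)]
print(quadrant(*song), nearest(song, images))     # -> 1 (0.6, 0.3)
```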