Abusive music and song transformation using GenAI and LLMs
- URL: http://arxiv.org/abs/2601.15348v1
- Date: Wed, 21 Jan 2026 02:56:45 GMT
- Title: Abusive music and song transformation using GenAI and LLMs
- Authors: Jiyang Choi, Rohitash Chandra
- Abstract summary: This study explores the use of generative artificial intelligence (GenAI) and Large Language Models (LLMs) to automatically transform abusive words (vocal delivery) and lyrical content in popular music. We present a comparative analysis of four selected English songs and their transformed counterparts, evaluating changes through both acoustic and sentiment-based lenses. Our findings indicate that GenAI significantly reduces vocal aggressiveness, with acoustic analysis showing improvements in Harmonic-to-Noise Ratio, Cepstral Peak Prominence, and Shimmer.
- Score: 3.8271803328378677
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Repeated exposure to violence and abusive content in music can influence listeners' emotions and behaviours, potentially normalising aggression or reinforcing harmful stereotypes. In this study, we explore the use of generative artificial intelligence (GenAI) and Large Language Models (LLMs) to automatically transform abusive words (vocal delivery) and lyrical content in popular music. Rather than simply muting or replacing a single word, our approach transforms the tone, intensity, and sentiment, altering not just the lyrics but also how they are expressed. We present a comparative analysis of four selected English songs and their transformed counterparts, evaluating changes through both acoustic and sentiment-based lenses. Our findings indicate that GenAI significantly reduces vocal aggressiveness, with acoustic analysis showing improvements in Harmonic-to-Noise Ratio, Cepstral Peak Prominence, and Shimmer. Sentiment analysis showed aggression reductions of 63.3-85.6% across artists, with the largest improvements in chorus sections (up to an 88.6% reduction). The transformed versions maintained musical coherence while mitigating harmful content, offering a promising alternative to traditional content moderation that avoids triggering the "forbidden fruit" effect, where censored content becomes more appealing simply because it is restricted. This approach demonstrates the potential for GenAI to create safer listening experiences while preserving artistic expression.
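The reduction figures reported above follow from a simple relative-change computation over per-section scores. A minimal sketch, assuming each song section has a non-negative aggression score on a common scale (the function name and the example scores are hypothetical illustrations, not values from the paper):

```python
def aggression_reduction(original: float, transformed: float) -> float:
    """Percentage reduction in an aggression score after transformation.

    Assumes non-negative scores where higher values mean more
    aggressive content (a hypothetical 0-1 scale).
    """
    if original <= 0:
        raise ValueError("original score must be positive")
    return 100.0 * (original - transformed) / original


# Hypothetical before/after scores for one song's chorus section.
before, after = 0.88, 0.10
print(f"chorus reduction: {aggression_reduction(before, after):.1f}%")
```

The same relative-change formula applies per artist or per section; only the aggregation level changes.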
Related papers
- Language models for longitudinal analysis of abusive content in Billboard Music Charts [3.2654923574107357]
We analyse song lyrics from the Billboard Charts of the United States over the last seven decades. Results show a significant rise in explicit content in popular music from 1990 onwards, with an increasing prevalence of lyrics containing profane, sexually explicit, and otherwise inappropriate language.
arXiv Detail & Related papers (2025-10-06T01:59:21Z)
- Bob's Confetti: Phonetic Memorization Attacks in Music and Video Generation [47.04195212078377]
Generative AI systems for music and video commonly use text-based filters to prevent the regurgitation of copyrighted material. We introduce Adversarial PhoneTic Prompting (APT), a novel attack that bypasses these safeguards by exploiting phonetic memorization. We demonstrate that leading Lyrics-to-Song (L2S) models like SUNO and YuE regenerate songs with striking melodic and rhythmic similarity to their copyrighted originals when prompted with these altered lyrics.
arXiv Detail & Related papers (2025-07-23T21:11:47Z)
- Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion [53.26424100244925]
Expressive voice conversion aims to transfer both speaker identity and expressive attributes from a target speech to a given source speech. In this work, we improve over a self-supervised, non-autoregressive framework with a conditional variational autoencoder.
arXiv Detail & Related papers (2025-06-04T14:42:12Z)
- Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation [59.81482518924723]
We propose a method for capturing and generating subtle shifts in emotion intensity for talking-head generation.
We develop a talking-head framework that is capable of generating a variety of emotions with precise control over intensity levels.
Experiments and analyses validate the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-09-29T01:02:01Z)
- Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings [10.302353984541497]
This research develops a model capable of generating music that resonates with the emotions depicted in visual arts.
Addressing the scarcity of aligned art and music data, we curated the Emotion Painting Music dataset.
Our dual-stage framework converts images to text descriptions of emotional content and then transforms these descriptions into music, facilitating efficient learning with minimal data.
arXiv Detail & Related papers (2024-09-12T08:19:25Z)
- Joint sentiment analysis of lyrics and audio in music [1.2349562761400057]
Automatic analysis usually focuses on the audio data, but lyrics can also play a crucial role in the perception of mood.
We first evaluate various models for sentiment analysis based on lyrics and audio separately. The corresponding approaches already show satisfactory results, but they also exhibit weaknesses.
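A common baseline for such joint lyric/audio analysis is late fusion of per-modality predictions. A minimal sketch, assuming each modality already yields a valence score in [-1, 1]; the weighted-average scheme and parameter names here are illustrative assumptions, not the paper's method:

```python
def fuse_sentiment(lyric_valence: float, audio_valence: float,
                   lyric_weight: float = 0.5) -> float:
    """Late fusion: weighted average of per-modality valence scores."""
    for v in (lyric_valence, audio_valence):
        if not -1.0 <= v <= 1.0:
            raise ValueError("valence scores must lie in [-1, 1]")
    if not 0.0 <= lyric_weight <= 1.0:
        raise ValueError("lyric_weight must lie in [0, 1]")
    return lyric_weight * lyric_valence + (1.0 - lyric_weight) * audio_valence


# Upbeat audio paired with dark lyrics partially cancel out.
print(fuse_sentiment(lyric_valence=-0.6, audio_valence=0.4))
```

Adjusting `lyric_weight` lets one modality dominate when the other is known to be unreliable for a given genre.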
arXiv Detail & Related papers (2024-05-03T10:42:17Z)
- Are Words Enough? On the semantic conditioning of affective music generation [1.534667887016089]
This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions.
In detail, we review two main paradigms adopted in automatic music generation: rules-based and machine-learning models.
We conclude that overcoming the limitation and ambiguity of language to express emotions through music has the potential to impact the creative industries.
arXiv Detail & Related papers (2023-11-07T00:19:09Z)
- REMAST: Real-time Emotion-based Music Arrangement with Soft Transition [29.34094293561448]
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.
We propose REMAST to achieve real-time emotion fit and smooth transitions simultaneously.
According to the evaluation results, REMAST surpasses the state-of-the-art methods in objective and subjective metrics.
arXiv Detail & Related papers (2023-05-14T00:09:48Z)
- Affective Idiosyncratic Responses to Music [63.969810774018775]
We develop methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform.
We test for musical, lyrical, contextual, demographic, and mental health effects that drive listener affective responses.
arXiv Detail & Related papers (2022-10-17T19:57:46Z)
- Contrastive Learning with Positive-Negative Frame Mask for Music Representation [91.44187939465948]
This paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR.
We devise a novel contrastive learning objective to accommodate both self-augmented positives and negatives sampled from the same piece of music.
arXiv Detail & Related papers (2022-03-17T07:11:42Z) - Textless Speech Emotion Conversion using Decomposed and Discrete
Representations [49.55101900501656]
We decompose speech into discrete and disentangled learned representations, consisting of content units, F0, speaker, and emotion.
First, we modify the speech content by translating the content units to a target emotion, and then predict the prosodic features based on these units.
Finally, the speech waveform is generated by feeding the predicted representations into a neural vocoder.
arXiv Detail & Related papers (2021-11-14T18:16:42Z)
- Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.