Words to Waves: Emotion-Adaptive Music Recommendation System
- URL: http://arxiv.org/abs/2510.21724v1
- Date: Wed, 17 Sep 2025 15:35:03 GMT
- Title: Words to Waves: Emotion-Adaptive Music Recommendation System
- Authors: Apoorva Chavali, Reeve Menezes
- Abstract summary: This paper introduces a novel music recommendation framework employing a variant of the Wide and Deep Learning architecture. The framework takes real-time emotional states inferred directly from natural language as inputs and recommends songs that closely portray the mood. Experimental results show that personalized music selections positively influence the user's emotions and lead to a significant improvement in emotional relevance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current recommendation systems tend to overlook emotional context, relying instead on historical listening patterns or static mood tags. This paper introduces a novel music recommendation framework employing a variant of the Wide and Deep Learning architecture that takes real-time emotional states, inferred directly from natural language, as inputs and recommends songs that closely portray the mood. The system captures emotional context from user-provided textual descriptions using transformer-based embeddings fine-tuned to predict the emotional dimensions of valence and arousal. The deep component of the architecture uses these embeddings to generalize to unseen emotional patterns, while the wide component memorizes user-emotion and emotion-genre associations through cross-product features. Experimental results show that personalized music selections positively influence the user's emotions and lead to a significant improvement in emotional relevance.
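The abstract gives enough architectural detail to sketch in code. Below is a minimal PyTorch sketch, not the authors' implementation: the backbone model name, layer widths, and the feature vocabularies (`n_users`, `n_emotions`, `n_genres`, `n_songs`) are illustrative assumptions. Only the overall shape follows the abstract: a transformer text encoder fine-tuned to regress valence-arousal, whose embedding feeds the deep component, while one-hot user-emotion and emotion-genre cross-product features feed the wide component.

```python
# Minimal sketch (assumed, not the authors' code) of an emotion-adaptive
# Wide & Deep recommender as described in the abstract.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class ValenceArousalEncoder(nn.Module):
    """Transformer text encoder with a head fine-tuned to regress
    valence and arousal from a free-text mood description."""

    def __init__(self, model_name="sentence-transformers/all-MiniLM-L6-v2"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.backbone = AutoModel.from_pretrained(model_name)
        hidden = self.backbone.config.hidden_size
        self.va_head = nn.Linear(hidden, 2)  # predicts (valence, arousal)

    def forward(self, texts):
        batch = self.tokenizer(texts, padding=True, truncation=True,
                               return_tensors="pt")
        tokens = self.backbone(**batch).last_hidden_state
        emb = tokens.mean(dim=1)            # mean-pooled text embedding
        va = torch.tanh(self.va_head(emb))  # valence-arousal in [-1, 1]
        return emb, va


class WideAndDeepRecommender(nn.Module):
    """Wide part memorizes user x emotion and emotion x genre cross-product
    features; deep part generalizes from the emotion embedding."""

    def __init__(self, emb_dim, n_users, n_emotions, n_genres, n_songs):
        super().__init__()
        n_cross = n_users * n_emotions + n_emotions * n_genres
        self.wide = nn.Linear(n_cross, n_songs)  # linear memorization
        self.deep = nn.Sequential(               # generalization via MLP
            nn.Linear(emb_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, n_songs),
        )

    def forward(self, cross_feats, text_emb, va):
        deep_in = torch.cat([text_emb, va], dim=-1)
        return self.wide(cross_feats) + self.deep(deep_in)  # joint song logits


# Usage sketch with hypothetical catalogue sizes.
encoder = ValenceArousalEncoder()
emb, va = encoder(["I feel calm but a little nostalgic tonight"])
model = WideAndDeepRecommender(emb_dim=emb.shape[-1], n_users=100,
                               n_emotions=8, n_genres=16, n_songs=500)
cross = torch.zeros(1, 100 * 8 + 8 * 16)  # one-hot cross features (empty here)
scores = model(cross, emb, va)            # (1, n_songs) ranking logits
```

In this sketch the two components would be trained jointly, with the summed logits passed to a ranking or cross-entropy loss over songs and an auxiliary regression loss on the valence-arousal head; the paper's actual training objective is not specified in the abstract.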
Related papers
- EmoCAST: Emotional Talking Portrait via Emotive Text Description [56.42674612728354]
EmoCAST is a diffusion-based framework for precise text-driven emotional synthesis. In appearance modeling, emotional prompts are integrated through a text-guided decoupled emotive module. EmoCAST achieves state-of-the-art performance in generating realistic, emotionally expressive, and audio-synchronized talking-head videos.
arXiv Detail & Related papers (2025-08-28T10:02:06Z)
- Enriching Multimodal Sentiment Analysis through Textual Emotional Descriptions of Visual-Audio Content [56.62027582702816]
Multimodal Sentiment Analysis seeks to unravel human emotions by amalgamating text, audio, and visual data. Yet, discerning subtle emotional nuances within audio and video expressions poses a formidable challenge. We introduce DEVA, a progressive fusion framework founded on textual sentiment descriptions.
arXiv Detail & Related papers (2024-12-12T11:30:41Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion [16.658813060879293]
We present a novel approach employing musical element-based regularization in the latent space to disentangle distinct elements.
By visualizing the latent space, we show that MusER yields a disentangled and interpretable latent space.
Experimental results demonstrate that MusER outperforms the state-of-the-art models for generating emotional music.
arXiv Detail & Related papers (2023-12-16T03:50:13Z)
- Emotion-Aware Music Recommendation System: Enhancing User Experience Through Real-Time Emotional Context [1.3812010983144802]
This study addresses a deficiency in conventional music recommendation systems by focusing on the vital role emotions play in shaping users' music choices.
It introduces an AI model that incorporates emotional context into the song recommendation process.
By accurately detecting users' real-time emotions, the model can generate personalized song recommendations that align with the user's emotional state.
arXiv Detail & Related papers (2023-11-17T05:55:36Z)
- Are Words Enough? On the semantic conditioning of affective music generation [1.534667887016089]
This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions.
In detail, we review two main paradigms adopted in automatic music generation: rules-based and machine-learning models.
We conclude that overcoming the limitations and ambiguity of language in expressing emotions through music has the potential to impact the creative industries.
arXiv Detail & Related papers (2023-11-07T00:19:09Z)
- REMAST: Real-time Emotion-based Music Arrangement with Soft Transition [29.34094293561448]
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.
We propose REMAST to achieve real-time emotional fit and smooth transitions simultaneously.
According to the evaluation results, REMAST surpasses the state-of-the-art methods in objective and subjective metrics.
arXiv Detail & Related papers (2023-05-14T00:09:48Z)
- Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible Knowledge Selection [47.60224978460442]
We propose a Serial and Emotion-Knowledge interaction (SEEK) method for empathetic dialogue generation.
We use a fine-grained encoding strategy that is more sensitive to the emotion dynamics (emotion flow) in conversations to predict the emotion-intent characteristics of responses.
In addition, we design a novel framework to model the interaction between knowledge and emotion to generate more sensible responses.
arXiv Detail & Related papers (2022-10-21T03:51:18Z)
- Emotion Intensity and its Control for Emotional Voice Conversion [77.05097999561298]
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while preserving the linguistic content and speaker identity.
In this paper, we aim to explicitly characterize and control the intensity of emotion.
We propose to disentangle the speaker style from linguistic content and encode the speaker style into a style embedding in a continuous space that forms the prototype of emotion embedding.
arXiv Detail & Related papers (2022-01-10T02:11:25Z)
- Musical Prosody-Driven Emotion Classification: Interpreting Vocalists' Portrayal of Emotions Through Machine Learning [0.0]
The role of musical prosody remains under-explored despite several studies demonstrating a strong connection between prosody and emotion.
In this study, we restrict the input of traditional machine learning algorithms to the features of musical prosody.
We utilize a methodology of individual data collection from vocalists and personal ground-truth labeling by the artists themselves.
arXiv Detail & Related papers (2021-06-04T15:40:19Z)
- Knowledge Bridging for Empathetic Dialogue Generation [52.39868458154947]
A lack of external knowledge makes it difficult for empathetic dialogue systems to perceive implicit emotions and to learn emotional interactions from limited dialogue history.
We propose to leverage external knowledge, including commonsense knowledge and emotional lexical knowledge, to explicitly understand and express emotions in empathetic dialogue generation.
arXiv Detail & Related papers (2020-09-21T09:21:52Z)