Exploring and Applying Audio-Based Sentiment Analysis in Music
- URL: http://arxiv.org/abs/2403.17379v1
- Date: Thu, 22 Feb 2024 22:34:06 GMT
- Title: Exploring and Applying Audio-Based Sentiment Analysis in Music
- Authors: Etash Jhanji
- Abstract summary: The ability of a computational model to interpret musical emotions is largely unexplored.
This study seeks to (1) predict the emotion of a musical clip over time and (2) predict the emotion value that follows the end of the clip in the time series, to support seamless transitions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentiment analysis is a continuously explored area of text processing that deals with the computational analysis of opinions, sentiments, and subjectivity in text. However, this idea is not limited to text and speech; in fact, it can be applied to other modalities. In reality, humans do not express themselves in text as deeply as they do in music. The ability of a computational model to interpret musical emotions is largely unexplored and could have implications and uses in therapy and musical cueing. In this paper, two individual tasks are addressed. This study seeks to (1) predict the emotion of a musical clip over time and (2) predict the emotion value that follows a clip in the time series to ensure seamless transitions. Utilizing data from the Emotions in Music Database, which contains clips of songs selected from the Free Music Archive annotated by multiple volunteers with levels of valence and arousal on Russell's circumplex model of affect, models are trained for both tasks. Overall, the models performed the tasks they were designed for effectively and accurately.
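To make the two tasks concrete, the sketch below shows one way they could be set up; it is an illustration, not the paper's implementation. It assumes MFCC features pooled to a fixed 0.5 s step (the Emotions in Music Database provides continuous valence/arousal annotations), a small GRU that regresses per-step valence and arousal (task 1), and a linear head on the final hidden state that forecasts the emotion value of the next step (task 2). The feature choice, layer sizes, and step length are all assumptions.

```python
# Illustrative sketch only -- not the paper's model. Assumes clips with
# continuous valence/arousal annotations (e.g. the Emotions in Music Database)
# and MFCC features pooled to one vector per 0.5 s annotation step.
import librosa
import torch
import torch.nn as nn


def clip_to_features(path, sr=22050, n_mfcc=20, step_sec=0.5):
    """Load an audio clip and return a (steps, n_mfcc) sequence, with MFCC
    frames averaged into non-overlapping windows of step_sec seconds."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)     # (n_mfcc, frames)
    frames_per_step = max(1, int(step_sec * sr / 512))          # librosa default hop = 512
    n_steps = mfcc.shape[1] // frames_per_step
    pooled = (mfcc[:, : n_steps * frames_per_step]
              .reshape(n_mfcc, n_steps, frames_per_step)
              .mean(axis=2))
    return pooled.T                                             # (steps, n_mfcc)


class EmotionRegressor(nn.Module):
    """Shared GRU; one head emits per-step (valence, arousal) for task 1 and
    one head on the final hidden state forecasts the next value for task 2."""
    def __init__(self, n_features=20, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.per_step_head = nn.Linear(hidden, 2)   # (valence, arousal) at each step
        self.forecast_head = nn.Linear(hidden, 2)   # (valence, arousal) for the next step

    def forward(self, x):                           # x: (batch, steps, n_features)
        out, h = self.gru(x)
        return self.per_step_head(out), self.forecast_head(h[-1])


if __name__ == "__main__":
    # Stand-in batch: 4 clips, 60 half-second steps, 20 MFCC features each.
    feats = torch.randn(4, 60, 20)
    target_seq = torch.empty(4, 60, 2).uniform_(-1, 1)   # annotated valence/arousal curves
    target_next = torch.empty(4, 2).uniform_(-1, 1)      # value right after each clip
    model = EmotionRegressor()
    per_step, next_step = model(feats)
    loss = (nn.functional.mse_loss(per_step, target_seq)
            + nn.functional.mse_loss(next_step, target_next))
    loss.backward()
    print(f"demo loss: {loss.item():.3f}")
```

In such a setup, training would simply minimize mean squared error against the annotated valence/arousal curves and against the annotation one step beyond each clip.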
Related papers
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- Joint sentiment analysis of lyrics and audio in music [1.2349562761400057]
In automatic analysis, the actual audio data is usually analyzed, but the lyrics can also play a crucial role in the perception of moods.
We first evaluate various models for sentiment analysis based on lyrics and audio separately. The corresponding approaches already show satisfactory results, but they also exhibit weaknesses.
arXiv Detail & Related papers (2024-05-03T10:42:17Z)
- Are Words Enough? On the semantic conditioning of affective music generation [1.534667887016089]
This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions.
In detail, we review two main paradigms adopted in automatic music generation: rules-based and machine-learning models.
We conclude that overcoming the limitation and ambiguity of language to express emotions through music has the potential to impact the creative industries.
arXiv Detail & Related papers (2023-11-07T00:19:09Z)
- REMAST: Real-time Emotion-based Music Arrangement with Soft Transition [29.34094293561448]
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.
We propose REMAST to achieve real-time emotion fit and smooth transitions simultaneously.
According to the evaluation results, REMAST surpasses the state-of-the-art methods in objective and subjective metrics.
arXiv Detail & Related papers (2023-05-14T00:09:48Z)
- Affective Idiosyncratic Responses to Music [63.969810774018775]
We develop methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform.
We test for musical, lyrical, contextual, demographic, and mental health effects that drive listener affective responses.
arXiv Detail & Related papers (2022-10-17T19:57:46Z)
- Song Emotion Recognition: a Performance Comparison Between Audio Features and Artificial Neural Networks [0.0]
In this paper, we study the most common features and models used in recent publications to tackle this problem, revealing which ones are best suited for recognizing emotion in a cappella songs.
arXiv Detail & Related papers (2022-09-24T16:13:25Z)
- A Novel Multi-Task Learning Method for Symbolic Music Emotion Recognition [76.65908232134203]
Symbolic Music Emotion Recognition (SMER) is the task of predicting music emotion from symbolic data, such as MIDI and MusicXML.
In this paper, we present a simple multi-task framework for SMER, which incorporates the emotion recognition task with other emotion-related auxiliary tasks. (A generic sketch of this kind of setup follows this entry.)
arXiv Detail & Related papers (2022-01-15T07:45:10Z)
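The following is a generic illustration of the multi-task idea summarized in the entry above: a shared encoder feeding an emotion head plus an auxiliary head trained with a weighted joint loss. It is not the paper's architecture; the auxiliary task shown (e.g. key prediction), the tokenization, and all sizes are hypothetical stand-ins for the "emotion-related auxiliary tasks" it mentions.

```python
# Generic multi-task sketch, not the SMER paper's architecture. Assumes the
# symbolic music has already been tokenized into integer sequences.
import torch
import torch.nn as nn


class MultiTaskSMER(nn.Module):
    def __init__(self, vocab_size=512, embed=128, hidden=256,
                 n_emotions=4, n_aux_classes=24):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.encoder = nn.GRU(embed, hidden, batch_first=True)  # shared encoder
        self.emotion_head = nn.Linear(hidden, n_emotions)       # main task: emotion
        self.aux_head = nn.Linear(hidden, n_aux_classes)        # hypothetical auxiliary task

    def forward(self, tokens):                # tokens: (batch, seq_len) int64
        h, _ = self.encoder(self.embed(tokens))
        pooled = h.mean(dim=1)                # simple mean pooling over time
        return self.emotion_head(pooled), self.aux_head(pooled)


# Joint training step: main emotion loss plus a down-weighted auxiliary loss.
model = MultiTaskSMER()
tokens = torch.randint(0, 512, (8, 200))      # stand-in token batch
emotion_y = torch.randint(0, 4, (8,))
aux_y = torch.randint(0, 24, (8,))
emo_logits, aux_logits = model(tokens)
loss = (nn.functional.cross_entropy(emo_logits, emotion_y)
        + 0.5 * nn.functional.cross_entropy(aux_logits, aux_y))
loss.backward()
```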
- Musical Prosody-Driven Emotion Classification: Interpreting Vocalists Portrayal of Emotions Through Machine Learning [0.0]
The role of musical prosody remains under-explored despite several studies demonstrating a strong connection between prosody and emotion.
In this study, we restrict the input of traditional machine learning algorithms to the features of musical prosody.
We utilize a methodology for individual data collection from vocalists and personal ground-truth labeling by the artists themselves.
arXiv Detail & Related papers (2021-06-04T15:40:19Z)
- Emotion Carrier Recognition from Personal Narratives [74.24768079275222]
Personal Narratives (PNs) are recollections of facts, events, and thoughts from one's own experience.
We propose a novel task for Narrative Understanding: Emotion Carrier Recognition (ECR).
arXiv Detail & Related papers (2020-08-17T17:16:08Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)