Are Words Enough? On the semantic conditioning of affective music generation
- URL: http://arxiv.org/abs/2311.03624v1
- Date: Tue, 7 Nov 2023 00:19:09 GMT
- Title: Are Words Enough? On the semantic conditioning of affective music generation
- Authors: Jorge Forero, Gilberto Bernardes, Mónica Mendes
- Abstract summary: This scoping review analyzes and discusses the possibilities of music generation conditioned by emotions.
In detail, we review the two main paradigms adopted in automatic music generation: rule-based and machine-learning models.
We conclude that overcoming the limitations and ambiguity of language in expressing emotions through music has the potential to impact the creative industries.
- Score: 1.534667887016089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music has been commonly recognized as a means of expressing emotions. In this sense, an intense debate surrounds the need to verbalize musical emotions. This concern is highly relevant today, given the exponential growth of natural language processing with deep learning models, which makes it possible to prompt semantic propositions to generate music automatically. This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions. To address this topic, we propose a historical perspective that encompasses the different disciplines and methods contributing to it. In detail, we review the two main paradigms adopted in automatic music generation: rule-based and machine-learning models. Of note are the deep learning architectures that aim to generate high-fidelity music from textual descriptions. These models raise fundamental questions about the expressivity of music, including whether emotions can be represented with words or expressed through them. We conclude that, by overcoming the limitations and ambiguity of language in expressing emotions through music, the use of deep learning with natural language has the potential to impact the creative industries by providing powerful tools to prompt and generate new musical works.
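The rule-based paradigm mentioned in the abstract is concrete enough to sketch in code. The following minimal Python example illustrates one classic form of semantic conditioning: an emotion word is mapped to Russell-style valence-arousal coordinates and from there to coarse musical parameters. The lexicon entries and parameter ranges are hypothetical placeholders chosen for illustration, not values taken from the reviewed literature.

```python
# Illustrative sketch of a rule-based emotion-to-music mapping.
# Lexicon values and parameter ranges below are hypothetical placeholders.

# Hypothetical lexicon: emotion word -> (valence, arousal), each in [-1, 1].
EMOTION_LEXICON = {
    "happy": (0.8, 0.6),
    "sad": (-0.7, -0.4),
    "angry": (-0.6, 0.8),
    "calm": (0.4, -0.7),
}


def emotion_to_music_params(word: str) -> dict:
    """Map an emotion word to coarse musical parameters via valence-arousal."""
    valence, arousal = EMOTION_LEXICON[word]
    return {
        # Higher arousal -> faster tempo, interpolated over 60-180 BPM.
        "tempo_bpm": round(60 + (arousal + 1) / 2 * 120),
        # Positive valence -> major mode, negative -> minor.
        "mode": "major" if valence >= 0 else "minor",
        # Arousal also scales dynamics (MIDI velocity, 40-110).
        "velocity": round(40 + (arousal + 1) / 2 * 70),
    }


if __name__ == "__main__":
    for word in EMOTION_LEXICON:
        print(word, emotion_to_music_params(word))
```

Machine-learning approaches replace this hand-written lexicon and mapping with learned models. The sketch also makes the paper's central concern visible: a single emotion word collapses a whole region of affective space onto one coordinate.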
Related papers
- Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation [59.81482518924723]
We propose a method for capturing and generating subtle shifts in emotion intensity for talking-head generation.
We develop a talking-head framework that is capable of generating a variety of emotions with precise control over intensity levels.
Experiments and analyses validate the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-09-29T01:02:01Z)
- A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding.
We investigated, analyzed, and tested recent large-scale music foundation models with respect to their music comprehension abilities.
arXiv Detail & Related papers (2024-09-15T03:34:14Z)
- Emotion Manipulation Through Music -- A Deep Learning Interactive Visual Approach [0.0]
We introduce a novel way to manipulate the emotional content of a song using AI tools.
Our goal is to achieve the desired emotion while leaving the original melody as intact as possible.
This research may contribute to on-demand custom music generation, the automated remixing of existing work, and music playlists tuned for emotional progression.
arXiv Detail & Related papers (2024-06-12T20:12:29Z)
- Think out Loud: Emotion Deducing Explanation in Dialogues [57.90554323226896]
We propose a new task, "Emotion Deducing Explanation in Dialogues" (EDEN).
EDEN recognizes emotions and their causes through explicit reasoning.
It can help Large Language Models (LLMs) achieve better recognition of emotions and causes.
arXiv Detail & Related papers (2024-06-07T08:58:29Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
Causal Emotion Entailment (CEE) aims to identify the causal utterances in a conversation that stimulate the emotions expressed in a target utterance.
Current works in CEE mainly focus on modeling semantic and emotional interactions in conversations.
We introduce a step-by-step reasoning method, Emotion-Cause Reasoning Chain (ECR-Chain), to infer the stimulus from the target emotional expressions in conversations.
arXiv Detail & Related papers (2024-05-17T15:45:08Z)
- Exploring and Applying Audio-Based Sentiment Analysis in Music [0.0]
The ability of computational models to interpret musical emotions is largely unexplored.
This study seeks to (1) predict the emotion of a musical clip over time and (2) predict the next emotion value in the time series so that transitions are seamless.
arXiv Detail & Related papers (2024-02-22T22:34:06Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- REMAST: Real-time Emotion-based Music Arrangement with Soft Transition [29.34094293561448]
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.
We propose REMAST to achieve real-time emotional fit and smooth transitions simultaneously.
According to the evaluation results, REMAST surpasses the state-of-the-art methods in objective and subjective metrics.
arXiv Detail & Related papers (2023-05-14T00:09:48Z)
- Musical Prosody-Driven Emotion Classification: Interpreting Vocalists Portrayal of Emotions Through Machine Learning [0.0]
The role of musical prosody remains under-explored despite several studies demonstrating a strong connection between prosody and emotion.
In this study, we restrict the input of traditional machine learning algorithms to the features of musical prosody.
We utilize a methodology of individual data collection from vocalists, with ground-truth labeling performed by the artists themselves.
arXiv Detail & Related papers (2021-06-04T15:40:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.