E-ffective: A Visual Analytic System for Exploring the Emotion and
Effectiveness of Inspirational Speeches
- URL: http://arxiv.org/abs/2110.14908v2
- Date: Fri, 29 Oct 2021 04:03:41 GMT
- Title: E-ffective: A Visual Analytic System for Exploring the Emotion and
Effectiveness of Inspirational Speeches
- Authors: Kevin Maher, Zeyuan Huang, Jiancheng Song, Xiaoming Deng, Yu-Kun Lai,
Cuixia Ma, Hao Wang, Yong-Jin Liu, Hongan Wang
- Abstract summary: E-ffective is a visual analytic system that allows speaking experts and novices to analyze both the role of speech factors and their contribution to effective speeches.
Two novel visualizations include E-spiral (which shows the emotional shifts in speeches in a visually compact way) and E-script (which connects speech content with key speech delivery information).
- Score: 57.279044079196105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: What makes speeches effective has long been a subject of debate, and
to this day there is broad controversy among public speaking experts about which
factors make a speech effective and what roles these factors play. Moreover,
there is a lack of quantitative analysis methods to help understand effective
speaking strategies. In this paper, we propose E-ffective, a visual analytic
system that allows speaking experts and novices to analyze both the role of
speech factors and their contribution to effective speeches. From interviews
with domain experts and a review of the existing literature, we identified
important factors to consider in inspirational speeches. We extracted these
factors from multi-modal data and then related them to effectiveness data. Our
system supports rapid understanding of the critical factors in inspirational
speeches, including the influence of emotions, by means of novel visualization
methods and interactions. The two novel visualizations are E-spiral (which shows
the emotional shifts in a speech in a visually compact way) and E-script (which
connects speech content with key speech delivery information). In our
evaluation, we studied the influence of our system on experts' domain knowledge
about speech factors. We further studied the usability of the system with
speaking novices and experts in assisting the analysis of inspirational speech
effectiveness.
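For readers who want a concrete picture of the E-spiral idea, the sketch below lays per-sentence emotion scores along a spiral so that emotional shifts stay visible in a compact footprint. It is a minimal illustration assuming a single valence score per sentence; the function name `plot_emotion_spiral` and the toy data are placeholders, not the E-ffective implementation.

```python
# Illustrative sketch only: an E-spiral-inspired plot that places a speech's
# per-sentence emotion scores along an Archimedean spiral. The function name
# and the toy valence values are assumptions, not the authors' code.
import numpy as np
import matplotlib.pyplot as plt

def plot_emotion_spiral(valence, turns=3):
    """Place one marker per sentence along a spiral, colored by valence in [-1, 1]."""
    n = len(valence)
    theta = np.linspace(0.0, 2 * np.pi * turns, n)   # angular position per sentence
    radius = 1.0 + theta / (2 * np.pi)               # radius grows with each turn
    x, y = radius * np.cos(theta), radius * np.sin(theta)

    fig, ax = plt.subplots(figsize=(5, 5))
    ax.plot(x, y, color="lightgray", linewidth=1)    # spiral backbone (speech order)
    sc = ax.scatter(x, y, c=valence, cmap="coolwarm", vmin=-1, vmax=1, s=60)
    fig.colorbar(sc, ax=ax, label="valence")
    ax.set_aspect("equal")
    ax.axis("off")
    return fig

if __name__ == "__main__":
    # Toy valence trajectory: neutral opening, a dip, then an uplifting finish.
    toy_valence = np.concatenate([np.zeros(10), -0.6 * np.ones(8), np.linspace(-0.2, 0.9, 12)])
    plot_emotion_spiral(toy_valence, turns=2).savefig("e_spiral_sketch.png", dpi=150)
```

Reading the spiral from the centre outwards follows the speech in order, so abrupt colour changes correspond to emotional shifts.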
Related papers
- SemEval-2024 Task 3: Multimodal Emotion Cause Analysis in Conversations [53.60993109543582]
SemEval-2024 Task 3, named Multimodal Emotion Cause Analysis in Conversations, aims at extracting all pairs of emotions and their corresponding causes from conversations.
Under different modality settings, it consists of two subtasks: Textual Emotion-Cause Pair Extraction in Conversations (TECPE) and Multimodal Emotion-Cause Pair Extraction in Conversations (MECPE).
In this paper, we introduce the task, dataset and evaluation settings, summarize the systems of the top teams, and discuss the findings of the participants.
arXiv Detail & Related papers (2024-05-19T09:59:00Z)
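As a rough illustration of the kind of output the task above asks for, the sketch below defines a minimal emotion-cause pair structure; the class and field names are assumptions for illustration, not the official Task 3 data schema.

```python
# Minimal sketch of an emotion-cause pair extracted from a conversation.
# Class and field names are illustrative assumptions, not the official schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Utterance:
    index: int      # position of the utterance in the conversation
    speaker: str
    text: str

@dataclass
class EmotionCausePair:
    emotion_utterance: Utterance      # utterance expressing the emotion
    emotion: str                      # e.g. "joy", "anger", "sadness"
    cause_utterance: Utterance        # utterance containing the cause
    cause_span: Optional[str] = None  # textual cause span (textual subtask only)

pair = EmotionCausePair(
    emotion_utterance=Utterance(3, "A", "I can't believe we won!"),
    emotion="joy",
    cause_utterance=Utterance(2, "B", "The committee accepted our proposal."),
    cause_span="The committee accepted our proposal.",
)
```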
- Multiscale Contextual Learning for Speech Emotion Recognition in Emergency Call Center Conversations [4.297070083645049]
This paper presents a multi-scale conversational context learning approach for speech emotion recognition.
We investigated this approach on both speech transcriptions and acoustic segments.
According to our tests, the context derived from previous tokens has a more significant influence on accurate prediction than the context derived from the following tokens.
arXiv Detail & Related papers (2023-08-28T20:31:45Z)
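A minimal sketch of the general left-context idea described above (not the paper's method): concatenate the preceding utterances at several scales before classifying the current one. `classify_emotion` is a placeholder for any text emotion classifier.

```python
# Sketch of the general idea only: build left-context windows of the preceding
# utterances at several scales and classify the current utterance once per scale.
from typing import Callable, List, Sequence

def left_context_inputs(transcripts: Sequence[str], index: int,
                        scales: Sequence[int] = (1, 3, 5)) -> List[str]:
    """Return the current utterance prefixed by its k preceding utterances, for each scale k."""
    inputs = []
    for k in scales:
        start = max(0, index - k)
        context = " ".join(transcripts[start:index])  # preceding utterances only
        inputs.append((context + " " + transcripts[index]).strip())
    return inputs

def predict_with_context(transcripts: Sequence[str], index: int,
                         classify_emotion: Callable[[str], str]) -> List[str]:
    # One prediction per context scale; a real system would fuse these predictions.
    return [classify_emotion(text) for text in left_context_inputs(transcripts, index)]

# Example with a trivial stand-in classifier:
calls = ["Fire department, what's your emergency?", "My kitchen is on fire!", "Please hurry."]
print(predict_with_context(calls, 2, lambda text: "fear" if "fire" in text else "neutral"))
```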
- AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect Transfer for Speech Synthesis [13.918119853846838]
Affect is an emotional characteristic encompassing valence, arousal, and intensity, and is a crucial attribute for enabling authentic conversations.
We propose AffectEcho, an emotion translation model that uses a Vector Quantized codebook to model emotions within a quantized space.
We demonstrate the effectiveness of our approach in controlling the emotions of generated speech while preserving identity, style, and emotional cadence unique to each speaker.
arXiv Detail & Related papers (2023-08-16T06:28:29Z)
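To make the quantized-emotion idea above concrete, here is a minimal vector-quantization sketch: a continuous emotion embedding is snapped to its nearest entry in a learnable codebook. The codebook size, dimensionality, and class name are assumptions, not AffectEcho's actual architecture.

```python
# Illustrative sketch of vector-quantizing an emotion embedding against a
# learned codebook via nearest-neighbour lookup. Sizes are arbitrary assumptions.
import torch

class EmotionVectorQuantizer(torch.nn.Module):
    def __init__(self, num_codes: int = 64, dim: int = 128):
        super().__init__()
        self.codebook = torch.nn.Embedding(num_codes, dim)  # learnable emotion codes

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) continuous emotion embedding
        distances = torch.cdist(z, self.codebook.weight)    # (batch, num_codes)
        indices = distances.argmin(dim=1)                   # nearest code per example
        quantized = self.codebook(indices)                  # (batch, dim)
        # Straight-through estimator so gradients still reach the encoder.
        return z + (quantized - z).detach()

vq = EmotionVectorQuantizer()
emotion_embedding = torch.randn(4, 128)
quantized = vq(emotion_embedding)   # (4, 128), snapped to codebook entries
```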
- Deep Learning of Segment-level Feature Representation for Speech Emotion Recognition in Conversations [9.432208348863336]
We propose a conversational speech emotion recognition method to deal with capturing attentive contextual dependency and speaker-sensitive interactions.
First, we use a pretrained VGGish model to extract segment-based audio representation in individual utterances.
Second, an attentive bi-directional gated recurrent unit (GRU) models context-sensitive information and explores intra- and inter-speaker dependencies jointly.
arXiv Detail & Related papers (2023-02-05T16:15:46Z)
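The entry above names a concrete pipeline (segment-level VGGish embeddings followed by an attentive bi-directional GRU). The sketch below shows one plausible minimal form of such a model in PyTorch; all layer sizes and the single-utterance classifier head are assumptions rather than the paper's exact architecture.

```python
# Minimal sketch (not the paper's model): a bi-directional GRU with additive
# attention pooling over per-segment audio embeddings such as 128-d VGGish features.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, input_dim: int = 128, hidden_dim: int = 64, num_emotions: int = 4):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)          # scores each time step
        self.classifier = nn.Linear(2 * hidden_dim, num_emotions)

    def forward(self, segments: torch.Tensor) -> torch.Tensor:
        # segments: (batch, num_segments, input_dim) segment-level embeddings
        states, _ = self.gru(segments)                     # (batch, T, 2*hidden)
        weights = torch.softmax(self.attn(states), dim=1)  # (batch, T, 1)
        pooled = (weights * states).sum(dim=1)             # attention-weighted summary
        return self.classifier(pooled)                     # emotion logits

model = AttentiveBiGRU()
vggish_segments = torch.randn(2, 10, 128)  # 2 utterances, 10 segments each
logits = model(vggish_segments)            # (2, 4)
```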
- Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues [40.80696210030204]
We propose MOSES, a deep neural network, which takes a multimodal (sarcastic) dialogue instance as an input and generates a natural language sentence as its explanation.
We leverage the generated explanation for various natural language understanding tasks in a conversational dialogue setup, such as sarcasm detection, humour identification, and emotion recognition.
Our evaluation shows that MOSES outperforms the state-of-the-art system for SED by an average of 2% on different evaluation metrics.
arXiv Detail & Related papers (2022-11-20T18:05:43Z)
- Social Influence Dialogue Systems: A Scoping Survey of the Efforts Towards Influence Capabilities of Dialogue Systems [50.57882213439553]
Social influence dialogue systems are capable of persuasion, negotiation, and therapy.
There exists no formal definition or category for dialogue systems with these skills.
This study serves as a comprehensive reference for social influence dialogue systems to inspire more dedicated research and discussion in this emerging area.
arXiv Detail & Related papers (2022-10-11T17:57:23Z)
- Deep Learning for Visual Speech Analysis: A Survey [54.53032361204449]
This paper presents a review of recent progress in deep learning methods on visual speech analysis.
We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance.
arXiv Detail & Related papers (2022-05-22T14:44:53Z)
- An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation [57.68765353264689]
Speech enhancement and speech separation are two related tasks.
Traditionally, these tasks have been tackled using signal processing and machine learning techniques.
More recently, deep learning has been exploited to achieve strong performance.
arXiv Detail & Related papers (2020-08-21T17:24:09Z)
- "Notic My Speech" -- Blending Speech Patterns With Multimedia [65.91370924641862]
We propose a view-temporal attention mechanism to model both the view dependence and the visemic importance in speech recognition and understanding.
Our proposed method outperformed the existing work by 4.99% in terms of the viseme error rate.
We show that there is a strong correlation between our model's understanding of multi-view speech and human perception.
arXiv Detail & Related papers (2020-06-12T06:51:55Z)