Revise, Reason, and Recognize: LLM-Based Emotion Recognition via Emotion-Specific Prompts and ASR Error Correction
- URL: http://arxiv.org/abs/2409.15551v1
- Date: Mon, 23 Sep 2024 21:07:06 GMT
- Title: Revise, Reason, and Recognize: LLM-Based Emotion Recognition via Emotion-Specific Prompts and ASR Error Correction
- Authors: Yuanchao Li, Yuan Gong, Chao-Han Huck Yang, Peter Bell, Catherine Lai,
- Abstract summary: We propose novel prompts that incorporate emotion-specific knowledge from acoustics, linguistics, and psychology.
Experiments on context-aware learning, in-context learning, and instruction tuning are performed to examine the usefulness of LLM training schemes.
Our study aims to refine the use of LLMs in emotion recognition and related domains.
- Score: 31.677026213735363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Annotating and recognizing speech emotion using prompt engineering has recently emerged with the advancement of Large Language Models (LLMs), yet its efficacy and reliability remain questionable. In this paper, we conduct a systematic study on this topic, beginning with the proposal of novel prompts that incorporate emotion-specific knowledge from acoustics, linguistics, and psychology. Subsequently, we examine the effectiveness of LLM-based prompting on Automatic Speech Recognition (ASR) transcription, contrasting it with ground-truth transcription. Furthermore, we propose a Revise-Reason-Recognize prompting pipeline for robust LLM-based emotion recognition from spoken language with ASR errors. Additionally, experiments on context-aware learning, in-context learning, and instruction tuning are performed to examine the usefulness of LLM training schemes in this direction. Finally, we investigate the sensitivity of LLMs to minor prompt variations. Experimental results demonstrate the efficacy of the emotion-specific prompts, ASR error correction, and LLM training schemes for LLM-based emotion recognition. Our study aims to refine the use of LLMs in emotion recognition and related domains.
Related papers
- Exploring Knowledge Tracing in Tutor-Student Dialogues [53.52699766206808]
We present a first attempt at performing knowledge tracing (KT) in tutor-student dialogues.
We propose methods to identify the knowledge components/skills involved in each dialogue turn.
We then apply a range of KT methods on the resulting labeled data to track student knowledge levels over an entire dialogue.
arXiv Detail & Related papers (2024-09-24T22:31:39Z) - EmotionQueen: A Benchmark for Evaluating Empathy of Large Language Models [41.699045246349385]
This paper presents a novel framework named EmotionQueen for evaluating the emotional intelligence of large language models (LLMs)
The framework includes four distinctive tasks: Key Event Recognition, Mixed Event Recognition, Implicit Emotional Recognition, and Intention Recognition.
Experiments yield significant conclusions about LLMs' capabilities and limitations in emotion intelligence.
arXiv Detail & Related papers (2024-09-20T09:44:51Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Beyond Silent Letters: Amplifying LLMs in Emotion Recognition with Vocal Nuances [3.396456345114466]
We propose SpeechCueLLM, a method that translates speech characteristics into natural language descriptions.
We evaluate SpeechCueLLM on two datasets: IEMOCAP and MELD, showing significant improvements in emotion recognition accuracy.
arXiv Detail & Related papers (2024-07-31T03:53:14Z) - Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z) - The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLM)
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z) - Detecting LLM-Assisted Writing in Scientific Communication: Are We There Yet? [2.894383634912343]
Large Language Models (LLMs) have significantly reshaped text generation, particularly in the realm of writing assistance.
A potential avenue to encourage accurate acknowledging of LLM-assisted writing involves employing automated detectors.
Our evaluation of four cutting-edge LLM-generated text detectors reveals their suboptimal performance compared to a simple ad-hoc detector.
arXiv Detail & Related papers (2024-01-30T08:07:28Z) - Towards ASR Robust Spoken Language Understanding Through In-Context
Learning With Word Confusion Networks [68.79880423713597]
We introduce a method that utilizes the ASR system's lattice output instead of relying solely on the top hypothesis.
Our in-context learning experiments, covering spoken question answering and intent classification, underline the LLM's resilience to noisy speech transcripts.
arXiv Detail & Related papers (2024-01-05T17:58:10Z) - Customising General Large Language Models for Specialised Emotion
Recognition Tasks [24.822342337306363]
We investigate how large language models (LLMs) perform in linguistic emotion recognition.
Specifically, we exemplify a publicly available and widely used LLM -- Chat General Language Model.
We customise it for our target by using two different modal adaptation techniques, i.e., deep prompt tuning and low-rank adaptation.
The experimental results show that the adapted LLM can easily outperform other state-of-the-art but specialised deep models.
arXiv Detail & Related papers (2023-10-22T08:09:13Z) - Large Language Models Understand and Can be Enhanced by Emotional
Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z) - Exploring the Integration of Large Language Models into Automatic Speech
Recognition Systems: An Empirical Study [0.0]
This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems.
Our primary focus is to investigate the potential of using an LLM's in-context learning capabilities to enhance the performance of ASR systems.
arXiv Detail & Related papers (2023-07-13T02:31:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.