BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks
- URL: http://arxiv.org/abs/2407.04279v1
- Date: Fri, 5 Jul 2024 06:25:34 GMT
- Title: BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks
- Authors: Jieying Xue, Minh Phuong Nguyen, Blake Matheny, Le Minh Nguyen
- Abstract summary: This work introduces a novel framework named BiosERC, which investigates speaker characteristics in a conversation.
By employing Large Language Models (LLMs), we extract the "biographical information" of the speaker within a conversation.
Our proposed method achieved state-of-the-art (SOTA) results on three well-known benchmark datasets.
- Score: 2.9873893715462176
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the Emotion Recognition in Conversation task, recent investigations have utilized attention mechanisms that explore relationships among utterances, both intra- and inter-speaker, to model the emotional interaction between speakers. However, attributes such as speaker personality traits remain unexplored and present challenges in terms of their applicability to other tasks or compatibility with diverse model architectures. Therefore, this work introduces a novel framework named BiosERC, which investigates speaker characteristics in a conversation. By employing Large Language Models (LLMs), we extract the "biographical information" of each speaker in a conversation and inject it into the model as supplementary knowledge for classifying the emotional label of each utterance. Our proposed method achieved state-of-the-art (SOTA) results on three well-known benchmark datasets: IEMOCAP, MELD, and EmoryNLP, demonstrating the effectiveness and generalization of our model and showcasing its potential for adaptation to various conversation analysis tasks. Our source code is available at https://github.com/yingjie7/BiosERC.
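To make the pipeline concrete, here is a minimal sketch of the two-step idea described in the abstract: prompt an LLM for a short speaker biography, then inject that biography into the input of an utterance-level emotion classifier. The prompt wording, the `llm` callable, and the `[BIO]`/`[UTT]` markup are illustrative assumptions, not the authors' implementation (see the repository above for that).

```python
# Hedged sketch of the BiosERC idea, assuming a generic text-generation API.
from typing import Callable, Dict, List, Tuple

Dialogue = List[Tuple[str, str]]  # (speaker, utterance) pairs

def extract_speaker_bio(llm: Callable[[str], str], dialogue: Dialogue, speaker: str) -> str:
    """Step 1: prompt the LLM for a short biography/personality sketch of one speaker."""
    transcript = "\n".join(f"{s}: {u}" for s, u in dialogue)
    prompt = (f"Conversation:\n{transcript}\n\n"
              f"Briefly describe {speaker}'s personality and background "
              f"as suggested by this conversation.")
    return llm(prompt)

def build_classifier_inputs(llm: Callable[[str], str], dialogue: Dialogue) -> List[str]:
    """Step 2: inject each speaker's biography into the utterance-level classifier input."""
    bios: Dict[str, str] = {s: extract_speaker_bio(llm, dialogue, s)
                            for s in {s for s, _ in dialogue}}
    return [f"[BIO] {bios[speaker]} [UTT] {speaker}: {utterance}"
            for speaker, utterance in dialogue]
```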
Related papers
- Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition [54.952250732643115]
We study Acoustic Word Embeddings (AWEs), fixed-length features derived from continuous representations, to explore their advantages in specific tasks.
AWEs have previously shown utility in capturing acoustic discriminability.
Our findings underscore the acoustic context conveyed by AWEs and showcase highly competitive Speech Emotion Recognition accuracy.
arXiv Detail & Related papers (2024-02-04T21:24:54Z)
- InstructERC: Reforming Emotion Recognition in Conversation with Multi-task Retrieval-Augmented Large Language Models [9.611864685207056]
We propose a novel approach, InstructERC, to reformulate the emotion recognition task from a discriminative framework to a generative framework based on Large Language Models (LLMs).
InstructERC makes three significant contributions: (1) it introduces a simple yet effective retrieval template module, which helps the model explicitly integrate multi-granularity dialogue supervision information; (2) it adds two emotion alignment tasks, namely speaker identification and emotion prediction, to implicitly model the dialogue role relationships and future emotional tendencies in conversations; and (3) it is the first to unify emotion labels across benchmarks through the feeling wheel to fit real application scenarios.
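A hedged sketch of what such a discriminative-to-generative reformulation can look like: rather than predicting a label index, the LLM is prompted to generate the emotion word. The template wording and the retrieved-demonstration format below are illustrative assumptions, not InstructERC's exact retrieval template module.

```python
from typing import List, Tuple

def build_generative_prompt(history: List[str],
                            target_utterance: str,
                            demonstrations: List[Tuple[str, str]]) -> str:
    """Assemble an instruction prompt from dialogue context plus retrieved examples."""
    demo_block = "\n".join(f"Utterance: {u}\nEmotion: {e}" for u, e in demonstrations)
    context = "\n".join(history)
    return (
        "Identify the emotion of the last utterance in the dialogue.\n\n"
        f"Retrieved examples:\n{demo_block}\n\n"
        f"Dialogue:\n{context}\n"
        f"Last utterance: {target_utterance}\n"
        "Emotion:"  # the model generates the label as free text, e.g. "joy"
    )
```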
arXiv Detail & Related papers (2023-09-21T09:22:07Z)
- Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion Recognition in Conversation With Emotion Disentanglement [8.17164107060944]
Emotion Recognition in Conversation (ERC) has attracted widespread attention in the natural language processing field.
Existing ERC methods face challenges in achieving generalization to diverse scenarios due to insufficient modeling of context.
We present a Hybrid Continuous Attributive Network (HCAN) to address these issues from the perspective of emotional continuation and emotional attribution.
arXiv Detail & Related papers (2023-09-18T14:18:16Z)
- 'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges [65.03196674816772]
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee.
Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarification Exchanges (CEs): a Clarification Request (CR) and a response.
Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models.
arXiv Detail & Related papers (2023-07-28T13:44:33Z)
- Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z)
- SI-LSTM: Speaker Hybrid Long-short Term Memory and Cross Modal Attention for Emotion Recognition in Conversation [16.505046191280634]
Emotion Recognition in Conversation (ERC) is of vital importance for a variety of applications, including intelligent healthcare, artificial intelligence for conversation, and opinion mining over chat history.
The crux of ERC is to model both cross-modality and cross-time interactions throughout the conversation.
Previous methods have made progress in learning the time-series information of a conversation but lack the ability to trace the distinct emotional states of each speaker.
arXiv Detail & Related papers (2023-05-04T10:13:15Z)
- Deep Emotion Recognition in Textual Conversations: A Survey [0.8602553195689513]
New applications and implementation scenarios present novel challenges and opportunities.
These range from leveraging the conversational context and modelling speaker and emotion dynamics to interpreting common-sense expressions. This survey also emphasizes the advantage of leveraging techniques that address unbalanced data.
arXiv Detail & Related papers (2022-11-16T19:42:31Z)
- Discourse-Aware Emotion Cause Extraction in Conversations [21.05202596080196]
Emotion Cause Extraction in Conversations (ECEC) aims to extract the utterances that contain the emotional cause in a conversation.
We propose a discourse-aware model (DAM) for this task.
Results on the benchmark corpus show that DAM outperforms the state-of-the-art (SOTA) systems in the literature.
arXiv Detail & Related papers (2022-10-26T02:11:01Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- TSAM: A Two-Stream Attention Model for Causal Emotion Entailment [50.07800752967995]
Causal Emotion Entailment (CEE) aims to discover the potential causes behind an emotion in a conversational utterance.
We classify multiple utterances synchronously to capture the correlations between utterances from a global perspective.
We propose a Two-Stream Attention Model (TSAM) to effectively model the speaker's emotional influences in the conversational history.
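To illustrate the two-stream idea, here is a hedged sketch in which one attention stream attends over same-speaker history and the other over other-speaker history, with the two views fused per utterance. The module, shapes, and fusion layer are generic illustrative assumptions, not the authors' TSAM code.

```python
import torch
import torch.nn as nn

class TwoStreamAttention(nn.Module):
    """Illustrative two-stream attention: intra-speaker vs. inter-speaker views."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)
        self.heads = heads

    def forward(self, utt: torch.Tensor, speaker_ids: torch.Tensor) -> torch.Tensor:
        # utt: (batch, seq, dim) utterance embeddings; speaker_ids: (batch, seq)
        same = speaker_ids.unsqueeze(2) == speaker_ids.unsqueeze(1)  # (batch, seq, seq)
        same = same.repeat_interleave(self.heads, dim=0)             # (batch*heads, seq, seq)
        # In PyTorch, attn_mask == True means "may NOT attend".
        intra_out, _ = self.intra(utt, utt, utt, attn_mask=~same)    # same-speaker stream
        inter_out, _ = self.inter(utt, utt, utt, attn_mask=same)     # other-speaker stream
        # Assumes every dialogue has >= 2 speakers; a fully masked row would yield NaNs.
        return self.fuse(torch.cat([intra_out, inter_out], dim=-1))
```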
arXiv Detail & Related papers (2022-03-02T02:11:41Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, pre-trained language models (PrLMs) used as encoders represent the dialogues coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.