Emotion-Aware Embedding Fusion in LLMs (Flan-T5, LLAMA 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation
- URL: http://arxiv.org/abs/2410.01306v2
- Date: Tue, 11 Mar 2025 10:08:37 GMT
- Title: Emotion-Aware Embedding Fusion in LLMs (Flan-T5, LLAMA 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation
- Authors: Abdur Rasool, Muhammad Irfan Shahzad, Hafsa Aslam, Vincent Chan, Muhammad Ali Arshad
- Abstract summary: This study addresses the challenge of enhancing the emotional and contextual understanding of large language models (LLMs) in psychiatric applications. We introduce Emotion-Aware Embedding Fusion, a novel framework integrating hierarchical fusion and attention mechanisms. The system can be integrated into existing mental health platforms to generate personalized responses based on retrieved therapy session data.
- Score: 0.5454121013433086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empathetic and coherent responses are critical in automated chatbot-facilitated psychotherapy. This study addresses the challenge of enhancing the emotional and contextual understanding of large language models (LLMs) in psychiatric applications. We introduce Emotion-Aware Embedding Fusion, a novel framework integrating hierarchical fusion and attention mechanisms to prioritize semantic and emotional features in therapy transcripts. Our approach combines multiple emotion lexicons, including the NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as Flan-T5, LLAMA 2, DeepSeek-R1, and ChatGPT 4. Therapy session transcripts, comprising over 2,000 samples, are segmented into hierarchical levels (word, sentence, and session) using neural networks, while hierarchical fusion combines these features with pooling techniques to refine emotional representations. Attention mechanisms, including multi-head self-attention and cross-attention, further prioritize emotional and contextual features, enabling temporal modeling of emotional shifts across sessions. The processed embeddings, computed using BERT, GPT-3, and RoBERTa, are stored in a Facebook AI Similarity Search (FAISS) vector database, which enables efficient similarity search and clustering across dense vector spaces. Upon user queries, relevant segments are retrieved and provided as context to the LLMs, enhancing their ability to generate empathetic and contextually relevant responses. The proposed framework is evaluated across multiple practical use cases to demonstrate real-world applicability, including AI-driven therapy chatbots. The system can be integrated into existing mental health platforms to generate personalized responses based on retrieved therapy session data.
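To make the retrieval step above concrete, the following is a minimal sketch (not the authors' code) of embedding transcript segments, indexing them with FAISS, and retrieving the nearest segments as context for an LLM prompt; the embedding model and sample segments are illustrative assumptions standing in for the paper's BERT/GPT-3/RoBERTa embeddings.

```python
# Minimal retrieval-augmented prompting sketch: embed segments, index with FAISS, retrieve context.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # stand-in for the paper's embedding models
segments = [
    "Client reports feeling anxious before work meetings.",
    "Therapist introduces a breathing exercise for panic symptoms.",
    "Client describes improved sleep after reducing caffeine.",
]

embeddings = encoder.encode(segments, normalize_embeddings=True).astype("float32")
index = faiss.IndexFlatIP(embeddings.shape[1])       # inner product == cosine on normalized vectors
index.add(embeddings)

query = "How can I calm down before a stressful meeting?"
query_vec = encoder.encode([query], normalize_embeddings=True).astype("float32")
scores, ids = index.search(query_vec, 2)             # top-2 most similar segments

context = "\n".join(segments[i] for i in ids[0])
prompt = f"Relevant session excerpts:\n{context}\n\nUser: {query}\nRespond empathetically:"
print(prompt)                                        # this prompt would then be passed to the LLM
```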
Related papers
- GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations [35.63053777817013]
GatedxLSTM is a novel multimodal Emotion Recognition in Conversation (ERC) model.
It considers voice and transcripts of both the speaker and their conversational partner to identify the most influential sentences driving emotional shifts.
It achieves state-of-the-art (SOTA) performance among open-source methods in four-class emotion classification.
arXiv Detail & Related papers (2025-03-26T18:46:18Z) - Script-Strategy Aligned Generation: Aligning LLMs with Expert-Crafted Dialogue Scripts and Therapeutic Strategies for Psychotherapy [17.07905574770501]
Current systems rely on rigid, rule-based designs, heavily dependent on expert-crafted scripts for guiding therapeutic conversations.
Recent advances in large language models (LLMs) offer the potential for more flexible interactions, but lack controllability and transparency.
We propose Script-Strategy Aligned Generation (SSAG), a flexible alignment approach that reduces reliance on fully scripted content.
arXiv Detail & Related papers (2024-11-11T05:14:14Z) - Towards Empathetic Conversational Recommender Systems [77.53167131692]
We propose an empathetic conversational recommender (ECR) framework.
ECR contains two main modules: emotion-aware item recommendation and emotion-aligned response generation.
Our experiments on the ReDial dataset validate the efficacy of our framework in enhancing recommendation accuracy and improving user satisfaction.
arXiv Detail & Related papers (2024-08-30T15:43:07Z) - Are Large Language Models Possible to Conduct Cognitive Behavioral Therapy? [13.0263170692984]
The capabilities of large language models (LLMs) have been validated, providing new possibilities for psychological assistance therapy.
Many concerns have been raised by mental health experts regarding the use of LLMs for therapy.
Four LLM variants with excellent performance on natural language processing are evaluated.
arXiv Detail & Related papers (2024-07-25T03:01:47Z) - APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation [71.26755736617478]
Empathetic response generation aims to comprehend the emotions of others.
We develop a framework that combines retrieval augmentation and emotional support strategy integration.
Our framework can enhance the empathy ability of LLMs from both cognitive and affective empathy perspectives.
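As an illustration of the general pattern in this entry, the sketch below combines retrieved exemplars with an emotion-conditioned support strategy; the strategy list, retrieval stub, and prompt format are placeholder assumptions, not the APTNESS implementation.

```python
# Sketch: retrieval augmentation plus an emotion-support strategy for prompt construction.
from typing import List

SUPPORT_STRATEGIES = {
    "sadness": "Reflection of feelings",
    "fear": "Providing reassurance",
    "anger": "Validation and affirmation",
}

def retrieve_exemplars(user_text: str, k: int = 2) -> List[str]:
    # Placeholder: in practice this would query a vector index of empathetic responses.
    corpus = [
        "That sounds really heavy; it makes sense you feel drained.",
        "You're not alone in this, and it's okay to take things slowly.",
    ]
    return corpus[:k]

def build_prompt(user_text: str, emotion: str) -> str:
    strategy = SUPPORT_STRATEGIES.get(emotion, "Active listening")
    exemplars = "\n".join(f"- {e}" for e in retrieve_exemplars(user_text))
    return (
        f"Detected emotion: {emotion}\n"
        f"Support strategy to apply: {strategy}\n"
        f"Retrieved exemplar responses:\n{exemplars}\n\n"
        f"User: {user_text}\nEmpathetic reply:"
    )

print(build_prompt("I keep worrying something bad will happen.", "fear"))
```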
arXiv Detail & Related papers (2024-07-23T02:23:37Z) - Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z) - Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue [21.621844911228315]
This study introduces the first framework designed to engender empathetic dialogue with validating responses.
Our approach incorporates a tripartite module system: 1) validation timing detection, 2) users' emotional state identification, and 3) validating response generation.
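A minimal sketch of such a tripartite pipeline follows, with placeholder heuristics standing in for the paper's trained modules; the keyword rules and templates are illustrative assumptions only.

```python
# Sketch of a three-stage validating-response pipeline: timing -> emotion -> response.
def validation_timing(user_turn: str) -> bool:
    # Placeholder heuristic: validate when the user expresses distress keywords.
    return any(w in user_turn.lower() for w in ("stressed", "sad", "anxious", "overwhelmed"))

def identify_emotion(user_turn: str) -> str:
    # Placeholder: a real system would use a trained emotion classifier here.
    return "anxiety" if "anxious" in user_turn.lower() else "sadness"

def validating_response(emotion: str) -> str:
    templates = {
        "anxiety": "It makes sense that you feel anxious; that is a lot to carry.",
        "sadness": "I'm sorry you're going through this; your feelings are completely valid.",
    }
    return templates.get(emotion, "What you're feeling is understandable.")

turn = "I've been so anxious about everything lately."
if validation_timing(turn):
    print(validating_response(identify_emotion(turn)))
```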
arXiv Detail & Related papers (2024-02-20T07:20:03Z) - Chain of Empathy: Enhancing Empathetic Response of Large Language Models Based on Psychotherapy Models [2.679689033125693]
We present a novel method, the Chain of Empathy (CoE) prompting, that utilizes insights from psychotherapy to induce Large Language Models (LLMs) to reason about human emotional states.
This method is inspired by various psychotherapy approaches, including Cognitive Behavioral Therapy (CBT), Dialectical Behavior Therapy (DBT), Person-Centered Therapy (PCT), and Reality Therapy (RT).
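An illustrative Chain-of-Empathy-style prompt, following the CBT variant named above; the wording is an assumption for demonstration, not the authors' released prompt.

```python
# Sketch of a CBT-flavored Chain-of-Empathy prompt template.
COE_CBT_PROMPT = """You are an empathetic counsellor.
Client message: "{client_message}"

Reason step by step before replying:
1. What emotion is the client likely feeling, and how intense is it?
2. What thought or appraisal might be causing this emotion (CBT perspective)?
3. Is that appraisal possibly a cognitive distortion? If so, which one?
4. Based on the above, write a brief empathetic response that acknowledges the emotion
   and gently offers a more balanced perspective.
"""

print(COE_CBT_PROMPT.format(client_message="I failed one exam, so I'm clearly useless."))
```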
arXiv Detail & Related papers (2023-11-02T02:21:39Z) - Harnessing Large Language Models' Empathetic Response Generation Capabilities for Online Mental Health Counselling Support [1.9336815376402723]
Large Language Models (LLMs) have demonstrated remarkable performance across various information-seeking and reasoning tasks.
This study sought to examine LLMs' capability to generate empathetic responses in conversations that emulate those in a mental health counselling setting.
We selected five LLMs: versions 3.5 and 4 of the Generative Pre-trained Transformer (GPT), Vicuna FastChat-T5, Pathways Language Model (PaLM) version 2, and Falcon-7B-Instruct.
arXiv Detail & Related papers (2023-10-12T03:33:06Z) - Chat2Brain: A Method for Mapping Open-Ended Semantic Queries to Brain Activation Maps [59.648646222905235]
We propose a method called Chat2Brain that combines LLMs with a basic text-to-image model, known as Text2Brain, to map semantic queries to brain activation maps.
We demonstrate that Chat2Brain can synthesize plausible neural activation patterns for more complex tasks of text queries.
arXiv Detail & Related papers (2023-09-10T13:06:45Z) - Building Emotional Support Chatbots in the Era of LLMs [64.06811786616471]
We introduce an innovative methodology that synthesizes human insights with the computational prowess of Large Language Models (LLMs).
By utilizing the in-context learning potential of ChatGPT, we generate an ExTensible Emotional Support dialogue dataset, named ExTES.
Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions.
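A hedged sketch of one common "advanced tuning technique" (LoRA via the peft library) applied to a LLaMA-family model; the checkpoint, target modules, and hyperparameters below are assumptions for illustration, not the paper's configuration.

```python
# Sketch: parameter-efficient LoRA tuning setup for a LLaMA-family model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"                      # assumed checkpoint (requires access)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],               # adapt only the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()                     # only a small fraction of weights are trained

# The training loop itself (e.g., transformers.Trainer over ExTES-style dialogues) is omitted here.
```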
arXiv Detail & Related papers (2023-08-17T10:49:18Z) - EmotionIC: emotional inertia and contagion-driven dependency modeling for emotion recognition in conversation [34.24557248359872]
We propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for ERC task.
Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF).
Experimental results show that our method can significantly outperform the state-of-the-art models on four benchmark datasets.
arXiv Detail & Related papers (2023-03-20T13:58:35Z) - A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
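A minimal sketch of layer-wise and temporal aggregation over SSL hidden states; the learnable layer weights and mean pooling here are generic choices illustrating the idea, not necessarily the paper's modules.

```python
# Sketch: weight SSL (e.g., wav2vec 2.0) hidden layers, then pool over time.
import torch
import torch.nn as nn

class LayerwiseTemporalPool(nn.Module):
    def __init__(self, num_layers: int, dim: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))  # learnable per-layer weights
        self.proj = nn.Linear(dim, dim)

    def forward(self, hidden_states):            # list of (B, T, D) tensors, one per SSL layer
        stacked = torch.stack(hidden_states, dim=0)               # (L, B, T, D)
        w = torch.softmax(self.layer_weights, dim=0).view(-1, 1, 1, 1)
        layer_mix = (w * stacked).sum(dim=0)                      # layer-wise aggregation
        pooled = layer_mix.mean(dim=1)                            # temporal mean pooling
        return self.proj(pooled)                                  # (B, D) utterance embedding

# Usage with dummy data: 13 layers, batch 2, 50 frames, 768-dim features.
pool = LayerwiseTemporalPool(num_layers=13, dim=768)
feats = [torch.randn(2, 50, 768) for _ in range(13)]
print(pool(feats).shape)  # torch.Size([2, 768])
```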
arXiv Detail & Related papers (2023-03-14T16:08:45Z) - MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild [56.61912265155151]
We propose MAFW, a large-scale compound affective database with 10,045 video-audio clips in the wild.
Each clip is annotated with a compound emotional category and a couple of sentences that describe the subjects' affective behaviors in the clip.
For the compound emotion annotation, each clip is categorized into one or more of the 11 widely-used emotions, i.e., anger, disgust, fear, happiness, neutral, sadness, surprise, contempt, anxiety, helplessness, and disappointment.
arXiv Detail & Related papers (2022-08-01T13:34:33Z) - Multimodal Emotion Recognition with High-level Speech and Text Features [8.141157362639182]
We propose a novel cross-representation speech model to perform emotion recognition on wav2vec 2.0 speech features.
We also train a CNN-based model to recognize emotions from text features extracted with Transformer-based models.
Our method is evaluated on the IEMOCAP dataset in a 4-class classification problem.
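A sketch of extracting wav2vec 2.0 speech features and Transformer-based text features as inputs to an emotion classifier; the model names and the simple concatenation fusion are assumptions, not the paper's cross-representation and CNN models.

```python
# Sketch: wav2vec 2.0 speech features + RoBERTa text features for a 4-class emotion head.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model, AutoTokenizer, AutoModel

speech_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
speech_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
text_tokenizer = AutoTokenizer.from_pretrained("roberta-base")
text_encoder = AutoModel.from_pretrained("roberta-base")

waveform = torch.randn(16000)          # 1 s of dummy 16 kHz audio
transcript = "I feel much better today."

with torch.no_grad():
    speech_inputs = speech_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
    speech_feats = speech_encoder(**speech_inputs).last_hidden_state.mean(dim=1)   # (1, 768)
    text_inputs = text_tokenizer(transcript, return_tensors="pt")
    text_feats = text_encoder(**text_inputs).last_hidden_state[:, 0]               # (1, 768) <s> token

fused = torch.cat([speech_feats, text_feats], dim=-1)    # simple concatenation fusion (assumption)
logits = torch.nn.Linear(fused.size(-1), 4)(fused)       # 4-class head, matching the IEMOCAP setup
print(logits.shape)                                      # torch.Size([1, 4])
```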
arXiv Detail & Related papers (2021-09-29T07:08:40Z) - Emotion-aware Chat Machine: Automatic Emotional Response Generation for Human-like Emotional Interaction [55.47134146639492]
This article proposes a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
Experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art methods in terms of both content coherence and emotion appropriateness.
arXiv Detail & Related papers (2021-06-06T06:26:15Z) - Target Guided Emotion Aware Chat Machine [58.8346820846765]
The consistency of a response to a given post at semantic-level and emotional-level is essential for a dialogue system to deliver human-like interactions.
This article proposes a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
arXiv Detail & Related papers (2020-11-15T01:55:37Z) - Feature Fusion Strategies for End-to-End Evaluation of Cognitive Behavior Therapy Sessions [32.198800906972366]
We develop an end-to-end pipeline that converts speech audio into diarized, transcribed text in order to code Cognitive Behavioral Therapy (CBT) sessions automatically.
We propose a novel method to augment the word-based features with the utterance level tags for subsequent CBT code estimation.
arXiv Detail & Related papers (2020-05-15T22:26:58Z)
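A minimal sketch of augmenting word-level features with an utterance-level tag embedding, in the spirit of the fusion strategy above; the dimensions and tag set are assumptions for illustration.

```python
# Sketch: append an utterance-level tag embedding to every word-level feature vector.
import torch
import torch.nn as nn

NUM_TAGS, TAG_DIM, WORD_DIM = 8, 16, 300   # e.g., 8 assumed behaviour-code tags, 300-d word vectors

tag_embedding = nn.Embedding(NUM_TAGS, TAG_DIM)

def fuse(word_feats: torch.Tensor, utterance_tag: int) -> torch.Tensor:
    """Broadcast one utterance-level tag embedding across all word vectors and concatenate."""
    tag_vec = tag_embedding(torch.tensor(utterance_tag))           # (TAG_DIM,)
    tag_rep = tag_vec.unsqueeze(0).expand(word_feats.size(0), -1)  # (num_words, TAG_DIM)
    return torch.cat([word_feats, tag_rep], dim=-1)                # (num_words, WORD_DIM + TAG_DIM)

words = torch.randn(12, WORD_DIM)            # 12 words in one utterance
print(fuse(words, utterance_tag=3).shape)    # torch.Size([12, 316])
```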
This list is automatically generated from the titles and abstracts of the papers on this site.