Affective Multimodal Agents with Proactive Knowledge Grounding for Emotionally Aligned Marketing Dialogue
- URL: http://arxiv.org/abs/2511.21728v1
- Date: Fri, 21 Nov 2025 04:16:45 GMT
- Title: Affective Multimodal Agents with Proactive Knowledge Grounding for Emotionally Aligned Marketing Dialogue
- Authors: Lin Yu, Xiaofei Han, Yifei Kang, Chiung-Yi Tseng, Danyang Zhang, Ziqian Bi, Zhimo Han,
- Abstract summary: AffectMind is a multimodal affective dialogue agent that performs proactive reasoning and dynamic knowledge grounding to sustain emotionally aligned and persuasive interactions. Experiments show that AffectMind outperforms strong LLM-based baselines in emotional consistency, persuasive success rate, and long-term user engagement.
- Score: 3.780355670921318
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in large language models (LLMs) have enabled fluent dialogue systems, but most remain reactive and struggle in emotionally rich, goal-oriented settings such as marketing conversations. To address this limitation, we propose AffectMind, a multimodal affective dialogue agent that performs proactive reasoning and dynamic knowledge grounding to sustain emotionally aligned and persuasive interactions. AffectMind combines three components: a Proactive Knowledge Grounding Network (PKGN) that continuously updates factual and affective context from text, vision, and prosody; an Emotion-Intent Alignment Model (EIAM) that jointly models user emotion and purchase intent to adapt persuasion strategies; and a Reinforced Discourse Loop (RDL) that optimizes emotional coherence and engagement via reinforcement signals from user responses. Experiments on two newly curated marketing dialogue datasets, MM-ConvMarket and AffectPromo, show that AffectMind outperforms strong LLM-based baselines in emotional consistency (+26%), persuasive success rate (+19%), and long-term user engagement (+23%), highlighting emotion-grounded proactivity as a key capability for commercial multimodal agents.
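The paper does not include an implementation, but the three-component loop the abstract describes maps naturally onto a small amount of code. The sketch below is purely illustrative: every class, label set, and update rule (PKGN as a running context store, EIAM as a rule-based classifier, RDL as a bandit over persuasion strategies) is an assumption standing in for the learned models the authors describe, not their actual architecture.

```python
# Illustrative sketch of the AffectMind loop described in the abstract.
# All names, label sets, and update rules here are assumptions; the paper
# does not specify its implementation.
import random
from dataclasses import dataclass, field

EMOTIONS = ["positive", "neutral", "hesitant", "negative"]    # assumed labels
INTENTS = ["browsing", "comparing", "ready_to_buy"]           # assumed labels
STRATEGIES = ["inform", "reassure", "social_proof", "close"]  # assumed strategies

@dataclass
class PKGN:
    """Proactive Knowledge Grounding Network (stub): keeps a running
    factual/affective context from text, vision, and prosody cues."""
    context: dict = field(default_factory=dict)

    def update(self, text: str, vision_cue: str, prosody_cue: str) -> dict:
        self.context["last_utterance"] = text
        self.context["vision"] = vision_cue    # e.g. a facial-expression tag
        self.context["prosody"] = prosody_cue  # e.g. a tone tag
        return self.context

@dataclass
class EIAM:
    """Emotion-Intent Alignment Model (stub): jointly 'infers' user emotion
    and purchase intent; a toy rule stands in for a learned model."""
    def infer(self, context: dict) -> tuple[str, str]:
        emotion = "negative" if context.get("prosody") == "tense" else "positive"
        intent = "ready_to_buy" if "price" in context.get("last_utterance", "") else "browsing"
        return emotion, intent

@dataclass
class RDL:
    """Reinforced Discourse Loop (stub): bandit-style weights over strategies,
    nudged by a scalar engagement reward from the user's next response."""
    weights: dict = field(default_factory=lambda: {s: 1.0 for s in STRATEGIES})

    def pick(self, emotion: str, intent: str) -> str:
        if emotion == "negative":
            return "reassure"  # assumed hard rule for illustration
        total = sum(self.weights.values())
        r, acc = random.uniform(0, total), 0.0
        for strategy, w in self.weights.items():
            acc += w
            if r <= acc:
                return strategy
        return "inform"

    def reinforce(self, strategy: str, reward: float) -> None:
        self.weights[strategy] = max(0.1, self.weights[strategy] + 0.1 * reward)

# One simulated turn of the loop.
pkgn, eiam, rdl = PKGN(), EIAM(), RDL()
ctx = pkgn.update("what's the price on this?", vision_cue="smile", prosody_cue="calm")
emotion, intent = eiam.infer(ctx)
strategy = rdl.pick(emotion, intent)
rdl.reinforce(strategy, reward=+1.0)  # user replied positively
print(emotion, intent, strategy)
```

In a real system each stub would be a trained network; the point is only to show how grounding, emotion-intent alignment, and reinforcement feedback interact turn by turn.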
Related papers
- Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier [53.55996102181836]
We propose the Emotional Rationale Verifier (ERV) and an Explanation Reward. Our method guides the model to produce reasoning that is explicitly consistent with the target emotion. We show that our approach not only enhances alignment between explanation and prediction but also empowers MLLMs to deliver emotionally coherent, trustworthy interactions.
arXiv Detail & Related papers (2025-10-27T16:40:17Z)
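As a concrete reading of the entry above, an "explanation reward" can be sketched as a verifier that tags the emotion a rationale expresses and rewards agreement with the target label. The keyword-counting verifier and the ±1 reward below are invented placeholders, not the ERV paper's method.

```python
# Toy sketch of an "explanation reward" in the spirit of ERV: score a generated
# rationale by whether the emotion it expresses matches the target emotion.
# The keyword classifier and reward values are placeholders.

CUES = {
    "joy": ["happy", "delighted", "smiling"],
    "anger": ["furious", "annoyed", "frowning"],
    "sadness": ["tearful", "down", "crying"],
}

def rationale_emotion(rationale: str) -> str:
    """Stub verifier: tag the rationale with the emotion whose cues it mentions most."""
    text = rationale.lower()
    counts = {emo: sum(text.count(w) for w in words) for emo, words in CUES.items()}
    return max(counts, key=counts.get)

def explanation_reward(rationale: str, target_emotion: str) -> float:
    """+1 if the rationale's expressed emotion matches the target, else -1."""
    return 1.0 if rationale_emotion(rationale) == target_emotion else -1.0

print(explanation_reward("She is smiling and looks happy.", "joy"))  # 1.0
print(explanation_reward("He seems furious at the delay.", "joy"))   # -1.0
```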
- EvoEmo: Towards Evolved Emotional Policies for Adversarial LLM Agents in Multi-Turn Price Negotiation [61.627248012799704]
Existing Large Language Model (LLM) agents largely overlook the functional role of emotions in such negotiations. We present EvoEmo, an evolutionary reinforcement learning framework that optimizes dynamic emotional expression in negotiations.
arXiv Detail & Related papers (2025-09-04T15:23:58Z)
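A minimal reading of the entry above: treat a per-turn emotion sequence as the policy and search over it with mutation and elitist selection. The emotion set, turn count, and toy fitness function below are all assumptions; the paper's fitness signal comes from simulated negotiations, not a hand-written rule.

```python
# Minimal sketch of evolving an emotion-expression policy, in the spirit of
# EvoEmo. A "policy" is a fixed per-turn emotion sequence, and the fitness
# function is a toy stand-in for a simulated negotiation outcome.
import random

EMOTIONS = ["neutral", "warm", "firm", "displeased"]  # assumed action set
TURNS = 5

def toy_fitness(policy: list[str]) -> float:
    """Placeholder payoff: reward opening warm and closing firm."""
    score = 0.0
    score += 1.0 if policy[0] == "warm" else 0.0
    score += 1.0 if policy[-1] == "firm" else 0.0
    score -= 0.5 * policy.count("displeased")  # antagonism is penalized
    return score

def mutate(policy: list[str]) -> list[str]:
    child = policy[:]
    child[random.randrange(TURNS)] = random.choice(EMOTIONS)
    return child

population = [[random.choice(EMOTIONS) for _ in range(TURNS)] for _ in range(20)]
for generation in range(30):
    population.sort(key=toy_fitness, reverse=True)
    elite = population[:5]  # keep the best policies
    population = elite + [mutate(random.choice(elite)) for _ in range(15)]

print(max(population, key=toy_fitness))
```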
- EmoDebt: Bayesian-Optimized Emotional Intelligence for Strategic Agent-to-Agent Debt Recovery [65.30120701878582]
Large Language Model (LLM) agents are vulnerable to exploitation in emotion-sensitive domains like debt collection. EmoDebt is an emotional intelligence engine that reframes a model's ability to express emotion in negotiation as a sequential decision-making problem. EmoDebt achieves significant strategic robustness, substantially outperforming non-adaptive and emotion-agnostic baselines.
arXiv Detail & Related papers (2025-03-27T01:41:34Z)
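The entry above frames emotion expression as sequential decision-making under Bayesian optimization; a Beta-Bernoulli Thompson-sampling bandit is one simple stand-in for that idea. The action set and the simulated debtor below are invented for illustration and are not the paper's actual optimizer.

```python
# Sketch of emotion expression as sequential decision-making, loosely in the
# spirit of EmoDebt: a Thompson-sampling (Beta-Bernoulli) bandit chooses which
# emotion to express each round and updates on whether the debtor cooperated.
import random

ACTIONS = ["empathetic", "neutral", "stern"]  # assumed emotion actions
posterior = {a: [1.0, 1.0] for a in ACTIONS}  # Beta(alpha, beta) priors

def simulate_debtor(action: str) -> bool:
    """Toy environment: sternness works slightly better in this fake setting."""
    p = {"empathetic": 0.3, "neutral": 0.4, "stern": 0.5}[action]
    return random.random() < p

for round_ in range(200):
    # Thompson sampling: draw from each posterior, act greedily on the draws.
    draws = {a: random.betavariate(*posterior[a]) for a in ACTIONS}
    action = max(draws, key=draws.get)
    if simulate_debtor(action):
        posterior[action][0] += 1  # success -> alpha += 1
    else:
        posterior[action][1] += 1  # failure -> beta += 1

print({a: round(ab[0] / sum(ab), 2) for a, ab in posterior.items()})
```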
- From Personas to Talks: Revisiting the Impact of Personas on LLM-Synthesized Emotional Support Conversations [34.426199139914615]
Large Language Models (LLMs) have revolutionized the generation of emotional support conversations. This paper explores the role of personas in the creation of emotional support conversations.
arXiv Detail & Related papers (2025-02-17T05:24:30Z)
- AV-EmoDialog: Chat with Audio-Visual Users Leveraging Emotional Cues [37.96886343501444]
We present AV-EmoDialog, a dialogue system designed to exploit verbal and non-verbal information from users' audio-visual inputs to generate more responsive and empathetic interactions. AV-EmoDialog systematically exploits the emotional cues in audio-visual dialogues: extracting speech content and emotional tones from speech, analyzing fine-grained facial expressions from visuals, and integrating these cues to generate emotionally aware responses in an end-to-end manner.
arXiv Detail & Related papers (2024-12-23T05:24:26Z)
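One way to picture the cue integration described above is as the fusion of three per-turn signals into a single conditioning input. The tag vocabulary and prompt format below are invented; the actual AV-EmoDialog system is trained end-to-end rather than assembled from prompts.

```python
# Sketch of the cue-fusion idea in AV-EmoDialog: combine transcribed speech,
# a vocal-tone tag, and a facial-expression tag into one conditioning input
# for the response generator. The tags and format are invented here.

def fuse_cues(transcript: str, vocal_tone: str, facial_expression: str) -> str:
    """Build a single emotion-annotated input for a downstream generator."""
    return (
        f"User said: {transcript}\n"
        f"Vocal tone: {vocal_tone}\n"
        f"Facial expression: {facial_expression}\n"
        f"Respond empathetically, matching the user's emotional state."
    )

prompt = fuse_cues(
    transcript="I'm not sure this plan is right for me.",
    vocal_tone="hesitant",
    facial_expression="furrowed brow",
)
print(prompt)  # would be passed to an LLM / trained decoder
```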
- SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent [27.301608019492043]
Large Language Models (LLMs) have demonstrated promising potential in providing empathetic support during interactions. We propose an innovative strategy-enhanced role-playing framework, designed to simulate authentic emotional support conversations. Within this framework, we develop the ServeForEmo dataset, comprising an extensive collection of 3.7K+ multi-turn dialogues and 62.8K+ utterances.
arXiv Detail & Related papers (2024-12-11T13:56:04Z)
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
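The rewrite-and-augment step described above can be sketched as hindsight relabeling: pair each logged, suboptimal agent turn with a regenerated alternative and emit both as reward-labeled examples for offline RL. The stub rewriter and the reward values below are assumptions; in the paper the rewriter is itself an LLM and rewards come from outcome annotations.

```python
# Sketch of the hindsight-regeneration idea: take a logged, suboptimal dialogue
# turn, pair it with a rewritten (better) agent utterance, and emit both as
# reward-labeled examples for offline RL.

def regenerate(agent_utterance: str) -> str:
    """Stub 'hindsight' rewriter standing in for an LLM rewrite."""
    return agent_utterance + " Could you tell me more about what you're looking for?"

logged_turn = {
    "context": "User: I'm just browsing.",
    "action": "Agent: Okay.",  # suboptimal: elicits nothing
    "reward": 0.0,
}

# Augment the dataset: the original turn keeps its low reward; the regenerated
# alternative is added with an (assumed) higher reward for the same context.
offline_dataset = [
    logged_turn,
    {"context": logged_turn["context"],
     "action": regenerate(logged_turn["action"]),
     "reward": 1.0},
]

for example in offline_dataset:
    print(example["reward"], "|", example["action"])
```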
- Building Emotional Support Chatbots in the Era of LLMs [64.06811786616471]
We introduce an innovative methodology that synthesizes human insights with the computational prowess of Large Language Models (LLMs).
By utilizing the in-context learning potential of ChatGPT, we generate an ExTensible Emotional Support dialogue dataset, named ExTES.
Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions.
arXiv Detail & Related papers (2023-08-17T10:49:18Z)
- Facilitating Multi-turn Emotional Support Conversation with Positive Emotion Elicitation: A Reinforcement Learning Approach [58.88422314998018]
Emotional support conversation (ESC) aims to provide emotional support (ES) to improve one's mental state.
Existing works focus on fitting grounded responses and response strategies, ignoring their effect on ES and lacking explicit goals to guide positive emotional transitions.
We introduce a new paradigm to formalize multi-turn ESC as a process of positive emotion elicitation.
arXiv Detail & Related papers (2023-07-16T09:58:44Z)
- Large Language Models Understand and Can be Enhanced by Emotional Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z)
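The mechanism above is simple enough to show directly: append an emotional stimulus sentence to an otherwise ordinary task prompt. "This is very important to my career." is one of the stimuli reported in the EmotionPrompt work; the wrapper function itself is illustrative.

```python
# Sketch of the EmotionPrompt idea: append an emotional stimulus sentence to an
# ordinary task prompt before sending it to the LLM.

STIMULUS = "This is very important to my career."  # stimulus reported in the paper

def emotion_prompt(task: str, stimulus: str = STIMULUS) -> str:
    return f"{task} {stimulus}"

base = "Summarize the following review in one sentence: '...'"
print(emotion_prompt(base))
# The augmented prompt is sent to the LLM in place of the base prompt.
```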
- Towards Multi-Turn Empathetic Dialogs with Positive Emotion Elicitation [39.747587984500406]
This paper presents a novel task of empathetic dialog generation with positive emotion elicitation.
The agent produces empathetic responses while also aiming to elicit the user's positive emotions over the multi-turn dialog.
We collect a large-scale emotional dialog dataset with positive emotion elicitation, called PosEmoDial.
arXiv Detail & Related papers (2022-04-22T05:32:08Z)