Related papers: Mechanistic Interpretability of Emotion Inference in Large Language Models

Mechanistic Interpretability of Emotion Inference in Large Language Models

URL: http://arxiv.org/abs/2502.05489v1
Date: Sat, 08 Feb 2025 08:11:37 GMT
Title: Mechanistic Interpretability of Emotion Inference in Large Language Models
Authors: Ala N. Tak, Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, Jonathan Gratch,
Abstract summary: We show that emotion representations are functionally localized to specific regions in large language models.<n>We draw on cognitive appraisal theory to show that emotions emerge from evaluations of environmental stimuli.<n>This work highlights a novel way to causally intervene and precisely shape emotional text generation.
Score: 16.42503362001602
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) show promising capabilities in predicting human emotions from text. However, the mechanisms through which these models process emotional stimuli remain largely unexplored. Our study addresses this gap by investigating how autoregressive LLMs infer emotions, showing that emotion representations are functionally localized to specific regions in the model. Our evaluation includes diverse model families and sizes and is supported by robustness checks. We then show that the identified representations are psychologically plausible by drawing on cognitive appraisal theory, a well-established psychological framework positing that emotions emerge from evaluations (appraisals) of environmental stimuli. By causally intervening on construed appraisal concepts, we steer the generation and show that the outputs align with theoretical and intuitive expectations. This work highlights a novel way to causally intervene and precisely shape emotional text generation, potentially benefiting safety and alignment in sensitive affective domains.

Related papers

AI shares emotion with humans across languages and cultures [12.530921452568291]
We assess human-AI emotional alignment across linguistic-cultural groups and model-families.<n>Our analyses reveal that LLM-derived emotion spaces are structurally congruent with human perception.<n>We show that model expressions can be stably and naturally modulated across distinct emotion categories.
arXiv Detail & Related papers (2025-06-11T14:42:30Z)
Narrative-Centered Emotional Reflection: Scaffolding Autonomous Emotional Literacy with AI [0.0]
Reflexion is an AI-powered platform designed to enable structured emotional self-reflection at scale. System scaffolds a progressive journey from surface-level emotional recognition toward value-aligned action planning.
arXiv Detail & Related papers (2025-04-29T01:24:46Z)
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework for disentangling identity with emotion and cooperating emotions with similar characteristics. We develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention. Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks. Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
arXiv Detail & Related papers (2025-04-25T05:28:21Z)
"Only ChatGPT gets me": An Empirical Analysis of GPT versus other Large Language Models for Emotion Detection in Text [2.6012482282204004]
This work investigates the capabilities of large language models (LLMs) in detecting and understanding human emotions through text. By employing a methodology that involves comparisons with a state-of-the-art model on the GoEmotions dataset, we aim to gauge LLMs' effectiveness as a system for emotional analysis.
arXiv Detail & Related papers (2025-03-05T09:47:49Z)
MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis [53.012111671763776]
This study introduces MEMO-Bench, a comprehensive benchmark consisting of 7,145 portraits, each depicting one of six different emotions. Results demonstrate that existing T2I models are more effective at generating positive emotions than negative ones. Although MLLMs show a certain degree of effectiveness in distinguishing and recognizing human emotions, they fall short of human-level accuracy.
arXiv Detail & Related papers (2024-11-18T02:09:48Z)
ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
Causal Emotion Entailment (CEE) aims to identify the causal utterances in a conversation that stimulate the emotions expressed in a target utterance. Current works in CEE mainly focus on modeling semantic and emotional interactions in conversations. We introduce a step-by-step reasoning method, Emotion-Cause Reasoning Chain (ECR-Chain), to infer the stimulus from the target emotional expressions in conversations.
arXiv Detail & Related papers (2024-05-17T15:45:08Z)
Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks. We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents [0.40964539027092906]
This study tests the capabilities of large language models to solve emotional intelligence tasks and to simulate emotions. It presents and evaluates a new chain-of-emotion architecture for emotion simulation within video games, based on psychological appraisal research.
arXiv Detail & Related papers (2023-09-10T16:55:49Z)
Language-Specific Representation of Emotion-Concept Knowledge Causally Supports Emotion Inference [44.126681295827794]
This study used a form of artificial intelligence known as large language models (LLMs) to assess whether language-based representations of emotion causally contribute to the AI's ability to generate inferences about the emotional meaning of novel situations. Our findings provide a proof-in-concept that even a LLM can learn about emotions in the absence of sensory-motor representations and highlight the contribution of language-derived emotion-concept knowledge for emotion inference.
arXiv Detail & Related papers (2023-02-19T14:21:33Z)
A Circular-Structured Representation for Visual Emotion Distribution Learning [82.89776298753661]
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning. To be specific, we first construct an Emotion Circle to unify any emotional state within it. On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
arXiv Detail & Related papers (2021-06-23T14:53:27Z)
Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions. Our framework integrates a contextualized embedding encoder with a multi-head probing model. Our model is evaluated on the Empathetic Dialogue dataset and shows the state-of-the-art result for classifying 32 emotions.
arXiv Detail & Related papers (2021-04-20T16:55:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.