Pose-based Body Language Recognition for Emotion and Psychiatric Symptom
Interpretation
- URL: http://arxiv.org/abs/2011.00043v1
- Date: Fri, 30 Oct 2020 18:45:16 GMT
- Title: Pose-based Body Language Recognition for Emotion and Psychiatric Symptom
Interpretation
- Authors: Zhengyuan Yang, Amanda Kay, Yuncheng Li, Wendi Cross, Jiebo Luo
- Abstract summary: We propose an automated framework for body language based emotion recognition starting from regular RGB videos.
In collaboration with psychologists, we extend the framework for psychiatric symptom prediction.
Because a specific application domain of the proposed framework may only supply a limited amount of data, the framework is designed to work on a small training set.
- Score: 75.3147962600095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by the human ability to infer emotions from body language, we
propose an automated framework for body language based emotion recognition
starting from regular RGB videos. In collaboration with psychologists, we
further extend the framework for psychiatric symptom prediction. Because a
specific application domain of the proposed framework may only supply a limited
amount of data, the framework is designed to work on a small training set and
possess a good transferability. The proposed system in the first stage
generates sequences of body language predictions based on human poses estimated
from input videos. In the second stage, the predicted sequences are fed into a
temporal network for emotion interpretation and psychiatric symptom prediction.
We first validate the accuracy and transferability of the proposed body
language recognition method on several public action recognition datasets. We
then evaluate the framework on a proposed URMC dataset, which consists of
conversations between a standardized patient and a behavioral health
professional, along with expert annotations of body language, emotions, and
potential psychiatric symptoms. The proposed framework outperforms other
methods on the URMC dataset.
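The abstract describes a two-stage design: stage one turns estimated poses into a sequence of body language predictions, and stage two feeds that sequence into a temporal model for emotion interpretation. The following is a minimal sketch of that data flow only; the label sets, the random stand-in classifiers, and the histogram readout are all hypothetical placeholders, not the paper's actual models.

```python
import numpy as np

# Hypothetical vocabularies; the paper's real labels come from expert
# annotations on the URMC dataset.
BODY_LANGUAGE_LABELS = ["fidgeting", "arms_crossed", "leaning_forward", "still"]
EMOTION_LABELS = ["anxious", "calm", "agitated"]

def predict_body_language(pose_seq, rng):
    """Stage 1 (stand-in): map each per-frame pose to a body language label.
    A real system would run a trained classifier on estimated keypoints."""
    # pose_seq: (T, J, 2) array of J 2-D joints per frame
    scores = rng.random((len(pose_seq), len(BODY_LANGUAGE_LABELS)))
    return scores.argmax(axis=1)  # (T,) label indices, one per frame

def interpret_emotion(label_seq):
    """Stage 2 (stand-in): summarize the label sequence over time.
    The paper uses a temporal network; here we use a label histogram
    and a toy linear readout."""
    hist = np.bincount(label_seq, minlength=len(BODY_LANGUAGE_LABELS))
    readout = np.ones((len(BODY_LANGUAGE_LABELS), len(EMOTION_LABELS)))
    return EMOTION_LABELS[int((hist @ readout).argmax())]

rng = np.random.default_rng(0)
poses = rng.random((30, 17, 2))  # 30 frames, 17 COCO-style joints
labels = predict_body_language(poses, rng)
emotion = interpret_emotion(labels)
print(labels.shape, emotion)
```

The point of the sketch is the interface between the stages: stage two never sees raw video or poses, only the discrete body language sequence, which is what makes the first stage transferable across application domains.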
Related papers
- Graph Modelling Analysis of Speech-Gesture Interaction for Aphasia Severity Estimation [0.0]
Aphasia is an acquired language disorder caused by injury to the regions of the brain that are responsible for language.
Recent advancements in speech analysis focus on automated estimation of aphasia severity from spontaneous speech.
In this work, we propose a graph neural network-based framework for estimating aphasia severity.
arXiv Detail & Related papers (2026-01-27T14:11:36Z) - E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis [54.763420895859035]
We present E^2-LLM (EEG-to-Emotion Large Language Model), the first MLLM framework for interpretable emotion analysis from EEG.
E^2-LLM integrates a pretrained EEG encoder with Q-based LLMs through learnable projection layers, employing a multi-stage training pipeline.
Experiments on the dataset across seven emotion categories demonstrate that E^2-LLM achieves excellent performance on emotion classification.
arXiv Detail & Related papers (2026-01-11T13:21:20Z) - Interpretable Neuropsychiatric Diagnosis via Concept-Guided Graph Neural Networks [56.75602443936853]
One in five adolescents currently lives with a diagnosed mental or behavioral health condition, such as anxiety, depression, or conduct disorder.
While prior works use graph neural network (GNN) approaches for disorder prediction, they remain black boxes, limiting their reliability and clinical translation.
In this work, we propose a concept-based diagnosis framework that encodes interpretable functional connectivity concepts.
Our design grounds predictions in clinically meaningful connectivity patterns, enabling both interpretability and strong predictive performance.
arXiv Detail & Related papers (2025-10-02T19:38:46Z) - E-THER: A Multimodal Dataset for Empathic AI - Towards Emotional Mismatch Awareness [3.8298581733964903]
E-THER is the first Person-Centered Therapy-grounded multimodal dataset with multidimensional annotations for verbal-visual incongruence detection.
We show that our incongruence-trained models outperform general-purpose models on critical traits.
arXiv Detail & Related papers (2025-09-02T08:58:32Z) - Emotion Recognition from Skeleton Data: A Comprehensive Survey [13.443333210819555]
Emotion recognition through body movements has emerged as a compelling and privacy-preserving alternative to traditional methods.
Recent advancements in 3D skeleton acquisition technologies and pose estimation algorithms have significantly enhanced the feasibility of emotion recognition based on full-body motion.
arXiv Detail & Related papers (2025-07-24T01:58:57Z) - Early Detection of Mental Health Issues Using Social Media Posts [0.0]
Social media platforms, like Reddit, represent a rich source of user-generated content.
We propose a multi-modal deep learning framework that integrates linguistic and temporal features for early detection of mental health crises.
arXiv Detail & Related papers (2025-03-06T23:08:08Z) - A Multimodal Emotion Recognition System: Integrating Facial Expressions, Body Movement, Speech, and Spoken Language [0.0]
This work presents a multimodal emotion recognition system that provides a standardised, objective, and data-driven tool to support evaluators.
The system integrates recognition of facial expressions, speech, spoken language, and body movement analysis to capture subtle emotional cues that are often overlooked in human evaluations.
arXiv Detail & Related papers (2024-12-23T19:00:34Z) - EEG Emotion Copilot: Pruning LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation [13.048477440429195]
This paper presents the EEG Emotion Copilot, a system leveraging a lightweight large language model (LLM) operating in a local setting.
The system is designed to first recognize emotional states directly from EEG signals, subsequently generate personalized diagnostic and treatment suggestions.
Privacy concerns are also addressed, with a focus on ethical data collection, processing, and the protection of users' personal information.
arXiv Detail & Related papers (2024-09-30T19:15:05Z) - Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z) - Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG).
MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner.
We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z) - Acknowledgment of Emotional States: Generating Validating Responses for
Empathetic Dialogue [21.621844911228315]
This study introduces the first framework designed to engender empathetic dialogue with validating responses.
Our approach incorporates a tripartite module system: 1) validation timing detection, 2) users' emotional state identification, and 3) validating response generation.
arXiv Detail & Related papers (2024-02-20T07:20:03Z) - KNSE: A Knowledge-aware Natural Language Inference Framework for
Dialogue Symptom Status Recognition [69.78432481474572]
We propose a novel framework called KNSE for symptom status recognition (SSR).
For each mentioned symptom in a dialogue window, we first generate knowledge about the symptom and hypothesis about status of the symptom, to form a (premise, knowledge, hypothesis) triplet.
The BERT model is then used to encode the triplet, which is further processed by modules including utterance aggregation, self-attention, cross-attention, and GRU to predict the symptom status.
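The KNSE summary above describes forming a (premise, knowledge, hypothesis) triplet per symptom before BERT encoding. The sketch below only illustrates how such a triplet might be assembled from a dialogue window; the helper name, the placeholder knowledge text, and the status phrasing are assumptions, not the paper's actual templates.

```python
def build_knse_triplet(dialogue_window, symptom, status):
    """Assemble a hypothetical (premise, knowledge, hypothesis) triplet
    for one symptom mentioned in a dialogue window, in the spirit of KNSE."""
    premise = " ".join(dialogue_window)  # the raw dialogue utterances
    # Placeholder knowledge text; KNSE generates real knowledge about the symptom.
    knowledge = f"{symptom} is a symptom a patient may report during consultation."
    hypothesis = f"The status of the symptom '{symptom}' is {status}."
    return premise, knowledge, hypothesis

window = [
    "Doctor: Any headaches lately?",
    "Patient: Yes, almost every morning.",
]
triplet = build_knse_triplet(window, "headache", "confirmed")
print(triplet[2])
```

In the actual framework, the three strings would be jointly encoded by BERT and passed through the aggregation and attention modules to classify the symptom's status.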
arXiv Detail & Related papers (2023-05-26T11:23:26Z) - Bodily Behaviors in Social Interaction: Novel Annotations and
State-of-the-Art Evaluation [0.0]
We present BBSI, the first set of annotations of complex Bodily Behaviors embedded in continuous Social Interactions.
Based on previous work in psychology, we manually annotated 26 hours of spontaneous human behavior.
We adapt the Pyramid Dilated Attention Network (PDAN), a state-of-the-art approach for human action detection.
arXiv Detail & Related papers (2022-07-26T11:24:00Z) - Learning Personal Representations from fMRIby Predicting Neurofeedback
Performance [52.77024349608834]
We present a deep neural network method for learning a personal representation for individuals performing a self-neuromodulation task, guided by functional MRI (fMRI).
The representation is learned by a self-supervised recurrent neural network that predicts the amygdala activity in the next fMRI frame given recent fMRI frames, conditioned on the learned individual representation.
arXiv Detail & Related papers (2021-12-06T10:16:54Z) - Automated Quality Assessment of Cognitive Behavioral Therapy Sessions
Through Highly Contextualized Language Representations [34.670548892766625]
A BERT-based model is proposed for automatic behavioral scoring of a specific type of psychotherapy, called Cognitive Behavioral Therapy (CBT).
The model is trained in a multi-task manner in order to achieve higher interpretability.
BERT-based representations are further augmented with available therapy metadata, providing relevant non-linguistic context and leading to consistent performance improvements.
arXiv Detail & Related papers (2021-02-23T09:22:29Z) - BiteNet: Bidirectional Temporal Encoder Network to Predict Medical
Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z) - Continuous Emotion Recognition via Deep Convolutional Autoencoder and
Support Vector Regressor [70.2226417364135]
It is crucial that the machine should be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information (including all listed content) and is not responsible for any consequences of its use.