A Hybrid Deep Learning Framework for Emotion Recognition in Children with Autism During NAO Robot-Mediated Interaction
- URL: http://arxiv.org/abs/2512.12208v1
- Date: Sat, 13 Dec 2025 06:40:01 GMT
- Title: A Hybrid Deep Learning Framework for Emotion Recognition in Children with Autism During NAO Robot-Mediated Interaction
- Authors: Indranil Bhattacharjee, Vartika Narayani Srinet, Anirudha Bhattacharjee, Braj Bhushan, Bishakh Bhattacharya
- Abstract summary: This study presents a novel deep learning pipeline for emotion recognition in autistic children in response to a name-calling event by a humanoid robot. The dataset comprises around 50,000 facial frames extracted from video recordings of 15 children with ASD.
- Score: 0.6524460254566904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding emotional responses in children with Autism Spectrum Disorder (ASD) during social interaction remains a critical challenge in both developmental psychology and human-robot interaction. This study presents a novel deep learning pipeline for emotion recognition in autistic children in response to a name-calling event by a humanoid robot (NAO), under controlled experimental settings. The dataset comprises around 50,000 facial frames extracted from video recordings of 15 children with ASD. A hybrid model combines a fine-tuned ResNet-50-based Convolutional Neural Network (CNN) and a three-layer Graph Convolutional Network (GCN), trained on both visual and geometric features extracted from MediaPipe FaceMesh landmarks. Emotions were probabilistically labeled using a weighted ensemble of two models, DeepFace and FER, each contributing to soft-label generation across seven emotion classes. Final classification leveraged a fused embedding optimized via Kullback-Leibler divergence. The proposed method demonstrates robust performance in modeling subtle affective responses and offers significant promise for affective profiling of children with ASD in clinical and therapeutic human-robot interaction contexts, as the pipeline effectively captures micro-emotional cues in neurodivergent children, addressing a major gap in autism-specific HRI research. This work represents the first such large-scale, real-world dataset and pipeline from India on autism-focused emotion analysis using social robotics, contributing an essential foundation for future personalized assistive technologies.
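The soft-labeling and Kullback-Leibler steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the class names and their order, the equal ensemble weight `w_a`, the helper names, and the example probabilities. It shows only the general technique (a weighted average of two probability vectors used as a target distribution, compared with a model's softmax output via KL divergence), not the authors' implementation.

```python
import math

# Hypothetical class order; the abstract only states that there are seven classes.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def normalize(p):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(p)
    return [x / total for x in p]

def ensemble_soft_label(p_a, p_b, w_a=0.5):
    """Weighted average of two per-class probability vectors.

    w_a is an assumed weight for the first model; the paper's actual
    weights are not given in the abstract.
    """
    a, b = normalize(p_a), normalize(p_b)
    mixed = [w_a * x + (1.0 - w_a) * y for x, y in zip(a, b)]
    return normalize(mixed)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); terms with zero target mass contribute nothing."""
    return sum(
        t * math.log((t + eps) / (q + eps))
        for t, q in zip(target, predicted)
        if t > 0
    )

# Made-up per-frame outputs from two labeling models (illustrative only).
p_model_a = [0.05, 0.00, 0.05, 0.60, 0.10, 0.10, 0.10]
p_model_b = [0.10, 0.05, 0.05, 0.40, 0.10, 0.10, 0.20]
soft_label = ensemble_soft_label(p_model_a, p_model_b, w_a=0.5)

# Made-up classifier logits for the same frame.
logits = [0.1, -1.0, -0.5, 2.0, 0.0, 0.2, 0.8]
loss = kl_divergence(soft_label, softmax(logits))
```

Training against a soft target in this way keeps the labelers' uncertainty in the loss instead of collapsing each frame to a single hard class, which is one plausible reason to prefer KL divergence over plain cross-entropy on hard labels.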
Related papers
- CAST-Phys: Contactless Affective States Through Physiological signals Database [74.28082880875368]
The lack of affective multi-modal datasets remains a major bottleneck in developing accurate emotion recognition systems. We present the Contactless Affective States Through Physiological Signals Database (CAST-Phys), a novel high-quality dataset capable of remote physiological emotion recognition. Our analysis highlights the crucial role of physiological signals in realistic scenarios where facial expressions alone may not provide sufficient emotional information.
arXiv Detail & Related papers (2025-07-08T15:20:24Z)
- Enhancing Autism Spectrum Disorder Early Detection with the Parent-Child Dyads Block-Play Protocol and an Attention-enhanced GCN-xLSTM Hybrid Deep Learning Framework [6.785167067600156]
This work proposes a novel Parent-Child Dyads Block-Play (PCB) protocol to identify behavioral patterns distinguishing ASD from typically developing toddlers.
We have compiled a substantial video dataset, featuring 40 ASD and 89 TD toddlers engaged in block play with parents.
This dataset exceeds previous efforts on both the scale of participants and the length of individual sessions.
arXiv Detail & Related papers (2024-08-29T21:53:01Z)
- Hybrid Models for Facial Emotion Recognition in Children [0.0]
This paper focuses on the use of emotion recognition techniques to assist psychologists in performing children's therapy through remotely operated robot sessions.
Embodied Conversational Agents (ECA) as an intermediary tool can help professionals connect with children who face social challenges.
arXiv Detail & Related papers (2023-08-24T04:20:20Z)
- A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
arXiv Detail & Related papers (2023-03-14T16:08:45Z)
- Vision-Based Activity Recognition in Children with Autism-Related Behaviors [15.915410623440874]
We demonstrate the effect of a region-based computer vision system to help clinicians and parents analyze a child's behavior.
The data is pre-processed by detecting the target child in the video to reduce the impact of background noise.
Motivated by the effectiveness of temporal convolutional models, we propose both light-weight and conventional models capable of extracting action features from video frames.
arXiv Detail & Related papers (2022-08-08T15:12:27Z)
- Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z)
- Exploring the pattern of Emotion in children with ASD as an early biomarker through Recurring-Convolution Neural Network (R-CNN) [0.0]
The paper identifies basic facial expressions and explores how the associated emotions vary over time.
Emotions are analyzed by combining facial expressions identified by a CNN, using 68 landmark points plotted on the frontal face, with an RNN-based prediction network; the combined system is called RCNN-FER.
arXiv Detail & Related papers (2021-12-30T09:35:05Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
- A Two-stage Multi-modal Affect Analysis Framework for Children with Autism Spectrum Disorder [3.029434408969759]
We present an open-source two-stage multi-modal approach leveraging acoustic and visual cues to predict three main affect states of children with ASD in real-world play therapy scenarios.
This work presents a novel way to combine human expertise and machine intelligence for ASD affect recognition by proposing a two-stage schema.
arXiv Detail & Related papers (2021-06-17T01:28:53Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine should be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.