Related papers: SynPAIN: A Synthetic Dataset of Pain and Non-Pain Facial Expressions

URL: http://arxiv.org/abs/2507.19673v2
Date: Fri, 01 Aug 2025 17:06:27 GMT
Title: SynPAIN: A Synthetic Dataset of Pain and Non-Pain Facial Expressions
Authors: Babak Taati, Muhammad Muzammil, Yasamin Zarghami, Abhishek Moturu, Amirhossein Kazerouni, Hailey Reimer, Alex Mihailidis, Thomas Hadjistavropoulos
Abstract summary: Existing pain detection datasets suffer from limited ethnic/racial diversity, privacy constraints, and underrepresentation of older adults. We present SynPAIN, a large-scale synthetic dataset containing 10,710 facial expression images. Using commercial generative AI tools, we created demographically balanced synthetic identities with clinically meaningful pain expressions.
Score: 3.0806468055954737
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate pain assessment in patients with limited ability to communicate, such as older adults with dementia, represents a critical healthcare challenge. Robust automated systems of pain detection may facilitate such assessments. Existing pain detection datasets, however, suffer from limited ethnic/racial diversity, privacy constraints, and underrepresentation of older adults who are the primary target population for clinical deployment. We present SynPAIN, a large-scale synthetic dataset containing 10,710 facial expression images (5,355 neutral/expressive pairs) across five ethnicities/races, two age groups (young: 20-35, old: 75+), and two genders. Using commercial generative AI tools, we created demographically balanced synthetic identities with clinically meaningful pain expressions. Our validation demonstrates that synthetic pain expressions exhibit expected pain patterns, scoring significantly higher than neutral and non-pain expressions using clinically validated pain assessment tools based on facial action unit analysis. We experimentally demonstrate SynPAIN's utility in identifying algorithmic bias in existing pain detection models. Through comprehensive bias evaluation, we reveal substantial performance disparities across demographic characteristics. These performance disparities were previously undetectable with smaller, less diverse datasets. Furthermore, we demonstrate that age-matched synthetic data augmentation improves pain detection performance on real clinical data, achieving a 7.0% improvement in average precision. SynPAIN addresses critical gaps in pain assessment research by providing the first publicly available, demographically diverse synthetic dataset specifically designed for older adult pain detection, while establishing a framework for measuring and mitigating algorithmic bias. The dataset is available at https://doi.org/10.5683/SP3/WCXMAP

Related papers

Improving Pain Classification using Spatio-Temporal Deep Learning Approaches with Facial Expressions [0.27309692684728604]
Pain management and severity detection are crucial for effective treatment. Traditional self-reporting methods are subjective and may be unsuitable for non-verbal individuals. We explore automated pain detection using facial expressions.
arXiv Detail & Related papers (2025-01-12T11:54:46Z)
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data [44.304022773272415]
We introduce SynFER, a novel framework for synthesizing facial expression image data based on high-level textual descriptions. We propose a semantic guidance technique to steer the generation process and a pseudo-label generator to help rectify the facial expression labels. Our approach achieves a 67.23% classification accuracy on AffectNet when training solely with synthetic data equivalent to the AffectNet training set size.
arXiv Detail & Related papers (2024-10-13T14:58:21Z)
Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information. A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction. The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z)
Automated facial recognition system using deep learning for pain assessment in adults with cerebral palsy [0.5242869847419834]
Existing measures, relying on direct observation by caregivers, lack sensitivity and specificity. Ten neural networks were trained on three pain image databases. InceptionV3 exhibited promising performance on the CP-PAIN dataset.
arXiv Detail & Related papers (2024-01-22T17:55:16Z)
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging [16.146223377936035]
We introduce the Adaptive Hierarchical Spatiotemporal Dynamic Image (AHDI) technique. AHDI encodes deep changes in facial videos into a single RGB image, permitting the application of simpler 2D models for video representation. Within this framework, we employ a residual network to derive generalized facial representations. These representations are optimized for two tasks: estimating pain intensity and differentiating between genuine and simulated pain expressions.
arXiv Detail & Related papers (2023-12-12T01:23:05Z)
Wearable-based Fair and Accurate Pain Assessment Using Multi-Attribute Fairness Loss in Convolutional Neural Networks [4.451479907610764]
The adoption of AI in clinical pain evaluation is hindered by challenges like personalization and fairness. Many AI models, including machine and deep learning, exhibit biases, discriminating against specific groups based on gender or ethnicity. This paper proposes a Multi-attribute Fairness Loss (MAFL) based Convolutional Neural Network (CNN) model designed to account for protected attributes in data.
arXiv Detail & Related papers (2023-07-03T09:21:36Z)
Pain level and pain-related behaviour classification using GRU-based sparsely-connected RNNs [61.080598804629375]
People with chronic pain unconsciously adapt specific body movements to protect themselves from injury or additional pain. Because there is no dedicated benchmark database to analyse this correlation, we considered one of the specific circumstances that potentially influence a person's biometrics during daily activities. We proposed an ensemble of sparsely-connected recurrent neural networks (s-RNNs) with gated recurrent units (GRUs) that incorporates multiple autoencoders. We conducted several experiments which indicate that the proposed method outperforms the state-of-the-art approaches in classifying both pain level and pain-related behaviour.
arXiv Detail & Related papers (2022-12-20T12:56:28Z)
Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records. We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data. We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
Intelligent Sight and Sound: A Chronic Cancer Pain Dataset [74.77784420691937]
This paper introduces the first chronic cancer pain dataset, collected as part of the Intelligent Sight and Sound (ISS) clinical trial. The data collected to date consists of 29 patients, 509 smartphone videos, 189,999 frames, and self-reported affective and activity pain scores. Using static images and multi-modal data to predict self-reported pain levels, early models show significant gaps between current methods available to predict pain.
arXiv Detail & Related papers (2022-04-07T22:14:37Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
