Synthetic Thermal and RGB Videos for Automatic Pain Assessment utilizing a Vision-MLP Architecture
- URL: http://arxiv.org/abs/2407.19811v1
- Date: Mon, 29 Jul 2024 09:04:11 GMT
- Title: Synthetic Thermal and RGB Videos for Automatic Pain Assessment utilizing a Vision-MLP Architecture
- Authors: Stefanos Gkikas, Manolis Tsiknakis
- Abstract summary: This study presents synthetic thermal videos generated by Generative Adversarial Networks integrated into the pain recognition pipeline.
A framework consisting of a Vision-MLP and a Transformer-based module is utilized, employing RGB and synthetic thermal videos in unimodal and multimodal settings.
- Score: 0.9668407688201359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pain assessment is essential in developing optimal pain management protocols to alleviate suffering and prevent functional decline in patients. Consequently, reliable and accurate automatic pain assessment systems are vital for continuous and effective patient monitoring. This study presents synthetic thermal videos generated by Generative Adversarial Networks integrated into the pain recognition pipeline and evaluates their efficacy. A framework consisting of a Vision-MLP and a Transformer-based module is utilized, employing RGB and synthetic thermal videos in unimodal and multimodal settings. Experiments conducted on facial videos from the BioVid database demonstrate the effectiveness of synthetic thermal videos and underline their potential advantages.
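The core of a Vision-MLP is alternating token mixing (across patches) and channel mixing (across embedding dimensions) with residual connections. The sketch below is a minimal, illustrative numpy version of that idea only; the layer sizes, random weights, and tanh activation are assumptions for demonstration, not the paper's actual model.

```python
import numpy as np

def mlp(x, w1, w2):
    """Two-layer MLP with a tanh nonlinearity (stand-in for GELU)."""
    return np.tanh(x @ w1) @ w2

def mixer_block(tokens, rng):
    """One Vision-MLP-style block: token mixing, then channel mixing.

    tokens: (num_patches, channels) patch embeddings from one video frame.
    """
    p, c = tokens.shape
    # Token-mixing MLP acts across patches (transpose so patches are features).
    wt1, wt2 = 0.1 * rng.standard_normal((p, p)), 0.1 * rng.standard_normal((p, p))
    tokens = tokens + mlp(tokens.T, wt1, wt2).T   # residual connection
    # Channel-mixing MLP acts across embedding channels.
    wc1, wc2 = 0.1 * rng.standard_normal((c, c)), 0.1 * rng.standard_normal((c, c))
    tokens = tokens + mlp(tokens, wc1, wc2)       # residual connection
    return tokens

rng = np.random.default_rng(0)
frame_patches = rng.standard_normal((16, 8))      # 16 patches, 8-dim embeddings
out = mixer_block(frame_patches, rng)
print(out.shape)  # (16, 8)
```

The same block shape applies per modality; an RGB stream and a synthetic thermal stream would each produce such token maps before fusion.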
Related papers
- A Full Transformer-based Framework for Automatic Pain Estimation using Videos [0.9668407688201359]
We present a novel full transformer-based framework consisting of a Transformer in Transformer (TNT) model and a Transformer leveraging cross-attention and self-attention blocks.
We demonstrate state-of-the-art performances, showing the efficacy, efficiency, and generalization capability across all the primary pain estimation tasks.
arXiv Detail & Related papers (2024-12-19T17:45:08Z) - Towards Synthetic Data Generation for Improved Pain Recognition in Videos under Patient Constraints [11.515273901289472]
This study introduces a novel approach that leverages synthetic data to enhance video-based pain recognition models.
We present a pipeline that synthesizes realistic 3D facial models by capturing nuanced facial movements from a small participant pool.
This process generates 8,600 synthetic faces, accurately reflecting genuine pain expressions from varied angles and perspectives.
arXiv Detail & Related papers (2024-09-24T18:33:57Z) - Transformer with Leveraged Masked Autoencoder for video-based Pain Assessment [11.016004057765185]
We enhance pain recognition by employing facial video analysis within a Transformer-based deep learning model.
By combining a powerful Masked Autoencoder with a Transformer-based classifier, our model effectively captures pain level indicators through both expressions and micro-expressions.
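Masked-autoencoder pre-training hides a large random fraction of image patches and lets the encoder see only the remainder. A minimal stdlib sketch of that masking step, assuming the common 196-patch grid and 0.75 mask ratio (both illustrative defaults, not values stated above):

```python
import random

def random_mask(num_patches, mask_ratio, seed=0):
    """Return (kept, masked) patch indices, MAE-style: most patches are
    hidden and only the kept ones are fed to the encoder."""
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    n_keep = int(num_patches * (1 - mask_ratio))
    return sorted(idx[:n_keep]), sorted(idx[n_keep:])

kept, masked = random_mask(num_patches=196, mask_ratio=0.75)
print(len(kept), len(masked))  # 49 147
```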
arXiv Detail & Related papers (2024-09-08T13:14:03Z) - Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset consisting of unseen synthetic data and images collected from silicone aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network [0.0]
This paper introduces a novel Transformer network to estimate critical gait parameters from RGB videos captured by a single-view camera.
Empirical evaluations on a public dataset of cerebral palsy patients indicate that the proposed framework surpasses current state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-01T07:45:27Z) - MC-ViViT: Multi-branch Classifier-ViViT to detect Mild Cognitive Impairment in older adults using facial videos [44.72781467904852]
This paper proposes a novel Multi-branch Classifier-Video Vision Transformer (MC-ViViT) model to distinguish individuals with mild cognitive impairment from those with normal cognition by analyzing facial features.
The data comes from the I-CONECT study, a behavioral intervention trial aimed at improving cognitive function by providing frequent video chats.
Our experimental results on I-CONECT dataset show the great potential of MC-ViViT in predicting MCI with a high accuracy of 90.63%.
arXiv Detail & Related papers (2023-04-11T15:42:20Z) - Tele-EvalNet: A Low-cost, Teleconsultation System for Home based Rehabilitation of Stroke Survivors using Multiscale CNN-LSTM Architecture [7.971065005161566]
We propose Tele-EvalNet, a novel system consisting of two components: a live feedback model and an overall performance evaluation model.
The live feedback model provides feedback on exercise correctness, with easy-to-understand instructions highlighted using color markers.
The overall performance evaluation model learns a mapping from joint data to the performance scores assigned by clinicians.
arXiv Detail & Related papers (2021-12-06T16:58:00Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
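The detection metrics named above all derive from true-positive, false-positive, and false-negative counts. A small sketch with made-up example counts (the numbers are illustrative, not results from the paper):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, F-score, and Tanimoto (Jaccard) coefficient
    from true-positive / false-positive / false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    tanimoto = tp / (tp + fp + fn)   # |A∩B| / |A∪B| for binary detections
    return precision, recall, f_score, tanimoto

p, r, f, t = detection_metrics(tp=80, fp=20, fn=20)
print(round(p, 2), round(r, 2), round(f, 2), round(t, 3))  # 0.8 0.8 0.8 0.667
```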
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z) - Two-Stream Deep Feature Modelling for Automated Video Endoscopy Data Analysis [45.19890687786009]
We propose a two-stream model for endoscopic image analysis.
Our model fuses two streams of deep feature inputs by mapping their inherent relations through a novel relational network model.
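A relational fusion of two feature streams can be pictured as scoring every cross-stream feature pair with a shared map and pooling the results. The numpy sketch below is a toy stand-in for that idea; the linear map, ReLU, and mean pooling are assumptions for illustration, not the paper's relational network.

```python
import numpy as np

def relational_fusion(stream_a, stream_b, w):
    """Fuse two feature streams by scoring every cross-stream pair with a
    shared linear map, then mean-pooling the pair scores.

    stream_a: (n, d), stream_b: (m, d) feature vectors; w: (2d, d) shared weights.
    """
    n, d = stream_a.shape
    m, _ = stream_b.shape
    pairs = np.concatenate(
        [np.repeat(stream_a, m, axis=0), np.tile(stream_b, (n, 1))], axis=1
    )                                        # (n*m, 2d): every (a_i, b_j) pair
    relations = np.maximum(pairs @ w, 0.0)   # shared map + ReLU per pair
    return relations.mean(axis=0)            # pooled joint representation, (d,)

rng = np.random.default_rng(1)
a = rng.standard_normal((4, 6))   # e.g. appearance-stream features
b = rng.standard_normal((3, 6))   # e.g. motion-stream features
fused = relational_fusion(a, b, rng.standard_normal((12, 6)))
print(fused.shape)  # (6,)
```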
arXiv Detail & Related papers (2020-07-12T05:24:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.