Efficient Pain Recognition via Respiration Signals: A Single Cross-Attention Transformer Multi-Window Fusion Pipeline
- URL: http://arxiv.org/abs/2507.21886v4
- Date: Thu, 07 Aug 2025 16:25:19 GMT
- Title: Efficient Pain Recognition via Respiration Signals: A Single Cross-Attention Transformer Multi-Window Fusion Pipeline
- Authors: Stefanos Gkikas, Ioannis Kyprakis, Manolis Tsiknakis
- Abstract summary: This study, submitted to the Second Multimodal Sensing Grand Challenge for Next-Gen Pain Assessment (AI4PAIN), proposes a respiration-based pain-assessment pipeline built around a highly efficient cross-attention transformer and a multi-window fusion strategy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pain is a complex condition affecting a large portion of the population. Accurate and consistent evaluation is essential for individuals experiencing pain, and it supports the development of effective and advanced management strategies. Automatic pain assessment systems provide continuous monitoring and support clinical decision-making, aiming to reduce distress and prevent functional decline. This study has been submitted to the Second Multimodal Sensing Grand Challenge for Next-Gen Pain Assessment (AI4PAIN). The proposed method introduces a pipeline that leverages respiration as the input signal and incorporates a highly efficient cross-attention transformer alongside a multi-windowing strategy. Extensive experiments demonstrate that respiration is a valuable physiological modality for pain assessment. Moreover, experiments reveal that compact and efficient models, when properly optimized, can achieve strong performance, often surpassing larger counterparts. The proposed multi-window approach effectively captures both short-term and long-term features, as well as global characteristics, thereby enhancing the model's representational capacity.
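The abstract does not include code, but the described pipeline (respiration tokenized at several window lengths and fused by a single cross-attention block) can be sketched. Everything below, including the window lengths, dimensions, sampling rate, and the classification head, is an illustrative assumption rather than the authors' implementation:

```python
import torch
import torch.nn as nn

class MultiWindowCrossAttention(nn.Module):
    """Sketch: tokenize a respiration signal at several window lengths,
    add one global token, and fuse everything with a single
    cross-attention block feeding a small classification head."""

    def __init__(self, d_model=64, n_heads=4, window_sizes=(125, 500)):
        super().__init__()
        self.window_sizes = window_sizes
        # One linear patch embedding per window length (short- and long-term views).
        self.embed = nn.ModuleList([nn.Linear(w, d_model) for w in window_sizes])
        self.global_proj = nn.Linear(1, d_model)   # global (whole-signal) token
        self.query = nn.Parameter(torch.randn(1, 1, d_model))
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 2)          # e.g. pain / no-pain

    def forward(self, x):                          # x: (batch, signal_length)
        tokens = []
        for w, emb in zip(self.window_sizes, self.embed):
            n = x.shape[1] // w                    # non-overlapping windows
            patches = x[:, : n * w].reshape(x.shape[0], n, w)
            tokens.append(emb(patches))            # (batch, n, d_model)
        tokens.append(self.global_proj(x.mean(dim=1, keepdim=True)).unsqueeze(1))
        kv = torch.cat(tokens, dim=1)              # all windows + global token
        q = self.query.expand(x.shape[0], -1, -1)  # one learned query per sample
        fused, _ = self.cross_attn(q, kv, kv)      # the single cross-attention
        return self.head(fused.squeeze(1))

logits = MultiWindowCrossAttention()(torch.randn(8, 3000))  # e.g. 30 s at 100 Hz
```

A single learned query keeps the fusion step cheap: attention cost grows with the number of window tokens, not quadratically in signal length, which fits the paper's emphasis on compact, efficient models.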
Related papers
- Multi-Representation Diagrams for Pain Recognition: Integrating Various Electrodermal Activity Signals into a Single Image
This study has been submitted to the Second Multimodal Sensing Grand Challenge for Next-Gen Pain Assessment (AI4PAIN).
arXiv Detail & Related papers (2025-07-29T14:53:28Z)
- Tiny-BioMoE: a Lightweight Embedding Model for Biosignal Analysis
This study has been submitted to the Second Multimodal Sensing Grand Challenge for Next-Gen Pain Assessment (AI4PAIN). The proposed approach introduces Tiny-BioMoE, a lightweight pretrained embedding model for biosignal analysis.
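The name suggests a mixture-of-experts design; as a rough illustration of what a lightweight MoE embedder could look like (the actual Tiny-BioMoE architecture is not described here, so every detail below is assumed):

```python
import torch
import torch.nn as nn

class TinyMoEEmbedder(nn.Module):
    """Hypothetical mixture-of-experts embedder: a router softly combines
    a few small expert MLPs into one biosignal embedding."""

    def __init__(self, in_dim=3000, emb_dim=128, n_experts=4):
        super().__init__()
        self.router = nn.Linear(in_dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, 256), nn.GELU(), nn.Linear(256, emb_dim))
            for _ in range(n_experts)
        ])

    def forward(self, x):                       # x: (batch, in_dim) biosignal
        gate = self.router(x).softmax(dim=-1)   # (batch, n_experts) mixing weights
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, emb_dim)
        return (gate.unsqueeze(-1) * outs).sum(dim=1)            # (B, emb_dim)

emb = TinyMoEEmbedder()(torch.randn(4, 3000))   # -> (4, 128) embeddings
```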
arXiv Detail & Related papers (2025-07-29T14:46:39Z)
- Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement
Large reasoning models (LRMs) exhibit overthinking, which hinders efficiency and inflates inference cost. We propose two lightweight methods to enhance LRM efficiency. First, we introduce Efficiency Steering, a training-free activation-steering technique that modulates reasoning behavior via a single direction in activation space. Second, we develop Self-Rewarded Efficiency RL, a reinforcement-learning framework that dynamically balances task accuracy and brevity.
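A minimal sketch of the activation-steering idea, shifting hidden states along one fixed direction; the function name, the way the direction is obtained, and the layer it is applied to are all assumptions, not the paper's recipe:

```python
import torch

def steer_hidden_states(hidden, direction, alpha=1.0):
    """Shift a layer's hidden states along one direction to dampen or
    encourage a behavior. `direction` would typically be estimated as the
    mean activation difference between verbose and concise reasoning traces."""
    unit = direction / direction.norm()
    return hidden + alpha * unit        # broadcasts over (batch, seq, d_model)

# Toy usage: (batch=2, seq=5, d_model=16) activations, random direction.
h = torch.randn(2, 5, 16)
d = torch.randn(16)
h_steered = steer_hidden_states(h, d, alpha=-0.5)  # negative alpha to suppress
```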
arXiv Detail & Related papers (2025-06-18T17:18:12Z)
- PainFormer: a Vision Foundation Model for Automatic Pain Assessment
Pain is a manifold condition that impacts a significant percentage of the population. This study introduces PainFormer, a vision foundation model based on multi-task learning principles. PainFormer effectively extracts high-quality embeddings from diverse input modalities.
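The multi-task principle can be illustrated with one shared encoder feeding several task heads, so the shared embedding must serve every task; this is a sketch of the general pattern, not PainFormer's architecture:

```python
import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    """Illustrative multi-task pattern: one encoder, many task heads."""

    def __init__(self, d=128, task_classes=(2, 5, 10)):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, d), nn.GELU())
        self.heads = nn.ModuleList([nn.Linear(d, c) for c in task_classes])

    def forward(self, x, task=0):
        return self.heads[task](self.encoder(x))   # task-specific prediction

    def embed(self, x):
        return self.encoder(x)                     # downstream feature extractor

z = SharedEncoderMultiTask().embed(torch.rand(4, 3, 64, 64))  # (4, 128)
```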
arXiv Detail & Related papers (2025-05-02T20:29:27Z)
- Twins-PainViT: Towards a Modality-Agnostic Vision Transformer Framework for Multimodal Automatic Pain Assessment using Facial Videos and fNIRS
This study has been submitted to the First Multimodal Sensing Grand Challenge for Next-Gen Pain Assessment (AI4PAIN).
The proposed multimodal framework utilizes facial videos and fNIRS and presents a modality-agnostic approach, alleviating the need for domain-specific models.
arXiv Detail & Related papers (2024-07-29T09:02:43Z)
- Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
Experimental evaluations were conducted on the PAD-UFES-20 dataset using various deep-learning architectures.
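The auxiliary-task idea can be sketched as a shared encoder trained jointly on a classification loss and a super-resolution reconstruction loss; the heads, loss weight, and shapes below are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

# Shared encoder between lesion classification and super-resolution.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 6))
sr_head = nn.Sequential(nn.Upsample(scale_factor=2), nn.Conv2d(16, 3, 3, padding=1))

x_lr = torch.rand(2, 3, 64, 64)        # low-resolution input images
y_cls = torch.tensor([0, 3])           # lesion labels (PAD-UFES-20 has 6 classes)
y_hr = torch.rand(2, 3, 128, 128)      # high-resolution reconstruction targets

feats = encoder(x_lr)
loss = nn.functional.cross_entropy(cls_head(feats), y_cls) \
     + 0.5 * nn.functional.mse_loss(sr_head(feats), y_hr)
loss.backward()                        # gradients flow into the shared encoder
```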
arXiv Detail & Related papers (2024-02-16T05:16:20Z)
- Automatic diagnosis of knee osteoarthritis severity using Swin transformer
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
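One plausible setup, assuming torchvision's Swin-T and the five Kellgren-Lawrence grades (0-4) as the severity classes; the paper's exact configuration may differ:

```python
import torch
import torch.nn as nn
from torchvision.models import swin_t, Swin_T_Weights

# Fine-tuning sketch: swap the ImageNet head for a 5-way KOA severity head.
model = swin_t(weights=Swin_T_Weights.IMAGENET1K_V1)
model.head = nn.Linear(model.head.in_features, 5)  # KL grades 0-4 (assumed)

logits = model(torch.rand(2, 3, 224, 224))          # (2, 5) severity logits
```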
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
- Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
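For context, a typical augmentation studied in this line of work is a random pad-and-crop shift of image observations (as in DrQ-style methods); this sketch only illustrates the kind of DA whose attributes the paper analyzes:

```python
import torch
import torch.nn.functional as F

def random_shift(obs, pad=4):
    """Pad the observation and take a random crop, shifting it by up to
    `pad` pixels per sample."""
    b, c, h, w = obs.shape
    padded = F.pad(obs, (pad, pad, pad, pad), mode="replicate")
    out = torch.empty_like(obs)
    for i in range(b):
        dx = torch.randint(0, 2 * pad + 1, (1,)).item()
        dy = torch.randint(0, 2 * pad + 1, (1,)).item()
        out[i] = padded[i, :, dy : dy + h, dx : dx + w]
    return out

aug = random_shift(torch.rand(8, 3, 84, 84))  # same shape, randomly shifted
```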
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
- Multimodal Spatio-Temporal Deep Learning Approach for Neonatal Postoperative Pain Assessment
Current practice for assessing neonatal postoperative pain is subjective, inconsistent, slow and discontinuous.
We present a novel multimodal spatio-temporal approach that integrates visual and vocal signals to assess neonatal postoperative pain.
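A hypothetical late-fusion sketch of the visual + vocal idea; the encoders, feature sizes, and binary head are illustrative, not the paper's spatio-temporal architecture:

```python
import torch
import torch.nn as nn

class VisualVocalFusion(nn.Module):
    """Late fusion: separate encoders for video frames and audio features,
    concatenated for a pain / no-pain prediction."""

    def __init__(self, d=64):
        super().__init__()
        self.visual = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, d), nn.ReLU())
        self.vocal = nn.Sequential(nn.Linear(40, d), nn.ReLU())  # e.g. 40 MFCCs
        self.classifier = nn.Linear(2 * d, 2)

    def forward(self, frames, audio):
        z = torch.cat([self.visual(frames), self.vocal(audio)], dim=-1)
        return self.classifier(z)

logits = VisualVocalFusion()(torch.rand(4, 3, 32, 32), torch.rand(4, 40))
```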
arXiv Detail & Related papers (2020-12-03T18:52:35Z)
- Finding Action Tubes with a Sparse-to-Dense Framework
We propose a framework that generates action tube proposals from video streams with a single forward pass in a sparse-to-dense manner.
We evaluate the efficacy of our model on the UCF101-24, JHMDB-21 and UCFSports benchmark datasets.
arXiv Detail & Related papers (2020-08-30T15:38:44Z)
- Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition
Skeleton-based action recognition has attracted increasing attention due to its strong adaptability to dynamic circumstances.
With the help of deep learning techniques, it has also witnessed substantial progress and currently achieves around 90% accuracy in benign environments.
Research on the vulnerability of skeleton-based action recognition under different adversarial settings remains scant.
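As one concrete example of an adversarial setting, the standard FGSM perturbation can be written in a few lines; the paper's skeleton-specific attacks and constraints are more involved, so this generic sketch is only for orientation:

```python
import torch

def fgsm_attack(model, x, y, eps=0.01):
    """Fast Gradient Sign Method: perturb the input in the direction that
    maximally increases the classification loss."""
    x = x.clone().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()  # perturbed input

# Toy usage with a linear "model" over flattened joint coordinates
# (25 joints x 3D, 10 action classes -- hypothetical sizes).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(25 * 3, 10))
x_adv = fgsm_attack(model, torch.rand(4, 25, 3), torch.randint(0, 10, (4,)))
```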
arXiv Detail & Related papers (2020-05-14T17:12:52Z)