Related papers: An Interpretable Transformer-Based Foundation Model for Cross-Procedural Skill Assessment Using Raw fNIRS Signals

URL: http://arxiv.org/abs/2506.22476v1
Date: Sat, 21 Jun 2025 18:30:58 GMT
Title: An Interpretable Transformer-Based Foundation Model for Cross-Procedural Skill Assessment Using Raw fNIRS Signals
Authors: A. Subedi, S. De, L. Cavuoto, S. Schwaitzberg, M. Hackett, J. Norfleet,
Abstract summary: We introduce an interpretable transformer-based foundation model trained on minimally processed fNIRS signals for cross-procedural skill assessment. The model achieves greater than 88% classification accuracy on all tasks, with Matthews Correlation Coefficient exceeding 0.91 on ETI. It generalizes to a novel emergency airway procedure--cricothyrotomy--using fewer than 30 labeled samples and a lightweight (less than 2k parameter) adapter module.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Objective skill assessment in high-stakes procedural environments requires models that not only decode underlying cognitive and motor processes but also generalize across tasks, individuals, and experimental contexts. While prior work has demonstrated the potential of functional near-infrared spectroscopy (fNIRS) for evaluating cognitive-motor performance, existing approaches are often task-specific, rely on extensive preprocessing, and lack robustness to new procedures or conditions. Here, we introduce an interpretable transformer-based foundation model trained on minimally processed fNIRS signals for cross-procedural skill assessment. Pretrained using self-supervised learning on data from laparoscopic surgical tasks and endotracheal intubation (ETI), the model achieves greater than 88% classification accuracy on all tasks, with Matthews Correlation Coefficient exceeding 0.91 on ETI. It generalizes to a novel emergency airway procedure--cricothyrotomy--using fewer than 30 labeled samples and a lightweight (less than 2k parameter) adapter module, attaining an AUC greater than 87%. Interpretability is achieved via a novel channel attention mechanism--developed specifically for fNIRS--that identifies functionally coherent prefrontal sub-networks validated through ablation studies. Temporal attention patterns align with task-critical phases and capture stress-induced changes in neural variability, offering insight into dynamic cognitive states.
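The abstract reports both classification accuracy and the Matthews Correlation Coefficient (MCC). As a minimal sketch of how the two metrics relate for a binary skill classifier, the snippet below computes both from a confusion matrix. The function names and the counts in the usage example are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    counts = dict(tp=45, tn=50, fp=2, fn=3)
    print(f"accuracy = {accuracy(**counts):.3f}")
    print(f"MCC      = {matthews_corrcoef(**counts):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the skill classes are imbalanced.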

Related papers

Parameterized Diffusion Optimization enabled Autoregressive Ordinal Regression for Diabetic Retinopathy Grading [53.11883409422728]
This work proposes a novel autoregressive ordinal regression method called AOR-DR. We decompose the diabetic retinopathy grading task into a series of ordered steps by fusing the prediction of the previous steps with extracted image features. We exploit the diffusion process to facilitate conditional probability modeling, enabling the direct use of continuous global image features for autoregression.
arXiv Detail & Related papers (2025-07-07T13:22:35Z)
End-to-End Deep Learning for Real-Time Neuroimaging-Based Assessment of Bimanual Motor Skills [1.710146779965826]
This study presents a novel end-to-end deep learning framework that processes raw fNIRS signals directly. It achieved a mean classification accuracy of 93.9% (SD 4.4) and a generalization accuracy of 92.6% (SD 1.9) on unseen skill retention datasets.
arXiv Detail & Related papers (2025-03-21T22:56:54Z)
Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data [45.75724873443564]
Domain shift poses a significant challenge in clinical applications, e.g., Diabetic Retinopathy grading. We propose a novel approach, Generative Unadversarial ExampleS (GUES), which enables adaptation from a data-centric perspective.
arXiv Detail & Related papers (2024-12-02T07:14:25Z)
Fine-tuning -- a Transfer Learning approach [0.22344294014777952]
The use of Electronic Health Records (EHRs) is often hampered by the abundance of missing data in this valuable resource. Existing deep imputation methods rely on end-to-end pipelines that incorporate both imputation and downstream analyses. This paper explores the development of a modular, deep learning-based imputation and classification pipeline.
arXiv Detail & Related papers (2024-11-06T14:18:23Z)
Multi-stream deep learning framework to predict mild cognitive impairment with Rey Complex Figure Test [10.324611550865926]
We developed a multi-stream deep learning framework that integrates two distinct processing streams. The proposed multi-stream model demonstrated superior performance over baseline models in external validation. Our model has practical implications for clinical settings, where it could serve as a cost-effective tool for early screening.
arXiv Detail & Related papers (2024-09-04T17:08:04Z)
Improving Machine Learning Based Sepsis Diagnosis Using Heart Rate Variability [0.0]
This study aims to use heart rate variability (HRV) features to develop an effective predictive model for sepsis detection. A neural network model is trained on the HRV features, achieving an F1 score of 0.805, a precision of 0.851, and a recall of 0.763.
arXiv Detail & Related papers (2024-08-01T01:47:29Z)
Enhancing Cognitive Workload Classification Using Integrated LSTM Layers and CNNs for fNIRS Data Analysis [13.74551296919155]
This paper explores the impact of Long Short-Term Memory (LSTM) layers on the effectiveness of Convolutional Neural Networks (CNNs) within deep learning models. By integrating LSTM layers, the model can capture temporal dependencies in the fNIRS data, allowing for a more comprehensive understanding of cognitive states.
arXiv Detail & Related papers (2024-07-22T11:28:34Z)
Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues. PIHA leverages the high-level semantics of physical information to activate and guide the feature group aware of local semantics of the target. Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
Contrastive Conditional Neural Processes [45.70735205041254]
Conditional Neural Processes (CNPs) bridge neural networks with probabilistic inference to approximate functions of stochastic processes under meta-learning settings. Two auxiliary contrastive branches are set up hierarchically, namely in-instantiation temporal contrastive learning (TCL) and cross-instantiation function contrastive learning (FCL). We empirically show that TCL captures high-level abstraction of observations, whereas FCL helps identify underlying functions, which in turn provides more efficient representations.
arXiv Detail & Related papers (2022-03-08T10:08:45Z)
Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery. We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks. We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks [43.95869213955351]
We propose a Multi-Stage Temporal Convolutional Network (MS-TCN) that performs hierarchical prediction refinement for surgical phase recognition. Our method is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos with and without the use of additional surgical tool information.
arXiv Detail & Related papers (2020-03-24T10:12:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.