Related papers: Real-Time Multimodal Cognitive Assistant for Emergency Medical Services

Real-Time Multimodal Cognitive Assistant for Emergency Medical Services

URL: http://arxiv.org/abs/2403.06734v1
Date: Mon, 11 Mar 2024 13:56:57 GMT
Title: Real-Time Multimodal Cognitive Assistant for Emergency Medical Services
Authors: Keshara Weerasinghe, Saahith Janapati, Xueren Ge, Sion Kim, Sneha Iyer, John A. Stankovic, Homa Alemzadeh
Abstract summary: This paper presents CognitiveEMS, an end-to-end wearable cognitive assistant system. It can act as a collaborative virtual partner engaging in the real-time acquisition and analysis of multimodal data from an emergency scene.
Score: 4.669165383466683
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Emergency Medical Services (EMS) responders often operate under time-sensitive conditions, facing cognitive overload and inherent risks, requiring essential skills in critical thinking and rapid decision-making. This paper presents CognitiveEMS, an end-to-end wearable cognitive assistant system that can act as a collaborative virtual partner engaging in the real-time acquisition and analysis of multimodal data from an emergency scene and interacting with EMS responders through Augmented Reality (AR) smart glasses. CognitiveEMS processes the continuous streams of data in real-time and leverages edge computing to provide assistance in EMS protocol selection and intervention recognition. We address key technical challenges in real-time cognitive assistance by introducing three novel components: (i) a Speech Recognition model that is fine-tuned for real-world medical emergency conversations using simulated EMS audio recordings, augmented with synthetic data generated by large language models (LLMs); (ii) an EMS Protocol Prediction model that combines state-of-the-art (SOTA) tiny language models with EMS domain knowledge using graph-based attention mechanisms; (iii) an EMS Action Recognition module which leverages multimodal audio and video data and protocol predictions to infer the intervention/treatment actions taken by the responders at the incident scene. Our results show that for speech recognition we achieve superior performance compared to SOTA (WER of 0.290 vs. 0.618) on conversational data. Our protocol prediction component also significantly outperforms SOTA (top-3 accuracy of 0.800 vs. 0.200) and the action recognition achieves an accuracy of 0.727, while maintaining an end-to-end latency of 3.78s for protocol prediction on the edge and 0.31s on the server.

Related papers

Holistic Artificial Intelligence in Medicine; improved performance and explainability [4.862319939462255]
xHAIM (Explainable HAIM) is a novel framework leveraging Generative AI to enhance both prediction and explainability.<n>xHAIM improves average AUC from 79.9% to 90.3% across chest pathology and operative tasks.<n>It transforms AI from a black-box predictor into an explainable decision support system, enabling clinicians to interactively trace predictions back to relevant patient data.
arXiv Detail & Related papers (2025-06-30T19:15:06Z)
Emotion Detection on User Front-Facing App Interfaces for Enhanced Schedule Optimization: A Machine Learning Approach [0.0]
We present and evaluate two complementary approaches to emotion detection.<n>A biometric-based method utilizing heart rate (HR) data extracted from electrocardiogram (ECG) signals to predict the emotional dimensions of Valence, Arousal, and Dominance; and a behavioral method analyzing computer activity through multiple machine learning models to classify emotions based on fine-grained user interactions such as mouse movements, clicks, and keystroke patterns.<n>Our comparative analysis, from real-world datasets, reveals that while both approaches demonstrate effectiveness, the computer activity-based method delivers superior consistency and accuracy, particularly for mouse-related interactions, which achieved approximately
arXiv Detail & Related papers (2025-06-24T03:21:46Z)
Towards user-centered interactive medical image segmentation in VR with an assistive AI agent [0.5578116134031106]
We propose SAMIRA, a novel conversational AI agent for medical VR that assists users with localizing, segmenting, and visualizing 3D medical concepts.<n>The system also supports true-to-scale 3D visualization of segmented pathology to enhance patient-specific anatomical understanding.<n>With a user study, evaluations demonstrated a high usability score (SUS=90.0 $pm$ 9.0), low overall task load, and strong support for the proposed VR system's guidance.
arXiv Detail & Related papers (2025-05-12T03:47:05Z)
MELON: Multimodal Mixture-of-Experts with Spectral-Temporal Fusion for Long-Term Mobility Estimation in Critical Care [1.5237145555729716]
We introduce MELON, a novel framework designed to predict 12-hour mobility status in the critical care setting. We trained and evaluated the MELON model on the multimodal dataset of 126 patients recruited from nine Intensive Care Units at the University of Florida Health Shands Hospital main campus in Gainesville, Florida. Results showed that MELON outperforms conventional approaches for 12-hour mobility status estimation.
arXiv Detail & Related papers (2025-03-10T19:47:46Z)
IoT-Based Real-Time Medical-Related Human Activity Recognition Using Skeletons and Multi-Stage Deep Learning for Healthcare [1.5236380958983642]
The Internet of Things (IoT) and mobile technology have significantly transformed healthcare by enabling real-time monitoring and diagnosis of patients. Human Motion Recognition (HMR) challenges such as high computational demands, low accuracy, and limited adaptability persist. This study proposes a novel HMR method for MRHA detection, leveraging multi-stage deep learning techniques integrated with IoT.
arXiv Detail & Related papers (2025-01-13T03:41:57Z)
MANGO: Multimodal Acuity traNsformer for intelliGent ICU Outcomes [11.385654412265461]
We present MANGO: the Multimodal Acuity traNsformer for intelliGent ICU outcomes. It is designed to enhance the prediction of patient acuity states, transitions, and the need for life-sustaining therapy.
arXiv Detail & Related papers (2024-12-13T23:51:15Z)
High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR [1.3810901729134184]
We introduce United-MedASR, a novel architecture that addresses challenges by integrating synthetic data generation, precision ASR fine-tuning, and semantic enhancement techniques. United-MedASR constructs a specialised medical vocabulary by synthesising data from authoritative sources such as ICD-10, MIMS, and FDA databases. To enhance processing speed, we incorporate Faster Whisper, ensuring streamlined and high-speed ASR performance.
arXiv Detail & Related papers (2024-11-24T17:02:48Z)
Early Recognition of Parkinson's Disease Through Acoustic Analysis and Machine Learning [0.0]
Parkinson's Disease (PD) is a progressive neurodegenerative disorder that significantly impacts both motor and non-motor functions, including speech. This paper provides a comprehensive review of methods for PD recognition using speech data, highlighting advances in machine learning and data-driven approaches. Various classification algorithms are explored, including logistic regression, SVM, and neural networks, with and without feature selection. Our findings indicate that specific acoustic features and advanced machine-learning techniques can effectively differentiate between individuals with PD and healthy controls.
arXiv Detail & Related papers (2024-07-22T23:24:02Z)
Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process [15.562905335917408]
Existing mechanical simulators for MCS rely on oversimplifying assumptions and are insensitive to patient-specific behavior. We use a neural process architecture to capture the probabilistic relationship between MCS pump levels and aortic pressure measurements with uncertainty. Empirical results with an improvement of 19% in non-stationary trend prediction establish DANP as an effective tool for clinicians.
arXiv Detail & Related papers (2024-05-28T19:07:12Z)
Analyzing Participants' Engagement during Online Meetings Using Unsupervised Remote Photoplethysmography with Behavioral Features [50.82725748981231]
Engagement measurement finds application in healthcare, education, services. Use of physiological and behavioral features is viable, but impracticality of traditional physiological measurement arises due to the need for contact sensors. We demonstrate the feasibility of the unsupervised photoplethysmography (rmography) as an alternative for contact sensors.
arXiv Detail & Related papers (2024-04-05T20:39:16Z)
Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support* Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit. Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
Foresight -- Deep Generative Modelling of Patient Timelines using Electronic Health Records [46.024501445093755]
Temporal modelling of medical history can be used to forecast and simulate future events, estimate risk, suggest alternative diagnoses or forecast complications. We present Foresight, a novel GPT3-based pipeline that uses NER+L tools (i.e. MedCAT) to convert document text into structured, coded concepts.
arXiv Detail & Related papers (2022-12-13T19:06:00Z)
MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning [49.57261599776167]
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO) for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z)
Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information. The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response. We formalize the prognosis modeling as a multi-modal asynchronous time series classification task. Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey. An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys. We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.