Leadership Assessment in Pediatric Intensive Care Unit Team Training
- URL: http://arxiv.org/abs/2505.24389v2
- Date: Thu, 28 Aug 2025 04:31:31 GMT
- Title: Leadership Assessment in Pediatric Intensive Care Unit Team Training
- Authors: Liangyang Ouyang, Yuki Sakai, Ryosuke Furuta, Hisataka Nozawa, Hikoro Matsui, Yoichi Sato,
- Abstract summary: This paper addresses the task of assessing a PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators of leadership assessment.
- Score: 18.37408109860005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the task of assessing a PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators of leadership assessment. To capture these multimodal signals, we employ Aria Glasses to record egocentric video, audio, gaze, and head movement data. We collect one-hour videos of four simulated sessions involving doctors with different roles and levels of experience. To automate data processing, we propose a method leveraging REMoDNaV, SAM, YOLO, and ChatGPT for fixation object detection, eye contact detection, and conversation classification. In the experiments, significant correlations are observed between leadership skills and behavioral metrics, i.e., the outputs of our proposed methods, such as fixation time, transition patterns, and direct orders in speech. These results indicate that our proposed data collection and analysis framework can effectively support skill assessment for PICU team training.
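The correlation analysis described in the abstract, relating a behavioral metric such as fixation time to leadership ratings, can be sketched in a few lines. The metric values, the scores, and the choice of a plain Pearson coefficient below are illustrative assumptions, not data or methods taken from the paper.

```python
# Hypothetical sketch: correlate a behavioral metric (e.g., total fixation
# time per simulated session) with expert-rated leadership scores.
# All numbers below are made up for illustration.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# One value per simulated session (hypothetical numbers).
fixation_time_s = [212.0, 305.5, 180.2, 260.8]  # seconds fixating on the patient
leadership_score = [3.1, 4.5, 2.8, 4.0]         # expert rating on a 1-5 scale

r = pearson_r(fixation_time_s, leadership_score)
print(f"Pearson r = {r:.3f}")
```

With real data one would also report a p-value (e.g., via a permutation test) before calling a correlation significant.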
Related papers
- CLIP-Guided Adaptable Self-Supervised Learning for Human-Centric Visual Tasks [76.00315860962885]
We propose CLASP (CLIP-guided Adaptable Self-suPervised learning), a novel framework for unsupervised pre-training in human-centric visual tasks. CLASP leverages the powerful vision-language model CLIP to generate both low-level (e.g., body parts) and high-level (e.g., attributes) semantic pseudo-labels. A Mixture-of-Experts (MoE) design dynamically adapts feature extraction based on task-specific prompts, mitigating potential feature conflicts and enhancing transferability.
arXiv Detail & Related papers (2026-01-19T15:19:28Z) - Video-Based Performance Evaluation for ECR Drills in Synthetic Training Environments [1.6162271703130058]
This paper introduces a video-based assessment pipeline that derives performance analytics from training videos without requiring additional hardware. We develop task-specific metrics that measure psychomotor fluency, situational awareness, and team coordination. Future work includes expanding analysis to 3D video data and leveraging video analysis to enable scalable evaluation within STEs.
arXiv Detail & Related papers (2025-12-29T19:30:41Z) - Trainee Action Recognition through Interaction Analysis in CCATT Mixed-Reality Training [1.5641818606249476]
Critical Care Air Transport Team members must stabilize severely injured soldiers by managing ventilators, IV pumps, and suction devices during flight. Recent advances in simulation and multimodal data analytics enable more objective and comprehensive performance evaluation. This study examines how CCATT members are trained using mixed-reality simulations that replicate the high-pressure conditions of aeromedical evacuation.
arXiv Detail & Related papers (2025-09-22T15:19:45Z) - Dude, where's my utterance? Evaluating the effects of automatic segmentation and transcription on CPS detection [0.27309692684728604]
Collaborative Problem-Solving (CPS) markers capture key aspects of effective teamwork. An AI system that reliably detects these markers could help teachers identify when a group is struggling or demonstrating productive collaboration. We evaluate how CPS detection is impacted by automating two critical components: transcription and speech segmentation.
arXiv Detail & Related papers (2025-07-06T16:25:18Z) - MOSAIC-F: A Framework for Enhancing Students' Oral Presentation Skills through Personalized Feedback [1.0835264351334324]
This framework integrates Multimodal Learning Analytics (MMLA), Observations, Sensors, Artificial Intelligence (AI), and Collaborative assessments. By combining human-based and data-based evaluation techniques, this framework enables more accurate, personalized, and actionable feedback.
arXiv Detail & Related papers (2025-06-10T09:46:31Z) - Playpen: An Environment for Exploring Learning Through Conversational Interaction [81.67330926729015]
We investigate whether Dialogue Games can also serve as a source of feedback signals for learning. We introduce Playpen, an environment for off- and online learning through Dialogue Game self-play. We find that imitation learning through supervised fine-tuning (SFT) improves performance on unseen instances, but negatively impacts other skills.
arXiv Detail & Related papers (2025-04-11T14:49:33Z) - Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment [65.70317151363204]
This work introduces the first framework for reconstructing surgical dialogue from unstructured real-world recordings. In surgical training, the formative verbal feedback that trainers provide to trainees during live surgeries is crucial for ensuring safety, correcting behavior immediately, and facilitating long-term skill acquisition. Our framework integrates voice activity detection, speaker diarization, and automated speech recognition, with a novel enhancement that removes hallucinations.
arXiv Detail & Related papers (2024-12-01T10:35:12Z) - WearableMil: An End-to-End Framework for Military Activity Recognition and Performance Monitoring [7.130450173185638]
This paper introduces an end-to-end framework for preprocessing, analyzing, and recognizing activities from wearable data in military training contexts. We use data from 135 soldiers wearing Garmin-55 smartwatches over six months, comprising over 15 million minutes of recordings. Our framework addresses missing data through physiologically-informed methods, reducing unknown sleep states from 40.38% to 3.66%.
arXiv Detail & Related papers (2024-10-07T19:35:15Z) - Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z) - Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective [68.20531518525273]
We take a closer look into existing self-supervised methods of speech from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
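The linear-probe idea summarized above, estimating mutual information as the gap between the label entropy and a probe's cross-entropy, can be sketched on toy data. The 1-D features, logistic probe, and training settings below are illustrative assumptions, not the paper's setup.

```python
import math
import random

# Hypothetical sketch: a linear (logistic) probe trained on frozen
# "representations" yields a cross-entropy CE, and H(Y) - CE lower-bounds
# the mutual information I(representation; label).

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D representations: label 1 tends to have larger feature values.
data = [(random.gauss(1.0, 0.5), 1) for _ in range(200)] + \
       [(random.gauss(-1.0, 0.5), 0) for _ in range(200)]

# Train the probe by plain batch gradient descent.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(300):
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        gw += (p - y) * x
        gb += (p - y)
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# Cross-entropy of the probe in bits; labels are balanced, so H(Y) = 1 bit.
ce = -sum(math.log2(sigmoid(w * x + b)) if y == 1
          else math.log2(1.0 - sigmoid(w * x + b))
          for x, y in data) / len(data)
mi_lower_bound = 1.0 - ce  # H(Y) - CE, a lower bound on I(X; Y) in bits
print(f"probe MI lower bound = {mi_lower_bound:.2f} bits")
```

A stronger probe tightens the bound; a random representation would leave it near zero.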
arXiv Detail & Related papers (2024-01-16T21:13:22Z) - Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z) - MATT: Multimodal Attention Level Estimation for e-learning Platforms [16.407885871027887]
This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis.
Our multimodal approach uses parameters and signals, derived from behavioral and physiological processes, that have been linked to cognitive load. The experiments use the mEBAL database, a public multimodal database for attention level estimation collected in an e-learning environment.
arXiv Detail & Related papers (2023-01-22T18:18:20Z) - Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are, respectively, speaking activity, support vector machines, meetings of 3-4 persons, and microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z) - Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images [3.7445390865272588]
Computer Vision (CV) classifiers distinguish and detect nonverbal social human behavior and mental state.
There are several pain points which arise when attempting this process for behavioral phenotyping.
We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning.
arXiv Detail & Related papers (2022-01-26T21:35:17Z) - Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for Conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric.
Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.