Leadership Assessment in Pediatric Intensive Care Unit Team Training
- URL: http://arxiv.org/abs/2505.24389v1
- Date: Fri, 30 May 2025 09:19:33 GMT
- Title: Leadership Assessment in Pediatric Intensive Care Unit Team Training
- Authors: Liangyang Ouyang, Yuki Sakai, Ryosuke Furuta, Hisataka Nozawa, Hikoro Matsui, Yoichi Sato
- Abstract summary: This paper addresses the task of assessing a PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators for leadership assessment.
- Score: 12.775569777482566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the task of assessing a PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators for leadership assessment. To capture these multimodal signals, we employ Aria Glasses to record egocentric video, audio, gaze, and head-movement data. We collect one-hour videos of four simulated sessions involving doctors with different roles and experience levels. To automate data processing, we propose a method leveraging REMoDNaV, SAM, YOLO, and ChatGPT for fixation object detection, eye contact detection, and conversation classification. In the experiments, significant correlations are observed between leadership skills and the behavioral metrics output by our proposed methods, such as fixation time, transition patterns, and direct orders in speech. These results indicate that our proposed data collection and analysis framework can effectively support skill assessment for PICU team training.
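The abstract does not come with code, so the following is a minimal sketch of the fixation-object step only, assuming gaze samples exported as pixel coordinates, the `remodnav` package for fixation classification, and an off-the-shelf Ultralytics YOLO detector; the file names, sampling parameters, and thresholds are illustrative assumptions, and the SAM- and ChatGPT-based steps are omitted.

```python
import numpy as np
import cv2
from remodnav.clf import EyegazeClassifier  # pip install remodnav
from ultralytics import YOLO                # pip install ultralytics

# Illustrative parameters -- real values depend on the Aria recording.
PX2DEG = 0.01          # degrees of visual angle per pixel (assumed)
SAMPLING_RATE = 120.0  # gaze sampling rate in Hz (assumed)
VIDEO_FPS = 30.0

# 1. Classify raw gaze samples into fixations/saccades with REMoDNaV.
#    The classifier expects a record array with 'x' and 'y' fields.
xy = np.load("gaze_xy.npy")  # shape (N, 2); hypothetical gaze export
gaze = np.rec.fromarrays(xy.T, names="x,y")
clf = EyegazeClassifier(px2deg=PX2DEG, sampling_rate=SAMPLING_RATE)
events = clf(clf.preproc(gaze))

# 2. For each fixation, look up the egocentric video frame at its onset
#    and ask YOLO which detected object contains the gaze point.
model = YOLO("yolov8n.pt")
video = cv2.VideoCapture("egocentric.mp4")

fixated_objects = []
for ev in events:
    if ev["label"] != "FIXA":  # keep fixation events only
        continue
    frame_idx = int(ev["start_time"] * VIDEO_FPS)
    video.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
    ok, frame = video.read()
    if not ok:
        continue
    gx, gy = ev["start_x"], ev["start_y"]  # gaze point at fixation onset
    result = model(frame, verbose=False)[0]
    for box, cls in zip(result.boxes.xyxy, result.boxes.cls):
        x1, y1, x2, y2 = box.tolist()
        if x1 <= gx <= x2 and y1 <= gy <= y2:
            duration = ev["end_time"] - ev["start_time"]
            fixated_objects.append((result.names[int(cls)], duration))
            break

print(fixated_objects)  # e.g. [('person', 0.42), ('laptop', 1.10), ...]
```

Aggregating these (object, duration) pairs over a session yields the fixation-time and transition-pattern metrics that the paper correlates with leadership skill.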
Related papers
- Dude, where's my utterance? Evaluating the effects of automatic segmentation and transcription on CPS detection [0.27309692684728604]
Collaborative Problem-Solving (CPS) markers capture key aspects of effective teamwork. An AI system that reliably detects these markers could help teachers identify when a group is struggling or demonstrating productive collaboration. We evaluate how CPS detection is impacted by automating two critical components: transcription and speech segmentation.
arXiv Detail & Related papers (2025-07-06T16:25:18Z)
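As a concrete illustration of the two components being automated, here is a minimal sketch that segments an audio file with voice activity detection (py-webrtcvad) and transcribes each segment with Whisper; the file name, frame length, and aggressiveness setting are assumptions, not details from the paper.

```python
import wave
import numpy as np
import webrtcvad   # pip install webrtcvad
import whisper     # pip install openai-whisper

SAMPLE_RATE = 16000
FRAME_MS = 30  # webrtcvad accepts 10, 20, or 30 ms frames

# 1. Voice-activity detection: mark each 30 ms frame as speech or not.
vad = webrtcvad.Vad(2)  # aggressiveness 0-3 (assumed setting)
with wave.open("group_session.wav", "rb") as wf:  # 16 kHz mono PCM (assumed)
    pcm = wf.readframes(wf.getnframes())

frame_bytes = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 16-bit samples
flags = [
    vad.is_speech(pcm[i:i + frame_bytes], SAMPLE_RATE)
    for i in range(0, len(pcm) - frame_bytes, frame_bytes)
]

# 2. Merge consecutive speech frames into utterance-like segments.
segments, start = [], None
for i, speech in enumerate(flags + [False]):
    if speech and start is None:
        start = i
    elif not speech and start is not None:
        segments.append((start * FRAME_MS / 1000, i * FRAME_MS / 1000))
        start = None

# 3. Transcribe each segment with Whisper.
model = whisper.load_model("base")
audio = np.frombuffer(pcm, np.int16).astype(np.float32) / 32768.0
for t0, t1 in segments:
    chunk = audio[int(t0 * SAMPLE_RATE):int(t1 * SAMPLE_RATE)]
    text = model.transcribe(chunk, fp16=False)["text"].strip()
    print(f"[{t0:7.2f}-{t1:7.2f}] {text}")
```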
- MOSAIC-F: A Framework for Enhancing Students' Oral Presentation Skills through Personalized Feedback [1.0835264351334324]
This framework integrates Multimodal Learning Analytics (MMLA), Observations, Sensors, Artificial Intelligence (AI), and Collaborative assessments. By combining human-based and data-based evaluation techniques, this framework enables more accurate, personalized, and actionable feedback.
arXiv Detail & Related papers (2025-06-10T09:46:31Z)
- Playpen: An Environment for Exploring Learning Through Conversational Interaction [81.67330926729015]
We investigate whether Dialogue Games can also serve as a source of feedback signals for learning. We introduce Playpen, an environment for off- and online learning through Dialogue Game self-play. We find that imitation learning through SFT improves performance on unseen instances, but negatively impacts other skills.
arXiv Detail & Related papers (2025-04-11T14:49:33Z)
- Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment [65.70317151363204]
In surgical training, the formative verbal feedback that trainers provide to trainees during live surgeries is crucial for ensuring safety, correcting behavior immediately, and facilitating long-term skill acquisition. This work introduces the first framework for reconstructing surgical dialogue from unstructured real-world recordings. Our framework integrates voice activity detection, speaker diarization, and automated speech recognition, with a novel enhancement that removes hallucinations.
arXiv Detail & Related papers (2024-12-01T10:35:12Z)
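A minimal sketch of such a pipeline, assuming pyannote.audio for diarization and Whisper for recognition; the repetition heuristic below is a crude stand-in for the paper's novel hallucination-removal enhancement, and the file name and access token are placeholders.

```python
import whisper                       # pip install openai-whisper
from pyannote.audio import Pipeline  # pip install pyannote.audio

# 1. Speaker diarization: who spoke when. The model name follows the
#    pyannote docs and requires a Hugging Face access token.
diarizer = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="hf_..."  # placeholder
)
diarization = diarizer("or_recording.wav")  # hypothetical OR recording

# 2. ASR on the full file; Whisper returns timestamped segments.
asr = whisper.load_model("medium")
result = asr.transcribe("or_recording.wav", fp16=False)

def is_hallucination(text: str) -> bool:
    """Crude stand-in for the paper's hallucination-removal step:
    flag segments dominated by a single repeated token."""
    words = text.lower().split()
    return len(words) >= 4 and len(set(words)) / len(words) < 0.3

# 3. Assign each surviving ASR segment to the overlapping speaker turn.
for seg in result["segments"]:
    if is_hallucination(seg["text"]):
        continue
    mid = (seg["start"] + seg["end"]) / 2
    speaker = "unknown"
    for turn, _, label in diarization.itertracks(yield_label=True):
        if turn.start <= mid <= turn.end:
            speaker = label
            break
    print(f"{speaker}: {seg['text'].strip()}")
```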
- WearableMil: An End-to-End Framework for Military Activity Recognition and Performance Monitoring [7.130450173185638]
This paper introduces an end-to-end framework for preprocessing, analyzing, and recognizing activities from wearable data in military training contexts. We use data from 135 soldiers wearing Garmin-55 smartwatches over six months, covering more than 15 million minutes. Our framework addresses missing data through physiologically-informed methods, reducing unknown sleep states from 40.38% to 3.66%.
arXiv Detail & Related papers (2024-10-07T19:35:15Z)
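The abstract does not say how the physiologically-informed imputation works, so the sketch below shows one plausible reading: relabeling unknown sleep states from overnight heart rate and motion. The column names and thresholds are invented for illustration.

```python
import pandas as pd

# Hypothetical minute-level wearable export: timestamp, heart rate (bpm),
# step count, and a sleep-state column with many 'unknown' entries.
df = pd.read_csv("soldier_minutes.csv", parse_dates=["timestamp"])

def impute_sleep(row: pd.Series) -> str:
    """Physiologically-informed rule of thumb (illustrative thresholds):
    low heart rate plus no movement during night hours suggests sleep."""
    if row["sleep_state"] != "unknown":
        return row["sleep_state"]
    hour = row["timestamp"].hour
    at_night = hour >= 22 or hour < 6
    if at_night and row["heart_rate"] < 60 and row["steps"] == 0:
        return "asleep"
    if row["steps"] > 0 or row["heart_rate"] >= 80:
        return "awake"
    return "unknown"  # leave genuinely ambiguous minutes unresolved

df["sleep_state"] = df.apply(impute_sleep, axis=1)
print(df["sleep_state"].value_counts(normalize=True))
```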
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLMs) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective [68.20531518525273]
We take a closer look at existing self-supervised speech methods from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
arXiv Detail & Related papers (2024-01-16T21:13:22Z)
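One standard way to realize such a linear probe: train a linear classifier from representations to a target and report H(Y) minus the probe's cross-entropy, which lower-bounds the mutual information I(Z; Y). The sketch below uses scikit-learn on synthetic stand-in data; it illustrates the technique, not the paper's exact protocol.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins: Z = learned representations, y = target labels
# (e.g. phone identity). In practice Z comes from the SSL model.
n, d, k = 4000, 64, 10
y = rng.integers(0, k, size=n)
Z = rng.normal(size=(n, d)) + 0.5 * np.eye(k)[y] @ rng.normal(size=(k, d))

Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, test_size=0.25, random_state=0)

# Linear probe: logistic regression from representation to label.
probe = LogisticRegression(max_iter=2000).fit(Z_tr, y_tr)
cross_ent = log_loss(y_te, probe.predict_proba(Z_te))  # in nats

# Marginal label entropy H(Y), estimated from the test labels.
p = np.bincount(y_te, minlength=k) / len(y_te)
h_y = -(p[p > 0] * np.log(p[p > 0])).sum()

# I(Z; Y) >= H(Y) - CE(probe): a higher bound means the representation
# carries more linearly accessible information about the target.
print(f"H(Y) = {h_y:.3f} nats, probe CE = {cross_ent:.3f} nats")
print(f"MI lower bound ~ {max(h_y - cross_ent, 0.0):.3f} nats")
```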
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
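The noise-contrastive state-representation idea can be sketched as an InfoNCE objective: embeddings of a state and its temporal successor are pulled together while the other batch elements act as negatives. Everything below (encoder, shapes, data) is a placeholder; the paper's actual architecture and auxiliary-reward model are not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateEncoder(nn.Module):
    """Toy encoder mapping flat observations to an embedding (placeholder)."""
    def __init__(self, obs_dim: int = 128, emb_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE: each anchor's positive is the matching row of `positive`;
    all other rows in the batch act as negatives."""
    logits = anchor @ positive.T / temperature  # (B, B) similarity matrix
    targets = torch.arange(anchor.size(0))      # diagonal entries = positives
    return F.cross_entropy(logits, targets)

# Offline batch: states s_t and their successors s_{t+1} (synthetic here).
enc = StateEncoder()
s_t = torch.randn(32, 128)
s_next = s_t + 0.1 * torch.randn(32, 128)  # stand-in for transition data

loss = info_nce(enc(s_t), enc(s_next))
loss.backward()  # plug into an optimizer over enc.parameters()
print(float(loss))
```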
- MATT: Multimodal Attention Level Estimation for e-learning Platforms [16.407885871027887]
This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis.
Our multimodal approach uses parameters and signals obtained from behavioral and physiological processes that have been linked to the modeling of cognitive load.
The experimental framework uses mEBAL, a public multimodal database for attention level estimation collected in an e-learning environment.
arXiv Detail & Related papers (2023-01-22T18:18:20Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are, respectively, speaking activity, support vector machines, meetings composed of 3-4 persons, and microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images [3.7445390865272588]
Computer Vision (CV) classifiers can distinguish and detect nonverbal social human behaviors and mental states.
Several pain points arise when attempting this process for behavioral phenotyping.
We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning.
arXiv Detail & Related papers (2022-01-26T21:35:17Z)
- Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric.
Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)