Vision-language models for decoding provider attention during neonatal resuscitation
- URL: http://arxiv.org/abs/2404.01207v1
- Date: Mon, 1 Apr 2024 16:09:12 GMT
- Title: Vision-language models for decoding provider attention during neonatal resuscitation
- Authors: Felipe Parodi, Jordan Matelsky, Alejandra Regla-Vargas, Elizabeth Foglia, Charis Lim, Danielle Weinberg, Konrad Kording, Heidi Herrick, Michael Platt
- Abstract summary: We introduce an automated, real-time, deep learning approach capable of decoding provider gaze into semantic classes.
Our pipeline attains 91% classification accuracy in identifying gaze targets without training.
Our approach offers a scalable solution that seamlessly integrates with existing infrastructure for data-scarce gaze analysis.
- Score: 33.7054351451505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neonatal resuscitations demand an exceptional level of attentiveness from providers, who must process multiple streams of information simultaneously. Gaze strongly influences decision making; thus, understanding where a provider is looking during neonatal resuscitations could inform provider training, enhance real-time decision support, and improve the design of delivery rooms and neonatal intensive care units (NICUs). Current approaches to quantifying neonatal providers' gaze rely on manual coding or simulations, which limit scalability and utility. Here, we introduce an automated, real-time, deep learning approach capable of decoding provider gaze into semantic classes directly from first-person point-of-view videos recorded during live resuscitations. Combining state-of-the-art, real-time segmentation with vision-language models (CLIP), our low-shot pipeline attains 91% classification accuracy in identifying gaze targets without training. Upon fine-tuning, the performance of our gaze-guided vision transformer exceeds 98% accuracy in gaze classification, approaching human-level precision. This system, capable of real-time inference, enables objective quantification of provider attention dynamics during live neonatal resuscitation. Our approach offers a scalable solution that seamlessly integrates with existing infrastructure for data-scarce gaze analysis, thereby offering new opportunities for understanding and refining clinical decision making.
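The abstract describes pairing real-time segmentation with CLIP for low-shot gaze-target classification. The exact implementation is not given here, but the core zero-shot step — scoring an image embedding against text-prompt embeddings by cosine similarity and taking the best match — can be sketched as follows. The gaze-target labels and the toy embeddings are illustrative assumptions, not values from the paper (real CLIP embeddings are, e.g., 512-dimensional).

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels):
    """Return the label whose text embedding is most cosine-similar
    to the image embedding -- the scoring rule CLIP uses zero-shot."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img  # cosine similarity per candidate label
    return labels[int(np.argmax(sims))], sims

# Hypothetical gaze-target classes for a delivery-room scene.
labels = ["infant", "monitor", "equipment", "parent"]

# Toy stand-ins for CLIP text/image embeddings.
rng = np.random.default_rng(0)
text_embs = rng.normal(size=(4, 8))
image_emb = text_embs[1] + 0.1 * rng.normal(size=8)  # close to "monitor"

pred, sims = zero_shot_classify(image_emb, text_embs, labels)
```

In the full pipeline, the image embedding would come from a CLIP image encoder applied to the segmented region around the gaze point, and the text embeddings from prompts naming each candidate target.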
Related papers
- Using Explainable AI for EEG-based Reduced Montage Neonatal Seizure Detection [2.206534289238751]
The gold-standard for neonatal seizure detection currently relies on continuous video-EEG monitoring.
A novel explainable deep learning model to automate the neonatal seizure detection process with a reduced EEG montage is proposed.
The presented model achieves an absolute improvement of 8.31% and 42.86% in area under curve (AUC) and recall, respectively.
arXiv Detail & Related papers (2024-06-04T10:53:56Z)
- Dynamic Gaussian Splatting from Markerless Motion Capture can Reconstruct Infants Movements [2.44755919161855]
This work paves the way for advanced movement analysis tools that can be applied to diverse clinical populations.
We explored the application of dynamic Gaussian splatting to sparse markerless motion capture data.
Our results demonstrate the potential of this method in rendering novel views of scenes and tracking infant movements.
arXiv Detail & Related papers (2023-10-30T11:09:39Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors [2.995873287514728]
The infant wearable MAIJU provides a means to automatically evaluate infants' motor performance in out-of-hospital settings.
We investigated how self-supervised pre-training improves performance of the classifiers used for analyzing MAIJU recordings.
arXiv Detail & Related papers (2023-05-16T11:46:16Z)
- Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchi and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z)
- BabyNet: A Lightweight Network for Infant Reaching Action Recognition in Unconstrained Environments to Support Future Pediatric Rehabilitation Applications [5.4771139749266435]
Action recognition is an important component to improve autonomy of physical rehabilitation devices, such as wearable robotic exoskeletons.
In this paper, we introduce BabyNet, a light-weight (in terms of trainable parameters) network structure to recognize infant reaching action from off-body stationary cameras.
arXiv Detail & Related papers (2022-08-09T07:38:36Z)
- Automated Classification of General Movements in Infants Using a Two-stream Spatiotemporal Fusion Network [5.541644538483947]
The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders.
Recent video-based GMs classification has attracted attention, but this approach is strongly affected by irrelevant information.
We propose an automated GMs classification method, which consists of preprocessing networks that remove unnecessary background information.
arXiv Detail & Related papers (2022-07-04T05:21:09Z)
- Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data [72.1187887376849]
The selective attention mechanism helps the cognitive system focus on task-relevant visual cues by ignoring distractors.
We propose a framework to leverage gaze for medical image analysis tasks with small training data.
Our method is demonstrated to achieve superior performance on both 3D tumor segmentation and 2D chest X-ray classification tasks.
arXiv Detail & Related papers (2021-12-02T07:55:25Z)
- Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest.
Clinicians are often hesitant to adopt the technology, however, because its underlying decision-making process is considered opaque and difficult to comprehend.
We propose a novel decision explanation scheme based on cycle-consistent activation maximization, which generates high-quality visualizations of classifier decisions even on smaller data sets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)
- Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture that combines object segmentation and convolutional neural networks (CNNs).
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
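The ROP entry above describes feeding a segmentation mask to a CNN by stacking it onto the image as an extra "color" channel. That fusion step can be sketched with plain array operations; the shapes, function name, and toy data below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def add_mask_channel(rgb, mask):
    """Append a binary segmentation mask as a fourth channel.

    rgb  : (H, W, 3) uint8 image
    mask : (H, W) boolean mask (e.g., a predicted demarcation line)
    Returns an (H, W, 4) uint8 array for a downstream CNN to consume.
    """
    assert rgb.shape[:2] == mask.shape, "image and mask sizes must match"
    mask_u8 = (mask.astype(np.uint8) * 255)[..., None]  # (H, W, 1)
    return np.concatenate([rgb, mask_u8], axis=-1)

# Toy example: 4x4 image with a one-pixel-wide "demarcation line".
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[2, :] = True
fused = add_mask_channel(rgb, mask)
```

The downstream classifier then only needs its first convolution widened to accept four input channels instead of three.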
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.