Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder
- URL: http://arxiv.org/abs/2404.05849v1
- Date: Mon, 8 Apr 2024 20:31:27 GMT
- Title: Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder
- Authors: Halil Ismail Helvaci, Sen-ching Samson Cheung, Chen-Nee Chuah, Sally Ozonoff
- Abstract summary: We introduce a self-attention based TAL model designed to identify ASD-related behaviors in infant videos.
This study is the first to conduct end-to-end temporal action localization in untrimmed videos of infants with ASD.
We achieve 70% accuracy for look face, 79% for look object, 72% for smile, and 65% for vocalization.
- Score: 5.2289135066938375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autism Spectrum Disorder (ASD) presents significant challenges in early diagnosis and intervention, impacting children and their families. With prevalence rates rising, there is a critical need for accessible and efficient screening tools. Leveraging machine learning (ML) techniques, in particular Temporal Action Localization (TAL), holds promise for automating ASD screening. This paper introduces a self-attention based TAL model designed to identify ASD-related behaviors in infant videos. Unlike existing methods, our approach simplifies complex modeling and emphasizes efficiency, which is essential for practical deployment in real-world scenarios. Importantly, this work underscores the need for computer vision methods that can operate in naturalistic environments with little equipment control, addressing key challenges in ASD screening. This study is the first to conduct end-to-end temporal action localization in untrimmed videos of infants with ASD, offering promising avenues for early intervention and support. We report baseline behavior-detection results with our TAL model: 70% accuracy for look face, 79% for look object, 72% for smile, and 65% for vocalization.
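To make the idea of a self-attention based TAL model concrete, here is a minimal PyTorch sketch: a transformer encoder over precomputed per-frame features, followed by per-frame behavior classification and boundary-offset regression. The feature dimensions, module layout, and the four-class label ordering are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a self-attention temporal action localization (TAL)
# head over precomputed per-frame features. Dimensions, class list, and design
# choices are assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

BEHAVIORS = ["look_face", "look_object", "smile", "vocalization"]  # assumed label set

class SelfAttentionTAL(nn.Module):
    def __init__(self, feat_dim=1024, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)              # project frame features
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.cls_head = nn.Linear(d_model, len(BEHAVIORS))    # per-frame class logits
        self.reg_head = nn.Linear(d_model, 2)                 # per-frame (start, end) offsets

    def forward(self, frame_feats):                           # (B, T, feat_dim)
        x = self.encoder(self.proj(frame_feats))              # (B, T, d_model)
        return self.cls_head(x), self.reg_head(x)

# Usage: e.g., a 30-second clip sampled at 8 fps -> 240 frame features
model = SelfAttentionTAL()
logits, offsets = model(torch.randn(2, 240, 1024))
print(logits.shape, offsets.shape)  # (2, 240, 4) (2, 240, 2)
```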
Related papers
- Script-centric behavior understanding for assisted autism spectrum disorder diagnosis [6.198128116862245]
This work focuses on automatically detecting Autism Spectrum Disorders (ASD) using computer vision techniques and large language models (LLMs).
Our pipeline converts video content into scripts that describe the behavior of characters, leveraging the generalizability of large language models to detect ASD in a zero-shot or few-shot manner.
Our method achieves an accuracy of 92.00% in diagnosing ASD in children with an average age of 24 months, surpassing supervised learning methods by an absolute margin of 3.58%.
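As a rough illustration of such a script-centric pipeline, the sketch below assembles per-segment behavior descriptions (assumed to come from some video-description model) into a script and asks an LLM for a zero-shot judgment. The prompt wording and the `ask_llm` callable are hypothetical placeholders, not the paper's actual pipeline.

```python
# Illustrative sketch of a script-centric, zero-shot screening pipeline.
# `descriptions` and `ask_llm` are placeholders, not the paper's components.
from typing import Callable, List

def build_script(descriptions: List[str]) -> str:
    # e.g., descriptions[i] = "00:15 the child looks toward the parent's face"
    return "\n".join(descriptions)

def zero_shot_screen(descriptions: List[str], ask_llm: Callable[[str], str]) -> str:
    script = build_script(descriptions)
    prompt = (
        "You are given a time-stamped script describing a toddler's behavior "
        "during a play session.\n\n"
        f"{script}\n\n"
        "Based only on this script, answer 'ASD' or 'TD' (typically developing) "
        "and give a one-sentence rationale."
    )
    return ask_llm(prompt)  # ask_llm wraps whatever LLM backend is available
```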
arXiv Detail & Related papers (2024-11-14T13:07:19Z) - Enhancing Autism Spectrum Disorder Early Detection with the Parent-Child Dyads Block-Play Protocol and an Attention-enhanced GCN-xLSTM Hybrid Deep Learning Framework [6.785167067600156]
This work proposes a novel Parent-Child Dyads Block-Play (PCB) protocol to identify behavioral patterns distinguishing ASD from typically developing toddlers.
We have compiled a substantial video dataset, featuring 40 ASD and 89 TD toddlers engaged in block play with parents.
This dataset exceeds previous efforts on both the scale of participants and the length of individual sessions.
arXiv Detail & Related papers (2024-08-29T21:53:01Z) - Ensemble Modeling of Multiple Physical Indicators to Dynamically Phenotype Autism Spectrum Disorder [3.6630139570443996]
We provide a dataset for training computer vision models to detect Autism Spectrum Disorder (ASD)-related phenotypic markers.
We trained individual LSTM-based models using eye gaze, head positions, and facial landmarks as input features, achieving test AUCs of 86%, 67%, and 78%.
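A minimal sketch of this per-modality LSTM setup is shown below: one small LSTM classifier per input stream (eye gaze, head pose, facial landmarks), with the predicted probabilities averaged into an ensemble score. Feature dimensions and the averaging rule are assumptions, not the paper's exact configuration.

```python
# Illustrative sketch: one LSTM classifier per modality, averaged into an ensemble.
import torch
import torch.nn as nn

class ModalityLSTM(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (B, T, in_dim)
        _, (h, _) = self.lstm(x)                # h: (1, B, hidden)
        return torch.sigmoid(self.head(h[-1]))  # (B, 1) probability of the ASD marker

# Assumed feature sizes: gaze (x, y), head pose (yaw, pitch, roll), 68 x/y landmarks
gaze_net, head_net, lm_net = ModalityLSTM(2), ModalityLSTM(3), ModalityLSTM(136)

def ensemble_score(gaze, head_pose, landmarks):
    probs = torch.stack([gaze_net(gaze), head_net(head_pose), lm_net(landmarks)])
    return probs.mean(dim=0)                    # simple average of per-modality probabilities
```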
arXiv Detail & Related papers (2024-08-23T17:55:58Z) - Video-Based Autism Detection with Deep Learning [0.0]
We develop a deep learning model that analyzes video clips of children reacting to sensory stimuli.
Results show that our model effectively generalizes and understands key differences in the distinct movements of the children.
arXiv Detail & Related papers (2024-02-26T17:45:00Z) - Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - Language-Assisted Deep Learning for Autistic Behaviors Recognition [13.200025637384897]
We show that a vision-based problem behaviors recognition system can achieve high accuracy and outperform the previous methods by a large margin.
We propose a two-branch multimodal deep learning framework by incorporating the "freely available" language description for each type of problem behavior.
Experimental results demonstrate that incorporating additional language supervision can bring an obvious performance boost for the autism problem behaviors recognition task.
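One way such a two-branch design can be organized is sketched below: a visual branch embeds the clip, a text branch embeds a free-form description of each problem behavior, and the clip is scored against every behavior description by similarity. The encoders, dimensions, and similarity-based fusion are assumptions, not the paper's framework.

```python
# Illustrative sketch of a two-branch vision + language-description recognizer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchRecognizer(nn.Module):
    def __init__(self, vid_dim=1024, txt_dim=768, joint=256):
        super().__init__()
        self.vid_proj = nn.Linear(vid_dim, joint)   # visual branch projection
        self.txt_proj = nn.Linear(txt_dim, joint)   # text-description branch projection

    def forward(self, vid_feat, behavior_text_feats):
        # vid_feat: (B, vid_dim); behavior_text_feats: (C, txt_dim), one per behavior class
        v = F.normalize(self.vid_proj(vid_feat), dim=-1)
        t = F.normalize(self.txt_proj(behavior_text_feats), dim=-1)
        return v @ t.T                               # (B, C) similarity logits over behaviors
```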
arXiv Detail & Related papers (2022-11-17T02:58:55Z) - Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z) - One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z) - Early Autism Spectrum Disorders Diagnosis Using Eye-Tracking Technology [62.997667081978825]
Lack of money, a shortage of qualified specialists, and low trust in correction methods are the main issues that hinder timely diagnosis of ASD.
Our team developed an algorithm that predicts the likelihood of ASD from information about the child's gaze activity.
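As a hedged illustration of gaze-based prediction, the sketch below aggregates raw gaze samples into simple statistics and fits a standard classifier that outputs an ASD probability. The feature choices, classifier, and placeholder data are assumptions, not the authors' algorithm.

```python
# Illustrative sketch: simple gaze statistics + a standard classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def gaze_features(xy: np.ndarray) -> np.ndarray:
    # xy: (n_samples, 2) gaze coordinates from the eye tracker
    return np.concatenate([xy.mean(axis=0), xy.std(axis=0),
                           [np.linalg.norm(np.diff(xy, axis=0), axis=1).mean()]])

# Placeholder data shapes: one feature vector per child, y: 1 = ASD, 0 = TD
X = np.stack([gaze_features(np.random.rand(500, 2)) for _ in range(20)])
y = np.random.randint(0, 2, size=20)
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X[:1]))  # predicted chance of ASD for one child
```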
arXiv Detail & Related papers (2020-08-21T20:22:55Z) - A Smartphone-based System for Real-time Early Childhood Caries Diagnosis [76.71303610807156]
Early childhood caries (ECC) is the most common, yet preventable chronic disease in children under the age of 6.
In this study, we propose a multistage deep learning-based system for cavity detection.
We integrate the deep learning system into an easy-to-use mobile application that can diagnose ECC from an early stage and provide real-time results to untrained users.
arXiv Detail & Related papers (2020-08-17T21:11:19Z) - Detecting Parkinsonian Tremor from IMU Data Collected In-The-Wild using Deep Multiple-Instance Learning [59.74684475991192]
Parkinson's Disease (PD) is a slowly evolving neurological disease that affects about 1% of the population above 60 years old.
PD symptoms include tremor, rigidity and bradykinesia.
We present a method for automatically identifying tremorous episodes related to PD, based on IMU signals captured via a smartphone device.
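A minimal sketch of deep multiple-instance learning over IMU data appears below: a recording is split into short windows (instances), each is encoded by a small network, and attention pooling produces one tremor probability per recording (bag). Window length, channel count, and pooling choice are assumptions, not the paper's model.

```python
# Illustrative sketch of deep multiple-instance learning (MIL) over IMU windows.
import torch
import torch.nn as nn

class IMUMIL(nn.Module):
    def __init__(self, in_dim=6, hidden=64):       # assumed 6 channels: 3-axis accel + 3-axis gyro
        super().__init__()
        self.instance_enc = nn.Sequential(nn.Linear(in_dim * 100, hidden), nn.ReLU())
        self.attn = nn.Linear(hidden, 1)            # attention weight per window
        self.head = nn.Linear(hidden, 1)

    def forward(self, windows):                     # (num_windows, 100, 6) for one recording
        h = self.instance_enc(windows.flatten(1))   # (num_windows, hidden)
        w = torch.softmax(self.attn(h), dim=0)      # attention over windows
        bag = (w * h).sum(dim=0)                    # attention-pooled bag embedding
        return torch.sigmoid(self.head(bag))        # probability of a tremorous episode

model = IMUMIL()
print(model(torch.randn(30, 100, 6)))               # 30 windows of 100 IMU samples each
```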
arXiv Detail & Related papers (2020-05-06T09:02:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.