Quantitative Gait Analysis from Single RGB Videos Using a Dual-Input Transformer-Based Network
- URL: http://arxiv.org/abs/2501.01689v1
- Date: Fri, 03 Jan 2025 08:10:08 GMT
- Title: Quantitative Gait Analysis from Single RGB Videos Using a Dual-Input Transformer-Based Network
- Authors: Hiep Dinh, Son Le, My Than, Minh Ho, Nicolas Vuillerme, Hieu Pham
- Abstract summary: We present an efficient approach for clinical gait analysis through a dual-pattern input convolutional Transformer network.
The system demonstrates high accuracy in estimating critical metrics such as the gait deviation index (GDI), knee flexion angle, step length, and walking cadence.
- Abstract: Gait and movement analysis has become a well-established clinical tool for diagnosing health conditions, monitoring disease progression across a wide spectrum of diseases, and implementing and assessing treatment, surgical, and rehabilitation interventions. However, quantitative motion assessment remains limited to costly motion capture systems and specialized personnel, restricting its accessibility and broader application. Recent advancements in deep neural networks have enabled quantitative movement analysis using single-camera videos, offering an accessible alternative to conventional motion capture systems. In this paper, we present an efficient approach for clinical gait analysis through a dual-pattern input convolutional Transformer network. The proposed system leverages a dual-input Transformer model to estimate essential gait parameters from single RGB videos captured by a single-view camera. The system demonstrates high accuracy in estimating critical metrics such as the gait deviation index (GDI), knee flexion angle, step length, and walking cadence, validated on a dataset of individuals with movement disorders. Notably, our approach surpasses state-of-the-art methods in various scenarios, using fewer resources and proving highly suitable for clinical application, particularly in resource-constrained environments.
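The metrics named in the abstract have standard kinematic definitions that a model like this is trained to regress. As an illustrative sketch only (not the paper's code; the heel-strike frames, ankle coordinates, and frame rate below are hypothetical), walking cadence and step length can be derived from detected heel-strike events, and knee flexion from three 2D joint keypoints:

```python
import math

def cadence_and_step_length(strike_frames, ankle_x, fps):
    """Cadence (steps/min) and mean step length from heel-strike events.

    strike_frames: frame indices of successive heel strikes
    ankle_x: forward (x) position of the striking ankle at each event, in metres
    fps: video frame rate
    """
    duration_s = (strike_frames[-1] - strike_frames[0]) / fps
    n_steps = len(strike_frames) - 1
    cadence = n_steps / duration_s * 60.0
    step_length = sum(abs(b - a) for a, b in zip(ankle_x, ankle_x[1:])) / n_steps
    return cadence, step_length

def knee_flexion_deg(hip, knee, ankle):
    """Knee flexion angle (degrees) from 2D keypoints: 180 minus the
    interior hip-knee-ankle angle, so a straight leg reads 0 degrees."""
    v1 = (hip[0] - knee[0], hip[1] - knee[1])      # knee -> hip
    v2 = (ankle[0] - knee[0], ankle[1] - knee[1])  # knee -> ankle
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return 180.0 - math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Hypothetical example: heel strikes every 15 frames at 30 fps, with the
# ankle advancing 0.6 m per step -> 120 steps/min cadence, 0.6 m step length.
cad, sl = cadence_and_step_length([0, 15, 30, 45], [0.0, 0.6, 1.2, 1.8], fps=30)
print(round(cad, 1), round(sl, 2))                           # 120.0 0.6
print(round(knee_flexion_deg((0, 0), (0, 1), (1, 1)), 1))    # 90.0 (right-angled knee)
```

In the paper's setting these quantities are regressed directly by the network rather than computed from explicit event detection; the sketch only makes the target definitions concrete.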
Related papers
- Developing Normative Gait Cycle Parameters for Clinical Analysis Using Human Pose Estimation [4.975410989590524]
Gait analysis using computer vision is an emerging field in AI, offering clinicians an objective, multi-feature approach to analyse complex movements.
This paper presents a data-driven method using RGB video data and 2D human pose estimation for developing normative kinematic gait parameters.
arXiv Detail & Related papers (2024-11-20T21:27:13Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network [0.0]
This paper introduces a novel Transformer network to estimate critical gait parameters from RGB videos captured by a single-view camera.
Empirical evaluations on a public dataset of cerebral palsy patients indicate that the proposed framework surpasses current state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-01T07:45:27Z)
- Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method consists of an instrument pose estimation method, an online registration between the robot and the iOCT system, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and largely underexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
- Occlusion-robust Visual Markerless Bone Tracking for Computer-Assisted Orthopaedic Surgery [41.681134859412246]
We propose an RGB-D sensing-based markerless tracking method that is robust against occlusion.
By using a high-quality commercial RGB-D camera, our proposed visual tracking method achieves an accuracy of 1-2 degrees and 2-4 mm on a model knee.
arXiv Detail & Related papers (2021-08-24T09:49:08Z)
- One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video [71.43912903508765]
MDAL is a dynamic online adaptive learning scheme for instrument segmentation in robot-assisted surgery.
It learns the general knowledge of instruments and the fast adaptation ability through the video-specific meta-learning paradigm.
It outperforms other state-of-the-art methods on two datasets.
arXiv Detail & Related papers (2021-03-24T05:02:18Z)
- One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z)
- Multi-view Human Pose and Shape Estimation Using Learnable Volumetric Aggregation [0.0]
We propose a learnable aggregation approach to reconstruct 3D human body pose and shape from calibrated multi-view images.
Compared to previous approaches, our framework shows higher accuracy and greater promise for real-time prediction, given its cost efficiency.
arXiv Detail & Related papers (2020-11-26T18:33:35Z)
- A Single RGB Camera Based Gait Analysis with a Mobile Tele-Robot for Healthcare [9.992387025633805]
This work focuses on the analysis of gait, which is widely adopted for joint correction and assessing any lower limb or spinal problem.
On the hardware side, we design a novel marker-less gait analysis device using a low-cost RGB camera mounted on a mobile tele-robot.
arXiv Detail & Related papers (2020-02-11T21:42:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.