One-shot action recognition towards novel assistive therapies
- URL: http://arxiv.org/abs/2102.08997v1
- Date: Wed, 17 Feb 2021 19:41:37 GMT
- Title: One-shot action recognition towards novel assistive therapies
- Authors: Alberto Sabater, Laura Santos, Jose Santos-Victor, Alexandre
Bernardino, Luis Montesano, Ana C. Murillo
- Abstract summary: This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
- Score: 63.23654147345168
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: One-shot action recognition is a challenging problem, especially when the
target video can contain zero, one, or several repetitions of the target action.
Solutions to this problem can be used in many real world applications that
require automated processing of activity videos. In particular, this work is
motivated by the automated analysis of medical therapies that involve action
imitation games. The presented approach incorporates a pre-processing step that
standardizes heterogeneous motion data conditions and generates descriptive
movement representations with a Temporal Convolutional Network for a final
one-shot (or few-shot) action recognition. Our method achieves state-of-the-art
results on the public NTU-120 one-shot action recognition challenge. In addition,
we evaluate the approach on a real use case of automated video analysis for
therapy support with autistic people. The promising results indicate its
suitability for this kind of application in the wild, providing both
quantitative and qualitative measures, essential for the patient evaluation and
monitoring.
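The one-shot setting described in the abstract can be illustrated with a minimal sketch: embed a single reference clip of the action, slide a window over the target video, and report every window whose embedding lies close to the reference. This is not the authors' implementation. The `embed` function below is a hypothetical placeholder (mean pooling plus L2 normalisation) standing in for the Temporal Convolutional Network, and `window_len`, `threshold`, and the cosine distance are assumptions chosen only to keep the example runnable.

```python
import math


def embed(window):
    """Placeholder encoder (assumption): mean of the per-frame feature
    vectors, L2-normalised. The paper uses a Temporal Convolutional
    Network at this step; this stand-in only keeps the sketch runnable."""
    dims = len(window[0])
    mean = [sum(frame[d] for frame in window) / len(window) for d in range(dims)]
    norm = math.sqrt(sum(v * v for v in mean)) or 1.0  # avoid division by zero
    return [v / norm for v in mean]


def cosine_distance(a, b):
    """Cosine distance between two vectors that are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def detect_repetitions(anchor, target, window_len, threshold):
    """Return the start indices of all target windows whose embedding is
    closer than `threshold` to the embedding of the single anchor clip.
    An empty list means the action was not found in the target."""
    anchor_emb = embed(anchor)
    hits = []
    for start in range(len(target) - window_len + 1):
        window = target[start:start + window_len]
        if cosine_distance(anchor_emb, embed(window)) < threshold:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy 2-D "motion features": the action is [1, 0], the background is [0, 1].
    anchor = [[1.0, 0.0]] * 4
    target = [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
    print(detect_repetitions(anchor, target, window_len=4, threshold=0.1))
```

Because the detector returns a list of window positions instead of a single label, the same routine covers a target that contains no repetition, one repetition, or several. In practice, overlapping hits would usually be merged into one detection.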
Related papers
- A Comprehensive Review of Few-shot Action Recognition [64.47305887411275]
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data.
It requires accurately classifying human actions in videos using only a few labeled examples per class.
arXiv Detail & Related papers (2024-07-20T03:53:32Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room [6.132617753806978]
We propose a new sample-efficient and object-based approach for surgical activity recognition in the OR.
Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR.
arXiv Detail & Related papers (2023-12-19T15:33:57Z)
- Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches [5.132618393976799]
We aim to design an upper-limb assessment pipeline for stroke patients using smartwatches.
Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale.
arXiv Detail & Related papers (2022-12-09T14:00:49Z)
- Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing [53.096237570992294]
Strategy training is a rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke.
Standardized fidelity assessment is used to measure adherence to treatment principles.
We developed a rule-based NLP algorithm, a long short-term memory (LSTM) model, and a Bidirectional Encoder Representations from Transformers (BERT) model for this task.
arXiv Detail & Related papers (2022-09-14T15:33:30Z)
- E^2TAD: An Energy-Efficient Tracking-based Action Detector [78.90585878925545]
This paper presents a tracking-based solution to accurately and efficiently localize predefined key actions.
It won first place in the UAV-Video Track of the 2021 Low-Power Computer Vision Challenge (LPCVC).
arXiv Detail & Related papers (2022-04-09T07:52:11Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
- Deep Homography Estimation in Dynamic Surgical Scenes for Laparoscopic Camera Motion Extraction [6.56651216023737]
We introduce a method for extracting a laparoscope holder's actions from videos of laparoscopic interventions.
We synthetically add camera motion to a newly acquired dataset of camera-motion-free da Vinci surgery image sequences.
We find that our method transfers from our camera-motion-free da Vinci surgery dataset to videos of laparoscopic interventions, outperforming classical homography estimation approaches in both precision, by 41%, and runtime on a CPU, by 43%.
arXiv Detail & Related papers (2021-09-30T13:05:37Z)
- Cross-Task Representation Learning for Anatomical Landmark Detection [20.079451546446712]
We propose to regularize the knowledge transfer across source and target tasks through cross-task representation learning.
The proposed method is demonstrated for extracting facial anatomical landmarks which facilitate the diagnosis of fetal alcohol syndrome.
We present two approaches for the proposed representation learning by constraining either final or intermediate model features on the target model.
arXiv Detail & Related papers (2020-09-28T21:22:49Z)
- Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction [17.63619129438996]
We propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress.
We demonstrate that recognition performance improves in multi-task frameworks with progress estimation without any additional manual labelling and training.
arXiv Detail & Related papers (2020-03-10T14:28:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.