PoseAction: Action Recognition for Patients in the Ward using Deep
Learning Approaches
- URL: http://arxiv.org/abs/2310.03288v1
- Date: Thu, 5 Oct 2023 03:33:35 GMT
- Title: PoseAction: Action Recognition for Patients in the Ward using Deep
Learning Approaches
- Authors: Zherui Li and Raye Chen-Hua Yeow
- Abstract summary: We propose using computer vision (CV) and deep learning (DL) methods for detecting subjects and recognizing their actions.
We utilize OpenPose as an accurate subject detector for recognizing the positions of human subjects in the video stream.
We employ AlphAction's Asynchronous Interaction Aggregation (AIA) network to predict the actions of detected subjects.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Real-time intelligent detection and prediction of subjects'
behavior, particularly their movements or actions, is critical in the ward.
Such an approach can reduce in-hospital care costs and improve the efficiency
of healthcare workers, especially at night or during peak admission periods.
In this work, we therefore propose using computer vision (CV) and deep
learning (DL) methods for detecting subjects and recognizing their actions. We
utilize OpenPose as an accurate subject detector for recognizing the positions
of human subjects in the video stream, and we employ AlphAction's Asynchronous
Interaction Aggregation (AIA) network to predict the actions of the detected
subjects. We refer to this integrated model as PoseAction. The model is
further trained to predict 12 common actions in ward areas, such as
staggering, chest pain, and falling down, using medical-related video clips
from the NTU RGB+D and NTU RGB+D 120 datasets. The results demonstrate that
PoseAction achieves the highest classification mAP of 98.72% (IoU@0.5). This
study also develops an online real-time mode for action recognition, which
strongly supports the clinical translation of PoseAction. Furthermore, using
OpenPose's face keypoint detection, we implement face blurring as a practical
solution to the privacy concerns of patients and healthcare workers.
Nevertheless, the training data for PoseAction is currently limited,
particularly in terms of label diversity; the next step is therefore to train
the model on a more diverse dataset (including general actions) for improved
generalization.
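The abstract describes a two-stage pipeline: OpenPose localizes each subject
(and their face keypoints), and the AIA network classifies each subject's
action, with face blurring applied to the detected face regions. Below is a
minimal Python sketch of how such a pipeline could be wired together;
`detect_poses` and `predict_actions` are hypothetical stand-ins for the
OpenPose and AlphAction/AIA inference calls (the real APIs differ), while the
blurring itself uses standard OpenCV operations.

```python
import cv2
import numpy as np

def blur_faces(frame, face_keypoints, pad=10):
    """Blur each face region derived from detected face keypoints.

    face_keypoints: list of (N, 2) arrays of (x, y) face points per subject
    (a hypothetical format; OpenPose's actual output structure differs).
    """
    out = frame.copy()
    h, w = frame.shape[:2]
    for pts in face_keypoints:
        pts = np.asarray(pts)
        if pts.size == 0:
            continue
        # Padded bounding box around the face keypoints, clipped to the frame.
        x0 = max(int(pts[:, 0].min()) - pad, 0)
        y0 = max(int(pts[:, 1].min()) - pad, 0)
        x1 = min(int(pts[:, 0].max()) + pad, w)
        y1 = min(int(pts[:, 1].max()) + pad, h)
        if x1 <= x0 or y1 <= y0:
            continue
        # A heavy Gaussian blur hides identity while keeping body pose visible.
        out[y0:y1, x0:x1] = cv2.GaussianBlur(out[y0:y1, x0:x1], (51, 51), 0)
    return out

def run_poseaction(source, detect_poses, predict_actions):
    """Per-frame loop: detect subjects, classify actions, blur faces.

    detect_poses(frame) -> (person_boxes, face_keypoints)    # OpenPose stand-in
    predict_actions(frame, person_boxes) -> [(label, score)]  # AIA stand-in
    """
    cap = cv2.VideoCapture(source)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        boxes, faces = detect_poses(frame)
        actions = predict_actions(frame, boxes)
        frame = blur_faces(frame, faces)
        for (x0, y0, x1, y1), (label, score) in zip(boxes, actions):
            cv2.rectangle(frame, (x0, y0), (x1, y1), (0, 255, 0), 2)
            cv2.putText(frame, f"{label} {score:.2f}", (x0, max(y0 - 5, 0)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("PoseAction", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

For the online real-time mode mentioned in the abstract, `source` can be a
camera index or a stream URL, both of which `cv2.VideoCapture` accepts
directly.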
Related papers
- HabitAction: A Video Dataset for Human Habitual Behavior Recognition [3.7478789114676108]
Human habitual behaviors (HHBs) hold significant importance for analyzing a person's personality, habits, and psychological changes.
In this work, we build a novel video dataset to demonstrate various HHBs.
The dataset contains 30 categories of habitual behaviors including more than 300,000 frames and 6,899 action instances.
arXiv Detail & Related papers (2024-08-24T04:40:31Z)
- SMART: Scene-motion-aware human action recognition framework for mental disorder group [16.60713558596286]
We propose to build a vision-based Human Action Recognition dataset including abnormal actions often occurring in the mental disorder group.
We then introduce a novel Scene-Motion-aware Action Recognition framework, named SMART, consisting of two technical modules.
The effectiveness of our proposed method has been validated on our self-collected HAR dataset (HAD), achieving 94.9% and 93.1% accuracy on unseen subjects and scenes, and outperforming state-of-the-art approaches by 6.5% and 13.2%, respectively.
arXiv Detail & Related papers (2024-06-07T05:29:42Z)
- Position and Orientation-Aware One-Shot Learning for Medical Action Recognition from Signal Data [9.757753196253532]
We propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data.
The proposed framework comprises two stages, each of which includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules; a generic DTW distance is sketched after this entry.
arXiv Detail & Related papers (2023-09-27T13:08:15Z)
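The DTW module above aligns sequences that evolve at different speeds before
comparing them. As a generic illustration of the underlying algorithm (not the
paper's specific module, which operates on learned features), a textbook DTW
distance in Python:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Illustrative only: the paper's SIG/CsA/DTW pipeline works on signal-level
    images and learned features rather than raw scalars.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best alignment ending at (i, j): match, insertion, or deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Identical shapes at different speeds align perfectly:
# dtw_distance([1, 2, 3], [1, 1, 2, 3]) == 0.0
```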
- Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition [33.68311764817763]
We propose an Actionlet-Dependent Contrastive Learning method (ActCLR).
The actionlet, defined as the discriminative subset of the human skeleton, effectively decomposes motion regions for better action modeling.
Different data transformations are applied to actionlet and non-actionlet regions to introduce more diversity while maintaining their own characteristics; a schematic sketch of this idea follows this entry.
arXiv Detail & Related papers (2023-03-20T06:47:59Z)
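The region-dependent transformations above can be pictured as masked
augmentation: actionlet joints receive a mild perturbation so their
discriminative motion is preserved, while non-actionlet joints receive a
stronger one. A schematic NumPy sketch of this idea (the actual ActCLR
transformations are more sophisticated):

```python
import numpy as np

def actionlet_aware_augment(skeleton, actionlet_mask,
                            weak_noise=0.01, strong_noise=0.1, seed=None):
    """Perturb non-actionlet joints strongly and actionlet joints mildly.

    skeleton: (T, J, C) array of J joint coordinates over T frames.
    actionlet_mask: (J,) boolean array, True for discriminative joints.
    """
    rng = np.random.default_rng(seed)
    out = skeleton.copy()
    # Mild jitter keeps the characteristics of the discriminative region.
    out[:, actionlet_mask] += rng.normal(
        0.0, weak_noise, out[:, actionlet_mask].shape)
    # Strong jitter adds diversity where the motion is non-discriminative.
    out[:, ~actionlet_mask] += rng.normal(
        0.0, strong_noise, out[:, ~actionlet_mask].shape)
    return out
```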
- DisenHCN: Disentangled Hypergraph Convolutional Networks for Spatiotemporal Activity Prediction [53.76601630407521]
We propose a hypergraph network model called DisenHCN to bridge the gaps in existing solutions.
In particular, we first unify fine-grained user similarity and the complex matching between user preferences and temporal activity into a heterogeneous hypergraph.
We then disentangle the user representations into different aspects (location-aware, time-aware, and activity-aware) and aggregate each aspect's features on the constructed hypergraph.
arXiv Detail & Related papers (2022-08-14T06:51:54Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks; an illustrative heatmap generator is sketched after this entry.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
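A relation keypoint heatmap can be realized as a dense target that is high
along the segment joining two landmarks. The sketch below is one simple way to
generate such a map (purely illustrative; the paper devises its own generation
algorithm):

```python
import numpy as np

def relation_heatmap(shape, p1, p2, sigma=3.0):
    """Gaussian 'tube' along the segment between landmarks p1 and p2.

    shape: (height, width) of the output map; p1, p2: (x, y) landmark
    coordinates. Returns an array in [0, 1], peaking on the segment.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    d = p2 - p1
    # Project every pixel onto the segment, clamping to its endpoints.
    t = ((xs - p1[0]) * d[0] + (ys - p1[1]) * d[1]) / max(d @ d, 1e-8)
    t = np.clip(t, 0.0, 1.0)
    proj_x = p1[0] + t * d[0]
    proj_y = p1[1] + t * d[1]
    dist2 = (xs - proj_x) ** 2 + (ys - proj_y) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))
```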
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance compared with state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
- One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z)
- Spatial-Temporal Alignment Network for Action Recognition and Detection [80.19235282200697]
This paper studies how to introduce viewpoint-invariant feature representations that can help action recognition and detection.
We propose a novel Spatial-Temporal Alignment Network (STAN) that aims to learn geometric invariant representations for action recognition and action detection.
We test our STAN model extensively on AVA, Kinetics-400, AVA-Kinetics, Charades, and Charades-Ego datasets.
arXiv Detail & Related papers (2020-12-04T06:23:40Z)
- Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction [11.285529781751984]
We propose an attention-oriented multi-level network framework to meet the need for real-time interaction.
Specifically, a Pre-Attention network is employed to roughly focus on the interactor in the scene at low resolution.
A compact CNN then receives the extracted skeleton sequence as input for action recognition.
arXiv Detail & Related papers (2020-07-02T12:41:28Z)
- Intra- and Inter-Action Understanding via Temporal Action Parsing [118.32912239230272]
We construct a new dataset of sport videos with manual annotations of sub-actions, and conduct a study of temporal action parsing on top of it.
Our study shows that a sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition.
We also investigate a number of temporal parsing methods, and thereon devise an improved method that is capable of mining sub-actions from training data without knowing their labels.
arXiv Detail & Related papers (2020-05-20T17:45:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.