Analysis and Evaluation of Kinect-based Action Recognition Algorithms
- URL: http://arxiv.org/abs/2112.08626v3
- Date: Fri, 7 Jun 2024 09:07:32 GMT
- Title: Analysis and Evaluation of Kinect-based Action Recognition Algorithms
- Authors: Lei Wang,
- Abstract summary: We implement and improve the HDG algorithm and apply it to cross-view action recognition using the UWA3D Multiview Activity dataset.
The experimental results show that our improved HDG outperforms the other three state-of-the-art algorithms for cross-view action recognition.
- Score: 2.7064617166078087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although human action recognition has been widely applied in many areas, it still faces challenging problems such as viewpoint variation, occlusion, lighting conditions, human body size, and the speed of action execution. To tackle these challenges, the Kinect depth sensor was developed to record real-time depth sequences, which are insensitive to the color of human clothes and to illumination conditions. Many methods for recognizing human actions have been reported in the literature, such as HON4D, HOPC, RBD, and HDG, which use 4D surface normals, point clouds, skeleton-based models, and depth gradients, respectively, to capture discriminative information from depth videos or skeleton data. In this research project, the performance of the four aforementioned algorithms is analyzed and evaluated on five benchmark datasets, which cover challenging issues such as noise, changes of viewpoint, background clutter, and occlusion. We also implemented and improved the HDG algorithm and applied it to cross-view action recognition using the UWA3D Multiview Activity dataset. Moreover, we used different combinations of the individual feature vectors in HDG for performance evaluation. The experimental results show that our improved HDG outperforms the other three state-of-the-art algorithms for cross-view action recognition.
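As a rough illustration of the kind of feature the abstract describes, the sketch below computes a toy HDG-style descriptor: histograms of depth-gradient orientations pooled over a spatial grid and accumulated over time. This is a minimal assumption-laden sketch, not the authors' implementation; the function name, grid size, and bin count are all illustrative choices.

```python
import numpy as np

def depth_gradient_histogram(depth_seq, bins=8, grid=(4, 4)):
    """Toy HDG-style descriptor (illustrative, not the paper's method).

    depth_seq: (T, H, W) array of depth frames.
    Returns an L2-normalized vector of length grid_h * grid_w * bins,
    built from magnitude-weighted gradient-orientation histograms.
    """
    T, H, W = depth_seq.shape
    gh, gw = grid
    feats = np.zeros((gh, gw, bins))
    for t in range(T):
        gy, gx = np.gradient(depth_seq[t].astype(float))  # per-frame depth gradients
        mag = np.hypot(gx, gy)                            # gradient magnitude
        ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)       # orientation in [0, 2*pi)
        bin_idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
        for i in range(gh):                               # pool over a spatial grid
            for j in range(gw):
                ys = slice(i * H // gh, (i + 1) * H // gh)
                xs = slice(j * W // gw, (j + 1) * W // gw)
                for b in range(bins):
                    mask = bin_idx[ys, xs] == b
                    feats[i, j, b] += mag[ys, xs][mask].sum()
    v = feats.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

Concatenating such per-modality vectors (e.g. depth gradients plus skeleton features) is one plausible reading of the "different combinations of individual feature vectors" evaluated in the paper.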
Related papers
- Perceptual Quality Assessment of 3D Gaussian Splatting: A Subjective Dataset and Prediction Metric [76.66966098297986]
We present 3DGS-QA, the first subjective quality assessment dataset for 3DGS.
It comprises 225 degraded reconstructions across 15 object types, enabling a controlled investigation of common distortion factors.
Our model extracts spatial and photometric cues from the Gaussian representation to estimate perceived quality in a structure-aware manner.
arXiv Detail & Related papers (2025-11-11T09:34:20Z) - Human Action Recognition from Point Clouds over Time [0.6345523830122167]
This paper presents a novel approach for recognizing actions from 3D videos by introducing a pipeline that segments human point clouds from the background of a scene.
The method supports point clouds from both depth sensors and monocular depth estimation.
Experiments incorporate auxiliary point features, including surface normals, color, infrared intensity, and body part parsing labels, to enhance recognition accuracy.
arXiv Detail & Related papers (2025-10-07T01:51:27Z) - Research on Image Recognition Technology Based on Multimodal Deep Learning [24.259653149898167]
This project investigates a human multi-modal behavior identification algorithm based on deep neural networks.
The performance of the proposed algorithm was evaluated on the MSR3D dataset.
arXiv Detail & Related papers (2024-05-06T01:05:21Z) - Two Approaches to Supervised Image Segmentation [55.616364225463066]
The present work reports comparative experiments between deep learning and multiset-neuron approaches.
The deep learning approach confirmed its potential for performing image segmentation.
The alternative multiset methodology achieved enhanced accuracy while requiring few computational resources.
arXiv Detail & Related papers (2023-07-19T16:42:52Z) - Neural Point-based Volumetric Avatar: Surface-guided Neural Points for Efficient and Photorealistic Volumetric Head Avatar [62.87222308616711]
We propose Neural Point-based Volumetric Avatar, a method that adopts a neural point representation and a neural volume rendering process.
Specifically, the neural points are strategically constrained around the surface of the target expression via a high-resolution UV displacement map.
By design, our method is better equipped to handle topologically changing regions and thin structures while also ensuring accurate expression control when animating avatars.
arXiv Detail & Related papers (2023-07-11T03:40:10Z) - Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z) - A Multi-viewpoint Outdoor Dataset for Human Action Recognition [3.522154868524807]
We present a multi-viewpoint outdoor action recognition dataset collected from YouTube and our own drone.
The dataset consists of 20 dynamic human action classes, 2324 video clips and 503086 frames.
The overall baseline action recognition accuracy is 74.0%.
arXiv Detail & Related papers (2021-10-07T14:50:43Z) - A Benchmark for Gait Recognition under Occlusion Collected by Multi-Kinect SDAS [6.922350076348358]
We collect a new gait database, called the OG RGB+D database, which overcomes limitations of existing gait databases.
Azure Kinect DK can simultaneously collect multimodal data to support different types of gait recognition algorithms.
We propose a gait recognition method, SkeletonGait, based on a human dual-skeleton model.
arXiv Detail & Related papers (2021-07-19T16:01:18Z) - Robust Data Hiding Using Inverse Gradient Attention [82.73143630466629]
In the data hiding task, each pixel of cover images should be treated differently since they have divergent tolerabilities.
We propose a novel deep data hiding scheme with Inverse Gradient Attention (IGA), combining the ideas of adversarial learning and attention mechanisms.
Empirically, extensive experiments show that the proposed model outperforms the state-of-the-art methods on two prevalent datasets.
arXiv Detail & Related papers (2020-11-21T19:08:23Z) - Parallax Motion Effect Generation Through Instance Segmentation And Depth Estimation [1.8350736912359715]
We propose an algorithm for generating parallax motion effects from a single image.
We show that the PyD-Net network (depth estimation) combined with Mask R-CNN or FBNet networks can produce parallax motion effects with good visual quality.
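To make the parallax idea above concrete: once a per-pixel depth map is available (e.g. from a depth-estimation network), a parallax effect amounts to shifting nearer pixels more than farther ones. The sketch below is a deliberately simplified illustration under that assumption; it is not the paper's pipeline, and the function name, shift range, and disocclusion handling are illustrative choices.

```python
import numpy as np

def parallax_shift(image, depth, max_shift=8):
    """Toy single-frame parallax (illustrative, not the paper's method).

    image: (H, W, C) uint8 frame; depth: (H, W) positive depths.
    Each pixel is shifted horizontally in proportion to its inverse
    depth, so nearer pixels move more. Disoccluded pixels simply keep
    their original values instead of being inpainted.
    """
    H, W = depth.shape
    inv = 1.0 / np.maximum(depth, 1e-6)                  # nearer => larger
    inv = (inv - inv.min()) / max(np.ptp(inv), 1e-6)     # normalize to [0, 1]
    shift = (inv * max_shift).astype(int)                # per-pixel shift in px
    out = image.copy()
    cols = np.arange(W)
    for y in range(H):
        new_x = np.clip(cols + shift[y], 0, W - 1)       # move pixels rightward
        out[y, new_x] = image[y, cols]
    return out
```

Rendering several frames with gradually increasing `max_shift` yields the motion effect; a real pipeline would additionally inpaint the disoccluded regions revealed behind foreground objects.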
arXiv Detail & Related papers (2020-10-06T12:56:59Z) - 3DFCNN: Real-Time Action Recognition using 3D Deep Neural Networks with
Raw Depth Information [1.3854111346209868]
This paper describes an approach for real-time human action recognition from raw depth image-sequences, provided by an RGB-D camera.
The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from depth sequences without any costly pre-processing.
arXiv Detail & Related papers (2020-06-13T23:24:07Z) - A Deep Learning Approach for Motion Forecasting Using 4D OCT Data [69.62333053044712]
We propose 4D-temporal deep learning for end-to-end motion forecasting and estimation using a stream of OCT volumes.
Our best performing 4D method achieves motion forecasting with an overall average correlation of 97.41%, while also improving motion estimation performance by a factor of 2.5 compared to a previous 3D approach.
arXiv Detail & Related papers (2020-04-21T15:59:53Z) - Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition [55.15661254072032]
We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER).
We first propose a novel augmentation method to combat the data limitation problem for deep learning.
We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
arXiv Detail & Related papers (2020-02-08T13:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.