3DFCNN: Real-Time Action Recognition using 3D Deep Neural Networks with
Raw Depth Information
- URL: http://arxiv.org/abs/2006.07743v1
- Date: Sat, 13 Jun 2020 23:24:07 GMT
- Title: 3DFCNN: Real-Time Action Recognition using 3D Deep Neural Networks with
Raw Depth Information
- Authors: Adrian Sanchez-Caballero, Sergio de López-Diz, David
Fuentes-Jimenez, Cristina Losada-Gutiérrez, Marta Marrón-Romera, David
Casillas-Perez, Mohammad Ibrahim Sarker
- Abstract summary: This paper describes an approach for real-time human action recognition from raw depth image-sequences, provided by an RGB-D camera.
The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from depth sequences without costly pre-processing.
- Score: 1.3854111346209868
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human action recognition is a fundamental task in artificial vision that
has earned great importance in recent years due to its multiple applications
in different areas. In this context, this paper describes an approach for real-time
human action recognition from raw depth image sequences, provided by an RGB-D
camera. The proposal is based on a 3D fully convolutional neural network, named
3DFCNN, which automatically encodes spatio-temporal patterns from depth
sequences without costly pre-processing. Furthermore, the described 3D-CNN
allows automatic feature extraction and action classification from the
spatial and temporal encoded information of depth sequences. The use of depth
data ensures that action recognition is carried out protecting people's
privacy, since their identities cannot be recognized from these data.
3DFCNN has been evaluated and its results compared to those from other
state-of-the-art methods within three widely used datasets with different
characteristics (resolution, sensor type, number of views, camera location,
etc.). The obtained results validate the proposal, showing that it outperforms
several state-of-the-art approaches based on classical computer vision
techniques. Furthermore, it achieves action recognition accuracy comparable to
deep learning based state-of-the-art methods with a lower computational cost,
which allows its use in real-time applications.
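The core idea of the abstract, encoding spatio-temporal patterns from a depth sequence with 3D convolutions, can be illustrated with a minimal sketch. The code below is a hypothetical toy example assuming NumPy; the clip shape, kernel size, and `conv3d` function are illustrative and are not the paper's actual 3DFCNN architecture.

```python
import numpy as np

def conv3d(clip, kernel):
    """Valid 3D convolution (no padding, stride 1) of a single-channel
    depth clip of shape (T, H, W) with a kernel of shape (kt, kh, kw).
    Each output value mixes several frames, so the feature couples
    motion (time) with spatial structure."""
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(clip[t:t+kt, i:i+kh, j:j+kw] * kernel)
    return out

# A toy 8-frame, 16x16 raw depth clip (values in millimetres).
rng = np.random.default_rng(0)
clip = rng.uniform(500.0, 4000.0, size=(8, 16, 16))

# A 3x3x3 kernel spans 3 frames and a 3x3 spatial window at once.
kernel = rng.standard_normal((3, 3, 3))

features = conv3d(clip, kernel)
print(features.shape)  # (6, 14, 14)
```

In a real 3D-CNN this operation is stacked over many kernels and layers (with learned weights), but the single kernel above already shows why 3D convolutions see motion: identical frames and moving frames produce different responses.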
Related papers
- Research on Image Recognition Technology Based on Multimodal Deep Learning [24.259653149898167]
This project investigates the human multi-modal behavior identification algorithm utilizing deep neural networks.
The performance of the suggested algorithm was evaluated using the MSR3D data set.
arXiv Detail & Related papers (2024-05-06T01:05:21Z)
- Depth Map Denoising Network and Lightweight Fusion Network for Enhanced 3D Face Recognition [61.27785140017464]
We introduce an innovative Depth map denoising network (DMDNet) based on the Denoising Implicit Image Function (DIIF) to reduce noise.
We further design a powerful recognition network called Lightweight Depth and Normal Fusion network (LDNFNet) to learn unique and complementary features between different modalities.
arXiv Detail & Related papers (2024-01-01T10:46:42Z)
- SpATr: MoCap 3D Human Action Recognition based on Spiral Auto-encoder and Transformer Network [1.4732811715354455]
We introduce a novel approach for 3D human action recognition, denoted as SpATr (Spiral Auto-encoder and Transformer Network).
A lightweight auto-encoder, based on spiral convolutions, is employed to extract spatial geometrical features from each 3D mesh.
The proposed method is evaluated on three prominent 3D human action datasets: Babel, MoVi, and BMLrub.
arXiv Detail & Related papers (2023-06-30T11:49:00Z)
- Advancing 3D finger knuckle recognition via deep feature learning [51.871256510747465]
Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience.
Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks with multiple scales.
This paper advances this approach by investigating the possibility of learning a discriminative feature vector with the least possible dimension for representing 3D finger knuckle images.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolution neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves the state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- VR3Dense: Voxel Representation Learning for 3D Object Detection and Monocular Dense Depth Reconstruction [0.951828574518325]
We introduce a method for jointly training 3D object detection and monocular dense depth reconstruction neural networks.
It takes as inputs, a LiDAR point-cloud, and a single RGB image during inference and produces object pose predictions as well as a densely reconstructed depth map.
While our object detection is trained in a supervised manner, the depth prediction network is trained with both self-supervised and supervised loss functions.
arXiv Detail & Related papers (2021-04-13T04:25:54Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short term-memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Towards Dense People Detection with Deep Learning and Depth images [9.376814409561726]
This paper proposes a DNN-based system that detects multiple people from a single depth image.
Our neural network processes a depth image and outputs a likelihood map in image coordinates.
We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training.
arXiv Detail & Related papers (2020-07-14T16:43:02Z)
- Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition [55.15661254072032]
We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER).
We first propose a novel augmentation method to combat the data limitation problem for deep learning.
We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
arXiv Detail & Related papers (2020-02-08T13:09:11Z)
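One recurring technique in the list above is building 3D-CNNs by inflating the weights of a pre-trained 2D-CNN (as in the continuous emotion recognition entry). A minimal sketch of the standard inflation step follows, assuming NumPy; the kernel shapes and function name are toy choices for illustration, not any paper's actual implementation.

```python
import numpy as np

def inflate_2d_kernel(w2d, t):
    """Inflate a pre-trained 2D kernel (kh, kw) into a 3D kernel
    (t, kh, kw) by tiling it over time and rescaling by 1/t, so that
    a static clip (the same frame repeated) yields the same response
    as the original 2D kernel on a single image."""
    return np.tile(w2d, (t, 1, 1)) / t

w2d = np.arange(9, dtype=float).reshape(3, 3)  # toy "pre-trained" 2D weights
w3d = inflate_2d_kernel(w2d, t=3)

# On a static clip, the inflated kernel reproduces the 2D response.
patch = np.ones((3, 3))                  # one image patch
static_clip = np.tile(patch, (3, 1, 1))  # the same patch repeated in time
print(w3d.shape)  # (3, 3, 3)
```

The 1/t rescaling is what preserves the pre-trained activations at initialization; fine-tuning then lets the replicated temporal slices diverge to capture motion.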
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.