Related papers: Video-based Human Action Recognition using Deep Learning: A Review

Video-based Human Action Recognition using Deep Learning: A Review

URL: http://arxiv.org/abs/2208.03775v1
Date: Sun, 7 Aug 2022 17:12:12 GMT
Title: Video-based Human Action Recognition using Deep Learning: A Review
Authors: Hieu H. Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastin
Abstract summary: Human action recognition is an important application domain in computer vision. Deep learning has been given particular attention by the computer vision community. This paper presents an overview of the current state-of-the-art in action recognition using video analysis with deep learning techniques.
Score: 4.976815699476327
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human action recognition is an important application domain in computer vision. Its primary aim is to accurately describe human actions and their interactions from a previously unseen data sequence acquired by sensors. The ability to recognize, understand, and predict complex human actions enables the construction of many important applications such as intelligent surveillance systems, human-computer interfaces, health care, security, and military applications. In recent years, deep learning has been given particular attention by the computer vision community. This paper presents an overview of the current state-of-the-art in action recognition using video analysis with deep learning techniques. We present the most important deep learning models for recognizing human actions, and analyze them to provide the current progress of deep learning algorithms applied to solve human action recognition problems in realistic videos highlighting their advantages and disadvantages. Based on the quantitative analysis using recognition accuracies reported in the literature, our study identifies state-of-the-art deep architectures in action recognition and then provides current trends and open problems for future works in this field.

Related papers

Emergent Active Perception and Dexterity of Simulated Humanoids from Visual Reinforcement Learning [69.71072181304066]
We introduce Perceptive Dexterous Control (PDC), a framework for vision-driven whole-body control with simulated humanoids.<n>PDC operates solely on egocentric vision for task specification, enabling object search, target placement, and skill selection through visual cues.<n>We show that training from scratch with reinforcement learning can produce emergent behaviors such as active search.
arXiv Detail & Related papers (2025-05-18T07:33:31Z)
A Critical Analysis on Machine Learning Techniques for Video-based Human Activity Recognition of Surveillance Systems: A Review [1.3693860189056777]
Upsurging abnormal activities in crowded locations urges the necessity for an intelligent surveillance system. Video-based human activity recognition has intrigued many researchers with its pressing issues. This paper provides a critical survey of video-based Human Activity Recognition (HAR) techniques.
arXiv Detail & Related papers (2024-09-01T14:43:57Z)
Emotion Recognition from the perspective of Activity Recognition [0.0]
Appraising human emotional states, behaviors, and reactions displayed in real-world settings can be accomplished using latent continuous dimensions. For emotion recognition systems to be deployed and integrated into real-world mobile and computing devices, we need to consider data collected in the world. We propose a novel three-stream end-to-end deep learning regression pipeline with an attention mechanism.
arXiv Detail & Related papers (2024-03-24T18:53:57Z)
Computer Vision for Primate Behavior Analysis in the Wild [61.08941894580172]
Video-based behavioral monitoring has great potential for transforming how we study animal cognition and behavior. There is still a fairly large gap between the exciting prospects and what can actually be achieved in practice today.
arXiv Detail & Related papers (2024-01-29T18:59:56Z)
Deep Neural Networks in Video Human Action Recognition: A Review [21.00217656391331]
Video behavior recognition is one of the most foundational tasks of computer vision. Deep neural networks are built for recognizing pixel-level information such as images with RGB, RGB-D, or optical flow formats. In our article, the performance of deep neural networks surpassed most of the techniques in the feature learning and extraction tasks.
arXiv Detail & Related papers (2023-05-25T03:54:41Z)
Context-Aware Sequence Alignment using 4D Skeletal Augmentation [67.05537307224525]
Temporal alignment of fine-grained human actions in videos is important for numerous applications in computer vision, robotics, and mixed reality. We propose a novel context-aware self-supervised learning architecture to align sequences of actions. Specifically, CASA employs self-attention and cross-attention mechanisms to incorporate the spatial and temporal context of human actions.
arXiv Detail & Related papers (2022-04-26T10:59:29Z)
Classifying Human Activities with Inertial Sensors: A Machine Learning Approach [0.0]
Human Activity Recognition (HAR) is an ongoing research topic. It has applications in medical support, sports, fitness, social networking, human-computer interfaces, senior care, entertainment, surveillance, and the list goes on. We examined and analyzed different Machine Learning and Deep Learning approaches for Human Activity Recognition using inertial sensor data of smartphones.
arXiv Detail & Related papers (2021-11-09T08:17:33Z)
Affect Analysis in-the-wild: Valence-Arousal, Expressions, Action Units and a Unified Framework [83.21732533130846]
The paper focuses on large in-the-wild databases, i.e., Aff-Wild and Aff-Wild2. It presents the design of two classes of deep neural networks trained with these databases. A novel multi-task and holistic framework is presented which is able to jointly learn and effectively generalize and perform affect recognition.
arXiv Detail & Related papers (2021-03-29T17:36:20Z)
Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild. We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short term-memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
A Grid-based Representation for Human Action Recognition [12.043574473965318]
Human action recognition (HAR) in videos is a fundamental research topic in computer vision. We propose a novel method for action recognition that encodes efficiently the most discriminative appearance information of an action. Our method is tested on several benchmark datasets demonstrating that our model can accurately recognize human actions.
arXiv Detail & Related papers (2020-10-17T18:25:00Z)
Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine should be able to recognize the emotional state of the user with high accuracy. Deep neural networks have been used with great success in recognizing emotions. We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities [52.59080024266596]
We present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information for public datasets. We then propose a new taxonomy to structure the deep methods by challenges.
arXiv Detail & Related papers (2020-01-21T09:55:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.