A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities
- URL: http://arxiv.org/abs/2409.09678v1
- Date: Sun, 15 Sep 2024 10:04:44 GMT
- Title: A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities
- Authors: Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah, Satoshi Nishimura
- Abstract summary: Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action.
HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals.
This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024.
- Score: 2.916558661202724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024, focusing on machine learning (ML) and deep learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human-object interactions, and activity detection. Our survey includes a detailed dataset description for each modality and a summary of the latest HAR systems, offering comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR.
Related papers
- A Methodological and Structural Review of Hand Gesture Recognition Across Diverse Data Modalities [1.6144710323800757]
Hand Gesture Recognition (HGR) systems enhance natural, efficient, and authentic human-computer interaction.
Despite significant progress, automatic and precise identification of hand gestures remains a considerable challenge in computer vision.
This paper provides a comprehensive review of HGR techniques and data modalities from 2014 to 2024, exploring advancements in sensor technology and computer vision.
arXiv Detail & Related papers (2024-08-10T04:40:01Z)
- A Survey on Multimodal Wearable Sensor-based Human Action Recognition [15.054052500762559]
Wearable Sensor-based Human Activity Recognition (WSHAR) emerges as a promising assistive technology to support the daily lives of older individuals.
Recent surveys in WSHAR have been limited, focusing either solely on deep learning approaches or on a single sensor modality.
In this study, we present a comprehensive survey, aimed at newcomers and researchers, on how to apply multimodal learning to the WSHAR domain.
arXiv Detail & Related papers (2024-04-14T18:43:16Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning, which incorporates data from various sources, has become an increasingly popular research area.
We analyze the commonness and uniqueness of each data format mainly ranging from vision, audio, text, and motions.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z)
- Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition [1.869225486385596]
We explore the hypothesis that leveraging multiple modalities can lead to better recognition.
We extend a number of recent contrastive self-supervised approaches for the task of Human Activity Recognition.
We propose a flexible, general-purpose framework for performing multimodal self-supervised learning.
arXiv Detail & Related papers (2022-05-20T10:39:16Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only few samples given for each class.
Although progress has been made in coarse-grained actions, existing few-shot recognition methods encounter two issues handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Human Action Recognition from Various Data Modalities: A Review [37.07491839026713]
Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action.
HAR has a wide range of applications, and has been attracting increasing attention in the field of computer vision.
We present a survey of recent progress in deep learning methods for HAR based on the type of input data modality.
arXiv Detail & Related papers (2020-12-22T07:37:43Z)
- Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is a task to identify various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
- Learning-to-Learn Personalised Human Activity Recognition Models [1.5087842661221904]
We present a meta-learning methodology for learning to learn personalised HAR models.
We introduce two algorithms, Personalised MAML and Personalised Relation Networks, inspired by existing Meta-Learning algorithms.
A comparative study shows significant performance improvements against the state-of-the-art Deep Learning algorithms and the Few-shot Meta-Learning algorithms in multiple HAR domains.
arXiv Detail & Related papers (2020-06-12T21:11:59Z)