Human Action Recognition from Various Data Modalities: A Review
- URL: http://arxiv.org/abs/2012.11866v3
- Date: Fri, 29 Jan 2021 12:13:25 GMT
- Title: Human Action Recognition from Various Data Modalities: A Review
- Authors: Zehua Sun, Jun Liu, Qiuhong Ke, Hossein Rahmani, Mohammed Bennamoun
and Gang Wang
- Abstract summary: Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action.
HAR has a wide range of applications, and has been attracting increasing attention in the field of computer vision.
We present a survey of recent progress in deep learning methods for HAR based on the type of input data modality.
- Score: 37.07491839026713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human Action Recognition (HAR) aims to understand human behavior and assign a
label to each action. It has a wide range of applications, and therefore has
been attracting increasing attention in the field of computer vision. Human
actions can be represented using various data modalities, such as RGB,
skeleton, depth, infrared, point cloud, event stream, audio, acceleration,
radar, and WiFi signal, which encode different sources of useful yet distinct
information and have various advantages depending on the application scenarios.
Consequently, many existing works have investigated different types of
approaches for HAR using various modalities. In this paper, we present
a comprehensive survey of recent progress in deep learning methods for HAR
based on the type of input data modality. Specifically, we review the current
mainstream deep learning methods for single data modalities and multiple data
modalities, including the fusion-based and the co-learning-based frameworks. We
also present comparative results on several benchmark datasets for HAR,
together with insightful observations and inspiring future research directions.
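As a concrete illustration of the fusion-based frameworks the survey reviews, below is a minimal PyTorch sketch of score-level (late) fusion over two modality streams. The two-stream layout, feature dimensions, and class count are illustrative assumptions, not a method prescribed by the survey.

```python
import torch
import torch.nn as nn

class LateFusionHAR(nn.Module):
    """Score-level (late) fusion of two modality streams.

    Feature dimensions, heads, and the 60-class output are placeholder
    assumptions; real systems might use a 3D CNN for RGB and a GCN for
    skeletons as backbones.
    """
    def __init__(self, rgb_dim=2048, skel_dim=256, num_classes=60):
        super().__init__()
        self.rgb_head = nn.Linear(rgb_dim, num_classes)    # per-modality classifier
        self.skel_head = nn.Linear(skel_dim, num_classes)

    def forward(self, rgb_feat, skel_feat):
        # Late fusion: average the class scores of the two streams.
        return 0.5 * (self.rgb_head(rgb_feat) + self.skel_head(skel_feat))

model = LateFusionHAR()
rgb = torch.randn(4, 2048)   # stand-in backbone features for 4 clips
skel = torch.randn(4, 256)
logits = model(rgb, skel)    # (4, 60) fused action scores
```

Feature-level fusion would instead combine the modality features before a single classifier; co-learning frameworks go further and transfer knowledge across modalities during training.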
Related papers
- A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities [2.916558661202724]
Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action.
HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals.
This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024.
arXiv Detail & Related papers (2024-09-15T10:04:44Z)
- Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition [0.2944538605197902]
We present an approach for the economical use of publicly available, labeled HAR datasets for effective transfer learning.
We introduce a novel transfer learning framework, Cross-Domain HAR, which follows the teacher-student self-training paradigm.
We demonstrate the effectiveness of our approach in practically relevant few-shot activity recognition scenarios.
arXiv Detail & Related papers (2023-10-22T19:13:25Z)
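The teacher-student self-training paradigm mentioned above can be sketched as follows; the confidence threshold, plain cross-entropy loss, and stand-in linear models are generic assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def self_training_step(teacher, student, unlabeled_x, optimizer, threshold=0.9):
    """One generic self-training step: the teacher pseudo-labels unlabeled
    target-domain data; the student trains only on confident labels.
    The 0.9 threshold is an illustrative choice."""
    with torch.no_grad():
        probs = F.softmax(teacher(unlabeled_x), dim=1)
        confidence, pseudo_labels = probs.max(dim=1)
        keep = confidence >= threshold           # filter uncertain pseudo-labels
    if keep.any():
        loss = F.cross_entropy(student(unlabeled_x[keep]), pseudo_labels[keep])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Stand-in linear models in place of real HAR networks.
teacher, student = torch.nn.Linear(16, 5), torch.nn.Linear(16, 5)
opt = torch.optim.SGD(student.parameters(), lr=0.1)
self_training_step(teacher, student, torch.randn(32, 16), opt)
```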
- Unified Demonstration Retriever for In-Context Learning [56.06473069923567]
Unified Demonstration Retriever (UDR) is a single model to retrieve demonstrations for a wide range of tasks.
We propose a multi-task list-wise ranking training framework, with an iterative mining strategy to find high-quality candidates.
Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines.
arXiv Detail & Related papers (2023-05-07T16:07:11Z)
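A list-wise ranking objective of the kind UDR's training framework builds on can be sketched with a ListNet-style loss; treating language-model feedback as the source of the target ranking is an assumption about the setup.

```python
import torch
import torch.nn.functional as F

def listwise_ranking_loss(retriever_scores, target_scores):
    """ListNet-style list-wise loss: push the retriever's score distribution
    over candidate demonstrations toward a target distribution (assumed here
    to come from language-model feedback). Shapes: (batch, num_candidates)."""
    log_p = F.log_softmax(retriever_scores, dim=1)   # retriever's ranking
    target = F.softmax(target_scores, dim=1)         # soft target ranking
    return -(target * log_p).sum(dim=1).mean()

loss = listwise_ranking_loss(torch.randn(2, 8), torch.randn(2, 8))
```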
- Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
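One plausible reading of using pose-estimation datasets as a transfer source for on-body HAR is to derive synthetic inertial-like signals from keypoint trajectories; the finite-difference scheme below is a simplified sketch, not the paper's full pipeline.

```python
import numpy as np

def keypoints_to_synthetic_accel(keypoints, fps=30.0):
    """Approximate a keypoint's acceleration from its trajectory with
    second-order finite differences. `keypoints`: (T, D) positions.
    Real pipelines would also handle smoothing, scaling, and sensor
    orientation; this is deliberately simplified."""
    dt = 1.0 / fps
    velocity = np.gradient(keypoints, dt, axis=0)   # first derivative
    return np.gradient(velocity, dt, axis=0)        # second derivative

# A fake 10-second 3D wrist trajectory standing in for pose-estimation output.
wrist = np.cumsum(np.random.randn(300, 3), axis=0) * 0.01
pseudo_imu = keypoints_to_synthetic_accel(wrist)    # (300, 3) accelerometer-like signal
```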
- Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning that incorporates data from various sources has become an increasingly popular research area.
We analyze the commonalities and unique characteristics of each data format, mainly spanning vision, audio, text, and motion.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z)
- Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition [1.869225486385596]
We explore the hypothesis that leveraging multiple modalities can lead to better recognition.
We extend a number of recent contrastive self-supervised approaches for the task of Human Activity Recognition.
We propose a flexible, general-purpose framework for performing multimodal self-supervised learning.
arXiv Detail & Related papers (2022-05-20T10:39:16Z)
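At the core of contrastive multimodal self-supervision is an InfoNCE-style objective that aligns embeddings of the same clip across modalities; this generic sketch omits the paper's cross-modal knowledge-mining step for selecting positives.

```python
import torch
import torch.nn.functional as F

def cross_modal_infonce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE between two modality embeddings of the same clips.
    z_a, z_b: (batch, dim), where row i of each tensor encodes the same
    sample in a different modality (e.g. inertial vs. skeleton)."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature      # cosine similarities of all pairs
    labels = torch.arange(z_a.size(0))        # the diagonal holds the positives
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

loss = cross_modal_infonce(torch.randn(8, 128), torch.randn(8, 128))
```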
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
- Invariant Feature Learning for Sensor-based Human Activity Recognition [11.334750079923428]
We present an invariant feature learning framework (IFLF) that extracts common information shared across subjects and devices.
Experiments demonstrated that IFLF is effective in handling both subject and device variability across popular open datasets and an in-house dataset.
arXiv Detail & Related papers (2020-12-14T21:56:17Z)
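A common mechanism for extracting subject- and device-invariant features, in the spirit of IFLF, is a domain-adversarial branch with gradient reversal; the architecture below is a generic sketch with made-up dimensions, not IFLF's exact design.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, negated gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return grad.neg()

class InvariantHAR(nn.Module):
    """Shared encoder with an activity head and an adversarial subject/device
    head. Dimensions and layer choices are made up for illustration."""
    def __init__(self, in_dim=128, n_classes=10, n_domains=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.activity_head = nn.Linear(64, n_classes)
        self.domain_head = nn.Linear(64, n_domains)

    def forward(self, x):
        h = self.encoder(x)
        # Reversed gradients push the encoder to discard subject/device cues.
        return self.activity_head(h), self.domain_head(GradReverse.apply(h))

activity_logits, domain_logits = InvariantHAR()(torch.randn(4, 128))
```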
- Learning-to-Learn Personalised Human Activity Recognition Models [1.5087842661221904]
We present a meta-learning methodology for learning to learn personalised HAR models.
We introduce two algorithms, Personalised MAML and Personalised Relation Networks, inspired by existing meta-learning algorithms.
A comparative study shows significant performance improvements over state-of-the-art deep learning and few-shot meta-learning algorithms in multiple HAR domains.
arXiv Detail & Related papers (2020-06-12T21:11:59Z)
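MAML-style personalisation amounts to adapting a shared initialisation on a user's few labeled examples (inner loop) and meta-updating that initialisation from the adapted models' losses (outer loop). The sketch below shows only a generic first-order inner loop, not the paper's exact Personalised MAML.

```python
import copy
import torch
import torch.nn.functional as F

def personalise(model, support_x, support_y, inner_lr=0.01, steps=1):
    """Inner loop of a first-order MAML-style scheme: clone the shared
    initialisation and adapt it on one user's few labeled examples.
    The outer meta-update (not shown) would aggregate the adapted
    models' query-set losses back into the shared initialisation."""
    personal = copy.deepcopy(model)
    opt = torch.optim.SGD(personal.parameters(), lr=inner_lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(personal(support_x), support_y).backward()
        opt.step()
    return personal

base = torch.nn.Linear(24, 6)   # stand-in for a shared HAR network
user_model = personalise(base, torch.randn(10, 24), torch.randint(0, 6, (10,)))
```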
- Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)