CZU-MHAD: A multimodal dataset for human action recognition utilizing a
depth camera and 10 wearable inertial sensors
- URL: http://arxiv.org/abs/2202.03283v1
- Date: Mon, 7 Feb 2022 15:17:08 GMT
- Title: CZU-MHAD: A multimodal dataset for human action recognition utilizing a
depth camera and 10 wearable inertial sensors
- Authors: Xin Chao, Zhenjie Hou, Yujian Mo
- Abstract summary: CZU-MHAD (Changzhou University: a comprehensive multi-modal human action dataset) consists of 22 actions and temporally synchronized data from three modalities.
These modalities include depth videos and skeleton positions from a Kinect v2 camera, and inertial signals from 10 wearable sensors.
- Score: 1.0742675209112622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human action recognition is widely used in many areas of daily life, and
many human action datasets have been published. However, most multi-modal
databases have shortcomings in the layout and number of sensors, so they cannot
fully represent action features. To address these problems, this paper proposes
a freely available dataset named CZU-MHAD (Changzhou University: a comprehensive
multi-modal human action dataset). It consists of 22 actions with temporally
synchronized data from three modalities: depth videos and skeleton positions
from a Kinect v2 camera, and inertial signals from 10 wearable sensors. Compared
with a single-modality sensor, multi-modal sensors collect data of different
modalities and can therefore describe actions more accurately. Moreover,
CZU-MHAD obtains the 3-axis acceleration and 3-axis angular velocity of 10 main
motion joints by binding inertial sensors to them, and these data are captured
simultaneously. Experimental results show that this dataset can be used to
study the structural relationships between different parts of the human body
during actions, as well as fusion approaches that involve multi-modal sensor
data.
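The abstract fixes the per-sample structure: depth frames and skeleton positions from a Kinect v2 (whose SDK reports 25 joints per frame), plus 3-axis acceleration and 3-axis angular velocity from 10 joint-mounted inertial sensors, all temporally synchronized. The sketch below shows one possible in-memory representation of such a sample and a nearest-neighbor alignment of the inertial stream to the depth frame times; the class name, field names, array shapes, and sampling rates are illustrative assumptions, not the official CZU-MHAD file format.

```python
# Illustrative sketch only: the container layout, field names, and shapes are
# assumptions for a CZU-MHAD-style sample, not the official release format.
from dataclasses import dataclass
import numpy as np

N_SENSORS = 10   # inertial sensors bound to 10 main motion joints
N_JOINTS = 25    # the Kinect v2 SDK reports 25 skeleton joints per frame

@dataclass
class MultimodalSample:
    action_id: int            # one of the 22 actions
    depth: np.ndarray         # (T_depth, H, W) depth frames
    depth_ts: np.ndarray      # (T_depth,) frame timestamps in seconds
    skeleton: np.ndarray      # (T_depth, N_JOINTS, 3) joint positions
    imu: np.ndarray           # (T_imu, N_SENSORS, 6) 3-axis accel + 3-axis gyro
    imu_ts: np.ndarray        # (T_imu,) sample timestamps in seconds

def align_imu_to_depth(sample: MultimodalSample) -> np.ndarray:
    """For each depth/skeleton frame, pick the inertial reading closest in time,
    yielding a (T_depth, N_SENSORS, 6) array aligned with the visual streams."""
    diffs = np.abs(sample.imu_ts[None, :] - sample.depth_ts[:, None])
    nearest = diffs.argmin(axis=1)
    return sample.imu[nearest]

# Tiny synthetic example: 2 s of data, assuming 30 Hz depth and 50 Hz inertial
# sampling (real Kinect v2 depth frames are 512x424; small frames used here).
rng = np.random.default_rng(0)
sample = MultimodalSample(
    action_id=1,
    depth=rng.random((60, 8, 8)),
    depth_ts=np.arange(60) / 30.0,
    skeleton=rng.random((60, N_JOINTS, 3)),
    imu=rng.random((100, N_SENSORS, 6)),
    imu_ts=np.arange(100) / 50.0,
)
print(align_imu_to_depth(sample).shape)   # (60, 10, 6)
```

Once all three modalities share the same frame axis, per-frame fusion or joint-wise analysis of the kind the abstract mentions becomes straightforward.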
Related papers
- Motion Capture from Inertial and Vision Sensors [60.5190090684795]
MINIONS is a large-scale Motion capture dataset collected from INertial and visION Sensors.
We conduct experiments on multi-modal motion capture using a monocular camera and very few IMUs.
arXiv Detail & Related papers (2024-07-23T09:41:10Z)
- Disentangling Imperfect: A Wavelet-Infused Multilevel Heterogeneous Network for Human Activity Recognition in Flawed Wearable Sensor Data [30.213716132980874]
We propose a multilevel heterogeneous neural network, called MHNN, for sensor data analysis.
We utilize multilevel discrete wavelet decomposition to extract multi-resolution features from sensor data (a generic wavelet-feature sketch follows this list).
We equip the proposed model with heterogeneous feature extractors that enable the learning of multi-scale features.
arXiv Detail & Related papers (2024-01-26T06:08:49Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- Robust Multimodal Fusion for Human Activity Recognition [5.858726030608716]
We propose Centaur, a multimodal fusion model for human activity recognition (HAR) that is robust to data quality issues.
Centaur's data cleaning module outperforms 2 state-of-the-art autoencoder-based models, and its multimodal fusion module outperforms 4 strong baselines.
Compared to 2 related robust fusion architectures, Centaur is more robust, achieving 11.59-17.52% higher accuracy in HAR.
arXiv Detail & Related papers (2023-03-08T14:56:11Z)
- HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians.
Our method efficiently makes use of these complementary signals, in a semi-supervised fashion and outperforms existing methods with a large margin.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
arXiv Detail & Related papers (2022-12-15T11:15:14Z)
- mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors [6.955796938573367]
We present mRI, a multi-modal 3D human pose estimation dataset with mmWave, RGB-D, and Inertial Sensors.
Our dataset consists of over 160k synchronized frames from 20 subjects performing rehabilitation exercises.
arXiv Detail & Related papers (2022-10-15T23:08:44Z)
- DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both the time-series dynamics of the data and the relatedness of features from different sensors.
We propose a model, termed DynImp, that handles missing values at different time points using nearest neighbors along the feature axis (a simplified imputation sketch follows this list).
We show that the method can exploit multi-modal features from related sensors and also learn from historical time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
- HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling [83.57675975092496]
HuMMan is a large-scale multi-modal 4D human dataset with 1000 human subjects, 400k sequences and 60M frames.
HuMMan has several appealing properties: 1) multi-modal data and annotations including color images, point clouds, keypoints, SMPL parameters, and textured meshes.
arXiv Detail & Related papers (2022-04-28T17:54:25Z)
- Learning Online Multi-Sensor Depth Fusion [100.84519175539378]
SenFuNet is a depth fusion approach that learns sensor-specific noise and outlier statistics.
We conduct experiments with various sensor combinations on the real-world CoRBS and Scene3D datasets.
arXiv Detail & Related papers (2022-04-07T10:45:32Z)
- WaveGlove: Transformer-based hand gesture recognition using multiple inertial sensors [0.0]
Hand Gesture Recognition (HGR) based on inertial data has grown considerably in recent years.
In this work we explore the benefits of using multiple inertial sensors.
arXiv Detail & Related papers (2021-05-04T20:50:53Z)
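The MHNN entry above mentions multilevel discrete wavelet decomposition for extracting multi-resolution features from sensor data. The snippet below is a generic sketch of that idea using PyWavelets; the wavelet choice ("db4"), decomposition level, and per-band energy features are illustrative assumptions, not the configuration used in that paper.

```python
# Generic multi-resolution feature sketch with PyWavelets; the wavelet, level,
# and per-band log-energy features are illustrative choices, not MHNN's settings.
import numpy as np
import pywt

def wavelet_energy_features(signal: np.ndarray, wavelet: str = "db4",
                            level: int = 3) -> np.ndarray:
    """Decompose a 1-D sensor channel into level + 1 frequency bands and
    return the log-energy of each band as a compact multi-resolution feature."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)   # [cA_L, cD_L, ..., cD_1]
    return np.array([np.log(np.sum(c ** 2) + 1e-8) for c in coeffs])

# Example: features for one accelerometer axis sampled at 50 Hz for 2 seconds.
t = np.linspace(0, 2, 100)
channel = np.sin(2 * np.pi * 3 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
print(wavelet_energy_features(channel))   # 4 values: one per frequency band
```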
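DynImp itself couples temporal dynamics with cross-sensor relatedness; as a much simpler stand-in, the sketch below shows plain k-nearest-neighbor imputation with scikit-learn, which only captures the "nearest neighbors along the feature axis" part and is not the authors' method.

```python
# Simplified stand-in for feature-axis imputation: plain KNN imputation with
# scikit-learn. This is NOT DynImp, which also models time-series dynamics.
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))        # 200 time steps x 12 sensor channels
mask = rng.random(X.shape) < 0.2      # drop 20% of readings at random
X_missing = np.where(mask, np.nan, X)

imputer = KNNImputer(n_neighbors=5)   # neighbor time steps found via observed channels
X_filled = imputer.fit_transform(X_missing)
print(float(np.mean(np.abs(X_filled - X)[mask])))  # mean absolute error on the gaps
```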