mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D,
and Inertial Sensors
- URL: http://arxiv.org/abs/2210.08394v1
- Date: Sat, 15 Oct 2022 23:08:44 GMT
- Title: mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D,
and Inertial Sensors
- Authors: Sizhe An, Yin Li, Umit Ogras
- Abstract summary: We present mRI, a multi-modal 3D human pose estimation dataset with mmWave, RGB-D, and Inertial Sensors.
Our dataset consists of over 160k synchronized frames from 20 subjects performing rehabilitation exercises.
- Score: 6.955796938573367
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The ability to estimate 3D human body pose and movement, also known as human
pose estimation (HPE), enables many applications for home-based health
monitoring, such as remote rehabilitation training. Several possible solutions
have emerged using sensors ranging from RGB cameras and depth sensors to
millimeter-wave (mmWave) radars and wearable inertial sensors. Despite
previous efforts on datasets and benchmarks for HPE, few datasets exploit
multiple modalities and focus on home-based health monitoring. To bridge the
gap, we present mRI, a multi-modal 3D human pose estimation dataset with
mmWave, RGB-D, and Inertial Sensors. Our dataset consists of over 160k
synchronized frames from 20 subjects performing rehabilitation exercises and
supports the benchmarks of HPE and action detection. We perform extensive
experiments using our dataset and delineate the strength of each modality. We
hope that the release of mRI can catalyze the research in pose estimation,
multi-modal learning, and action understanding, and, more importantly, facilitate
the applications of home-based health monitoring.
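As a rough illustration of how a dataset of synchronized multi-modal frames like this might be consumed, the minimal sketch below pairs each RGB-D frame with the nearest mmWave and IMU samples by timestamp. This is only an assumption-laden example: the array names, sampling rates, and the nearest-timestamp pairing scheme are hypothetical and are not the official mRI loading API.

```python
import numpy as np


def nearest_index(reference_ts: np.ndarray, query_ts: float) -> int:
    """Return the index of the timestamp in `reference_ts` closest to `query_ts`."""
    return int(np.argmin(np.abs(reference_ts - query_ts)))


def synchronize(mmwave_ts: np.ndarray, rgbd_ts: np.ndarray, imu_ts: np.ndarray):
    """Pair every RGB-D frame with the nearest mmWave and IMU samples.

    All inputs are 1-D arrays of timestamps in seconds; the RGB-D stream is
    used as the reference clock. This mirrors the frame-level synchronization
    described in the abstract only in spirit (hypothetical layout, not the
    dataset's actual format).
    """
    pairs = []
    for i, t in enumerate(rgbd_ts):
        pairs.append({
            "rgbd_idx": i,
            "mmwave_idx": nearest_index(mmwave_ts, t),
            "imu_idx": nearest_index(imu_ts, t),
        })
    return pairs


if __name__ == "__main__":
    # Toy usage with synthetic timestamps (mmWave ~20 Hz, RGB-D ~30 Hz, IMU ~100 Hz).
    rgbd_ts = np.arange(0.0, 1.0, 1 / 30)
    mmwave_ts = np.arange(0.0, 1.0, 1 / 20)
    imu_ts = np.arange(0.0, 1.0, 1 / 100)
    sync = synchronize(mmwave_ts, rgbd_ts, imu_ts)
    print(sync[0])  # e.g. {'rgbd_idx': 0, 'mmwave_idx': 0, 'imu_idx': 0}
```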
Related papers
- Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and across sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
- fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction [50.534007259536715]
We present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4768 3D objects.
We propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals.
arXiv Detail & Related papers (2024-09-17T16:13:59Z)
- MinD-3D: Reconstruct High-quality 3D objects in Human Brain [50.534007259536715]
Recon3DMind is an innovative task aimed at reconstructing 3D visuals from Functional Magnetic Resonance Imaging (fMRI) signals.
We present the fMRI-Shape dataset, which includes data from 14 participants and features 360-degree videos of 3D objects.
We propose MinD-3D, a novel and effective three-stage framework specifically designed to decode the brain's 3D visual information from fMRI signals.
arXiv Detail & Related papers (2023-12-12T18:21:36Z)
- Aria-NeRF: Multimodal Egocentric View Synthesis [17.0554791846124]
We seek to accelerate research in developing rich, multimodal scene models trained from egocentric data, based on differentiable volumetric ray-tracing inspired by Neural Radiance Fields (NeRFs).
This dataset offers a comprehensive collection of sensory data, featuring RGB images, eye-tracking camera footage, audio recordings from a microphone, atmospheric pressure readings from a barometer, positional coordinates from GPS, and information from dual-frequency IMU datasets (1kHz and 800Hz).
The diverse data modalities and the real-world context captured within this dataset serve as a robust foundation for furthering our understanding of human behavior and enabling more immersive and intelligent experiences in
arXiv Detail & Related papers (2023-11-11T01:56:35Z)
- Multisensory extended reality applications offer benefits for volumetric biomedical image analysis in research and medicine [2.46537907738351]
3D data from high-resolution volumetric imaging is a central resource for diagnosis and treatment in modern medicine.
Recent research used extended reality (XR) for perceiving 3D images with visual depth perception and touch but used restrictive haptic devices.
In this study, 24 experts in biomedical imaging from research and medicine explored 3D medical shapes with three applications.
arXiv Detail & Related papers (2023-11-07T13:37:47Z)
- MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing [45.29593826502026]
MM-Fi is the first multi-modal non-intrusive 4D human dataset with 27 daily or rehabilitation action categories.
MM-Fi consists of over 320k synchronized frames of five modalities from 40 human subjects.
arXiv Detail & Related papers (2023-05-12T05:18:52Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable autonomous vehicles to perceive and understand the subtle and complex behaviors of pedestrians.
Our method efficiently makes use of these complementary signals, in a semi-supervised fashion, and outperforms existing methods by a large margin.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
arXiv Detail & Related papers (2022-12-15T11:15:14Z)
- HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling [83.57675975092496]
HuMMan is a large-scale multi-modal 4D human dataset with 1000 human subjects, 400k sequences and 60M frames.
HuMMan has several appealing properties: 1) multi-modal data and annotations including color images, point clouds, keypoints, SMPL parameters, and textured meshes.
arXiv Detail & Related papers (2022-04-28T17:54:25Z)
- CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors [1.0742675209112622]
CZU-MHAD (Changzhou University: a comprehensive multi-modal human action dataset) consists of 22 actions and temporally synchronized data from three modalities.
These modalities include depth videos and skeleton positions from a Kinect V2 camera, and inertial signals from 10 wearable sensors.
arXiv Detail & Related papers (2022-02-07T15:17:08Z)