MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless
Sensing
- URL: http://arxiv.org/abs/2305.10345v2
- Date: Mon, 25 Sep 2023 02:47:48 GMT
- Title: MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless
Sensing
- Authors: Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu,
Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, Lihua Xie
- Abstract summary: MM-Fi is the first multi-modal non-intrusive 4D human dataset with 27 daily or rehabilitation action categories.
MM-Fi consists of over 320k synchronized frames of five modalities from 40 human subjects.
- Score: 45.29593826502026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 4D human perception plays an essential role in a myriad of applications, such
as home automation and metaverse avatar simulation. However, existing solutions,
which mainly rely on cameras and wearable devices, are either privacy-intrusive
or inconvenient to use. To address these issues, wireless sensing has emerged
as a promising alternative, leveraging LiDAR, mmWave radar, and WiFi signals
for device-free human sensing. In this paper, we propose MM-Fi, the first
multi-modal non-intrusive 4D human dataset with 27 daily or rehabilitation
action categories, to bridge the gap between wireless sensing and high-level
human perception tasks. MM-Fi consists of over 320k synchronized frames of five
modalities from 40 human subjects. Various annotations are provided to support
potential sensing tasks, e.g., human pose estimation and action recognition.
Extensive experiments have been conducted to compare the sensing capability of
individual modalities and modality combinations across multiple tasks. We envision that MM-Fi
can contribute to wireless sensing research with respect to action recognition,
human pose estimation, multi-modal learning, cross-modal supervision, and
interdisciplinary healthcare research.
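
To make the dataset description above concrete, here is a minimal, hypothetical Python sketch of what one synchronized multi-modal sample with pose and action annotations might look like. The modality breakdown (RGB, depth, LiDAR, mmWave radar, WiFi CSI), array shapes, joint count, and all names are illustrative assumptions based only on the sensors mentioned in the abstract; this is not the official MM-Fi data format or API.

```python
# Hypothetical sketch (not the official MM-Fi API): one synchronized
# multi-modal frame with pose/action annotations, plus synthetic data
# to exercise the structure.
from dataclasses import dataclass
import numpy as np

@dataclass
class MultiModalFrame:
    """One synchronized frame; modality names and shapes are assumptions."""
    subject_id: int              # 1..40 subjects (per the abstract)
    action_id: int               # 1..27 daily/rehabilitation actions
    rgb: np.ndarray              # (H, W, 3) camera image
    depth: np.ndarray            # (H, W) depth map
    lidar_points: np.ndarray     # (N, 3) LiDAR point cloud
    mmwave_points: np.ndarray    # (M, 5) radar points, e.g. xyz + Doppler + intensity
    wifi_csi: np.ndarray         # (antennas, subcarriers, packets) CSI amplitudes
    keypoints_3d: np.ndarray     # (17, 3) 3D pose annotation (joint count assumed)

def random_frame(rng: np.random.Generator) -> MultiModalFrame:
    """Generate a synthetic frame just to illustrate the data structure."""
    return MultiModalFrame(
        subject_id=int(rng.integers(1, 41)),
        action_id=int(rng.integers(1, 28)),
        rgb=rng.random((480, 640, 3), dtype=np.float32),
        depth=rng.random((480, 640), dtype=np.float32),
        lidar_points=rng.random((1024, 3), dtype=np.float32),
        mmwave_points=rng.random((128, 5), dtype=np.float32),
        wifi_csi=rng.random((3, 114, 10), dtype=np.float32),
        keypoints_3d=rng.random((17, 3), dtype=np.float32),
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = [random_frame(rng) for _ in range(4)]
    # Cross-modal supervision idea from the abstract: pose annotations
    # can supervise a model that consumes only the wireless modalities.
    for f in frames:
        wireless_input = (f.lidar_points, f.mmwave_points, f.wifi_csi)
        target = f.keypoints_3d
        print(f.subject_id, f.action_id, target.shape, wireless_input[2].shape)
```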
Related papers
- X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing [14.549639729808717]
Current human sensing primarily depends on cameras and LiDAR, each of which has its own strengths and limitations.
Existing multi-modal fusion solutions are typically designed for fixed modality combinations.
We propose X-Fi, a modality-invariant foundation model covering all modalities, to address this issue.
arXiv Detail & Related papers (2024-10-14T05:23:12Z)
- Physical-Layer Semantic-Aware Network for Zero-Shot Wireless Sensing [74.12670841657038]
Device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications.
Data heterogeneity in wireless signals and data privacy regulations for distributed sensing are considered the major challenges hindering wide application of wireless sensing in large-area networking systems.
We propose a novel zero-shot wireless sensing solution that allows models constructed in one or a limited number of locations to be directly transferred to other locations without any labeled data.
arXiv Detail & Related papers (2023-12-08T13:50:30Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors [6.955796938573367]
We present mRI, a multi-modal 3D human pose estimation dataset with mmWave, RGB-D, and Inertial Sensors.
Our dataset consists of over 160k synchronized frames from 20 subjects performing rehabilitation exercises.
arXiv Detail & Related papers (2022-10-15T23:08:44Z)
- Cross Vision-RF Gait Re-identification with Low-cost RGB-D Cameras and mmWave Radars [15.662787088335618]
This work studies the problem of cross-modal human re-identification (ReID).
We propose the first-of-its-kind vision-RF system for simultaneous cross-modal multi-person ReID.
Our proposed system achieves 92.5% top-1 accuracy and 97.5% top-5 accuracy on a group of 56 volunteers.
arXiv Detail & Related papers (2022-07-16T10:34:25Z)
- HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling [83.57675975092496]
HuMMan is a large-scale multi-modal 4D human dataset with 1000 human subjects, 400k sequences and 60M frames.
HuMMan has several appealing properties: 1) multi-modal data and annotations including color images, point clouds, keypoints, SMPL parameters, and textured meshes.
arXiv Detail & Related papers (2022-04-28T17:54:25Z)
- Domain and Modality Gaps for LiDAR-based Person Detection on Mobile Robots [91.01747068273666]
This paper studies existing LiDAR-based person detectors with a particular focus on mobile robot scenarios.
Experiments revolve around the domain gap between driving and mobile robot scenarios, as well as the modality gap between 3D and 2D LiDAR sensors.
Results provide practical insights into LiDAR-based person detection and facilitate informed decisions for relevant mobile robot designs and applications.
arXiv Detail & Related papers (2021-06-21T16:35:49Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable sensors as teacher modalities and RGB videos as the student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.