Diffusion Model is a Good Pose Estimator from 3D RF-Vision
- URL: http://arxiv.org/abs/2403.16198v2
- Date: Mon, 22 Jul 2024 03:27:30 GMT
- Title: Diffusion Model is a Good Pose Estimator from 3D RF-Vision
- Authors: Junqiao Fan, Jianfei Yang, Yuecong Xu, Lihua Xie
- Abstract summary: Human pose estimation (HPE) from Radio Frequency vision (RF-vision) performs human sensing using RF signals.
mmWave radar has emerged as a promising RF-vision sensor, providing radar point clouds by processing RF signals.
This work proposes mmDiff, a novel diffusion-based pose estimator tailored for noisy radar data.
- Score: 32.72703340013302
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human pose estimation (HPE) from Radio Frequency vision (RF-vision) performs human sensing using RF signals that penetrate obstacles without revealing privacy (e.g., facial information). Recently, mmWave radar has emerged as a promising RF-vision sensor, providing radar point clouds by processing RF signals. However, the mmWave radar has a limited resolution with severe noise, leading to inaccurate and inconsistent human pose estimation. This work proposes mmDiff, a novel diffusion-based pose estimator tailored for noisy radar data. Our approach aims to provide reliable guidance as conditions to diffusion models. Two key challenges are addressed by mmDiff: (1) miss-detection of parts of human bodies, which is addressed by a module that isolates feature extraction from different body parts, and (2) signal inconsistency due to environmental interference, which is tackled by incorporating prior knowledge of body structure and motion. Several modules are designed to achieve these goals, whose features work as the conditions for the subsequent diffusion model, eliminating the miss-detection and instability of HPE based on RF-vision. Extensive experiments demonstrate that mmDiff outperforms existing methods significantly, achieving state-of-the-art performances on public datasets.
Related papers
- UniBEVFusion: Unified Radar-Vision BEVFusion for 3D Object Detection [2.123197540438989]
Many radar-vision fusion models treat radar as a sparse LiDAR, underutilizing radar-specific information.
We propose the Radar Depth Lift-Splat-Shoot (RDL) module, which integrates radar-specific data into the depth prediction process.
We also introduce a Unified Feature Fusion (UFF) approach that extracts BEV features across different modalities.
arXiv Detail & Related papers (2024-09-23T06:57:27Z)
- RF Challenge: The Data-Driven Radio Frequency Signal Separation Challenge [66.33067693672696]
This paper addresses the critical problem of interference rejection in radio-frequency (RF) signals using a novel, data-driven approach.
First, we present an insightful signal model that serves as a foundation for developing and analyzing interference rejection algorithms.
Second, we introduce the RF Challenge, a publicly available dataset featuring diverse RF signals along with code templates.
Third, we propose novel AI-based rejection algorithms, specifically architectures like UNet and WaveNet, and evaluate their performance across eight different signal mixture types.
arXiv Detail & Related papers (2024-09-13T13:53:41Z)
- Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model [4.269423698485249]
This paper proposes a novel approach to dense and accurate mmWave radar point cloud construction via cross-modal learning.
Specifically, we introduce diffusion models, which possess state-of-the-art performance in generative modeling, to predict LiDAR-like point clouds from paired raw radar data.
We validate the proposed method through extensive benchmark comparisons and real-world experiments, demonstrating its superior performance and generalization ability.
arXiv Detail & Related papers (2024-03-13T12:20:20Z)
- Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects [69.39099029406248]
We propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network.
JAFFNet mainly incorporates a joint attention-guided feature fusion (JAFF) module into decoding stages to adaptively fuse low-level and high-level features.
Experiments conducted on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance in comparison with other state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T08:10:16Z)
- Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely used scene representations for visual perception in Autonomous Vehicles (AVs).
Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV.
Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
- Physical-Layer Semantic-Aware Network for Zero-Shot Wireless Sensing [74.12670841657038]
Device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications.
Data heterogeneity in wireless signals and data privacy regulation of distributed sensing are considered the major challenges hindering the wide application of wireless sensing in large-area networking systems.
We propose a novel zero-shot wireless sensing solution that allows models constructed in one or a limited number of locations to be directly transferred to other locations without any labeled data.
arXiv Detail & Related papers (2023-12-08T13:50:30Z)
- Denoising Diffusion Models for Plug-and-Play Image Restoration [135.6359475784627]
This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework.
Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models.
arXiv Detail & Related papers (2023-05-15T20:24:38Z)
- NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [107.67277084886929]
Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input.
We propose NerfDiff, which addresses this issue by distilling the knowledge of a 3D-aware conditional diffusion model (CDM) into NeRF through synthesizing and refining a set of virtual views at test time.
We further propose a novel NeRF-guided distillation algorithm that simultaneously generates 3D consistent virtual views from the CDM samples, and finetunes the NeRF based on the improved virtual views.
arXiv Detail & Related papers (2023-02-20T17:12:00Z)
- HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar [30.51398364813315]
This paper introduces a novel human pose estimation benchmark, Human Pose with Millimeter Wave Radar (HuPR).
This dataset is created using cross-calibrated mmWave radar sensors and a monocular RGB camera for cross-modality training of radar-based human pose estimation.
arXiv Detail & Related papers (2022-10-22T22:28:40Z)
- RFMask: A Simple Baseline for Human Silhouette Segmentation with Radio Signals [9.663978351279422]
We propose to use radio signals, which can traverse obstacles and are unaffected by lighting conditions, to achieve silhouette segmentation.
The proposed RFMask framework is composed of three modules.
We collect a dataset containing 804,760 radio frames and 402,380 camera frames with human activities under various scenes.
arXiv Detail & Related papers (2022-01-25T08:43:01Z)
- MDPose: Human Skeletal Motion Reconstruction Using WiFi Micro-Doppler Signatures [4.92674421365689]
We propose MDPose, a novel framework for human skeletal motion reconstruction based on WiFi micro-Doppler signatures.
It provides an effective solution to track human activities by reconstructing a skeleton model with 17 key points.
MDPose outperforms state-of-the-art RF-based pose estimation systems.
arXiv Detail & Related papers (2022-01-11T21:46:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.