ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in
All Weather Conditions
- URL: http://arxiv.org/abs/2210.01346v3
- Date: Wed, 20 Sep 2023 05:01:45 GMT
- Title: ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in
All Weather Conditions
- Authors: Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng
Chen, Jiming Chen, Yuchi Huo, Qi Ye
- Abstract summary: We present ImmFusion, the first mmWave-RGB fusion solution to reconstruct 3D human bodies robustly.
Our method's accuracy is significantly superior to that of state-of-the-art Transformer-based LiDAR-camera fusion methods.
- Score: 23.146325482439988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D human reconstruction from RGB images achieves decent results in good
weather conditions but degrades dramatically in rough weather. Complementarily,
mmWave radars have been employed to reconstruct 3D human joints and meshes in
rough weather. However, combining RGB and mmWave signals for robust all-weather
3D human reconstruction is still an open challenge, given the sparse nature of
mmWave and the vulnerability of RGB images. In this paper, we present
ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D
human bodies in all weather conditions. Specifically, ImmFusion consists of
image and point backbones for token feature extraction and a Transformer module
for token fusion. The image and point backbones refine global and local
features from the raw data, and the Fusion Transformer module fuses
information from the two modalities by dynamically selecting informative
tokens. Extensive experiments on a large-scale dataset, mmBody,
captured in various environments demonstrate that ImmFusion can efficiently
exploit information from both modalities to achieve robust 3D human body
reconstruction in all weather conditions. In addition, our method's accuracy is
significantly superior to that of state-of-the-art Transformer-based
LiDAR-camera fusion methods.
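The abstract describes the architecture only at a high level: image and point backbones tokenize each modality, and a Fusion Transformer dynamically selects informative tokens before regressing the body. Below is a minimal PyTorch-style sketch of that token-selection fusion idea; the module names, dimensions, and the hard top-k selection are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of ImmFusion-style mmWave-RGB token fusion.
# Module names, dimensions, and the top-k selection are assumptions
# for illustration; the actual architecture may differ.
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    def __init__(self, dim=256, heads=8, layers=4, keep_tokens=128, num_joints=22):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=layers)
        self.token_score = nn.Linear(dim, 1)  # scores token informativeness
        self.keep_tokens = keep_tokens
        self.joint_head = nn.Linear(dim, num_joints * 3)

    def forward(self, img_tokens, pc_tokens):
        # img_tokens: (B, N_img, dim) from an image backbone (e.g. CNN/ViT)
        # pc_tokens:  (B, N_pc, dim) from a point backbone (e.g. PointNet-like)
        tokens = torch.cat([img_tokens, pc_tokens], dim=1)
        # Dynamically keep the most informative tokens across both modalities,
        # so a degraded modality (fogged image, sparse radar) contributes less.
        scores = self.token_score(tokens).squeeze(-1)        # (B, N)
        topk = scores.topk(self.keep_tokens, dim=1).indices  # (B, K)
        picked = torch.gather(
            tokens, 1, topk.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        fused = self.encoder(picked)                         # (B, K, dim)
        # Pool fused tokens and regress 3D joints (mesh vertices analogous).
        joints = self.joint_head(fused.mean(dim=1))
        return joints.view(-1, joints.size(-1) // 3, 3)

# Usage with dummy features:
model = FusionTransformer()
img = torch.randn(2, 196, 256)  # e.g. 14x14 image patch tokens
pc = torch.randn(2, 64, 256)    # sparse mmWave point tokens
print(model(img, pc).shape)     # torch.Size([2, 22, 3])
```

In this sketch, scoring tokens before the encoder lets a corrupted modality contribute fewer tokens to the fused representation, which is one plausible reading of the paper's dynamic token selection.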
Related papers
- FOF-X: Towards Real-time Detailed Human Reconstruction from a Single Image [68.84221452621674]
We introduce FOF-X for real-time reconstruction of detailed human geometry from a single image.
FOF-X avoids the performance degradation caused by texture and lighting.
We enhance the inter-conversion algorithms between FOF and mesh representations with a Laplacian constraint and an automaton-based discontinuity matcher.
arXiv Detail & Related papers (2024-12-08T14:46:29Z)
- Towards Weather-Robust 3D Human Body Reconstruction: Millimeter-Wave Radar-Based Dataset, Benchmark, and Multi-Modal Fusion [13.082760040398147]
3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather.
mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather.
We design ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D human bodies in various weather conditions.
arXiv Detail & Related papers (2024-09-07T15:06:30Z)
- FlatFusion: Delving into Details of Sparse Transformer-based Camera-LiDAR Fusion for Autonomous Driving [63.96049803915402]
Integrating data from diverse sensor modalities is a prevalent approach in autonomous driving.
Recent advances in efficient point cloud Transformers have demonstrated the effectiveness of integrating information in sparse formats.
In this paper, we conduct a comprehensive exploration of design choices for Transformer-based sparse camera-LiDAR fusion.
arXiv Detail & Related papers (2024-08-13T11:46:32Z)
- Attentive Multimodal Fusion for Optical and Scene Flow [24.08052492109655]
Existing methods typically rely solely on RGB images or fuse the modalities at later stages.
We propose a novel deep neural network approach named FusionRAFT, which enables early-stage information fusion between sensor modalities.
Our approach exhibits improved robustness in the presence of noise and low-lighting conditions that affect the RGB images.
arXiv Detail & Related papers (2023-07-28T04:36:07Z)
- mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar [10.610455816814985]
Millimeter Wave (mmWave) Radar is gaining popularity as it can work in adverse environments like smoke, rain, snow, poor lighting, etc.
Prior work has explored the possibility of reconstructing 3D skeletons or meshes from the noisy and sparse mmWave Radar signals.
This dataset consists of synchronized and calibrated mmWave radar point clouds and RGB(D) images in different scenes, along with skeleton/mesh annotations for the humans in the scenes.
arXiv Detail & Related papers (2022-09-12T08:00:31Z)
- Mirror Complementary Transformer Network for RGB-thermal Salient Object Detection [16.64781797503128]
RGB-thermal salient object detection (RGB-T SOD) aims to locate the common prominent objects of an aligned visible and thermal infrared image pair.
In this paper, we propose a novel mirror complementary Transformer network (MCNet) for RGB-T SOD.
Experiments on benchmark datasets and the VT723 dataset show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-07T20:26:09Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors [52.38220261632204]
Flat facial surfaces frequently occur in the PIFu-based reconstruction results.
We propose a two-scale PIFu representation to enhance the quality of the reconstructed facial details.
Experiments demonstrate the effectiveness of our approach in vivid facial details and deforming body shapes.
arXiv Detail & Related papers (2021-12-03T18:46:49Z)
- Transformer-based Network for RGB-D Saliency Detection [82.6665619584628]
Key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities.
We show that the Transformer is a uniform operation that is highly effective for both feature fusion and feature enhancement.
Our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
arXiv Detail & Related papers (2021-12-01T15:53:58Z)
- VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.