ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in
All Weather Conditions
- URL: http://arxiv.org/abs/2210.01346v3
- Date: Wed, 20 Sep 2023 05:01:45 GMT
- Title: ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in
All Weather Conditions
- Authors: Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng
Chen, Jiming Chen, Yuchi Huo, Qi Ye
- Abstract summary: We present ImmFusion, the first mmWave-RGB fusion solution to reconstruct 3D human bodies robustly.
Our method's accuracy is significantly superior to that of state-of-the-art Transformer-based LiDAR-camera fusion methods.
- Score: 23.146325482439988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D human reconstruction from RGB images achieves decent results in good
weather conditions but degrades dramatically in rough weather. Complementarily,
mmWave radars have been employed to reconstruct 3D human joints and meshes in
rough weather. However, combining RGB and mmWave signals for robust all-weather
3D human reconstruction is still an open challenge, given the sparse nature of
mmWave and the vulnerability of RGB images. In this paper, we present
ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D
human bodies in all weather conditions. Specifically, ImmFusion consists of
image and point backbones for token feature extraction and a Transformer module
for token fusion. The image and point backbones refine global and local
features from the raw data, and the Fusion Transformer module fuses
information from the two modalities by dynamically selecting informative
tokens. Extensive experiments on a large-scale dataset, mmBody,
captured in various environments demonstrate that ImmFusion can efficiently
exploit information from both modalities to achieve robust 3D human body
reconstruction in all weather conditions. In addition, our method's accuracy is
significantly superior to that of state-of-the-art Transformer-based
LiDAR-camera fusion methods.
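The abstract describes the architecture only at a high level: image and point backbones tokenize each modality, and a Fusion Transformer dynamically selects informative tokens before regressing the body. Below is a minimal PyTorch-style sketch of that token-selection fusion idea; the module names, dimensions, and the hard top-k selection are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of ImmFusion-style mmWave-RGB token fusion.
# Module names, dimensions, and the top-k selection are assumptions
# for illustration; the actual architecture may differ.
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    def __init__(self, dim=256, heads=8, layers=4, keep_tokens=128, num_joints=22):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=layers)
        self.token_score = nn.Linear(dim, 1)  # scores token informativeness
        self.keep_tokens = keep_tokens
        self.joint_head = nn.Linear(dim, num_joints * 3)

    def forward(self, img_tokens, pc_tokens):
        # img_tokens: (B, N_img, dim) from an image backbone (e.g. CNN/ViT)
        # pc_tokens:  (B, N_pc, dim) from a point backbone (e.g. PointNet-like)
        tokens = torch.cat([img_tokens, pc_tokens], dim=1)
        # Dynamically keep the most informative tokens across both modalities,
        # so a degraded modality (fogged image, sparse radar) contributes less.
        scores = self.token_score(tokens).squeeze(-1)        # (B, N)
        topk = scores.topk(self.keep_tokens, dim=1).indices  # (B, K)
        picked = torch.gather(
            tokens, 1, topk.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        fused = self.encoder(picked)                         # (B, K, dim)
        # Pool fused tokens and regress 3D joints (mesh vertices analogous).
        joints = self.joint_head(fused.mean(dim=1))
        return joints.view(-1, joints.size(-1) // 3, 3)

# Usage with dummy features:
model = FusionTransformer()
img = torch.randn(2, 196, 256)  # e.g. 14x14 image patch tokens
pc = torch.randn(2, 64, 256)    # sparse mmWave point tokens
print(model(img, pc).shape)     # torch.Size([2, 22, 3])
```

In this sketch, scoring tokens before the encoder lets a corrupted modality contribute fewer tokens to the fused representation, which is one plausible reading of the paper's dynamic token selection.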
Related papers
- FOF-X: Towards Real-time Detailed Human Reconstruction from a Single Image [68.84221452621674]
We introduce FOF-X for real-time reconstruction of detailed human geometry from a single image.
FOF-X avoids the performance degradation caused by texture and lighting.
We enhance the inter-conversion algorithms between FOF and mesh representations with a Laplacian constraint and an automaton-based discontinuity matcher.
arXiv Detail & Related papers (2024-12-08T14:46:29Z)
- Towards Weather-Robust 3D Human Body Reconstruction: Millimeter-Wave Radar-Based Dataset, Benchmark, and Multi-Modal Fusion [13.082760040398147]
3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather.
mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather.
We design ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D human bodies in various weather conditions.
arXiv Detail & Related papers (2024-09-07T15:06:30Z)
- FlatFusion: Delving into Details of Sparse Transformer-based Camera-LiDAR Fusion for Autonomous Driving [63.96049803915402]
Integrating data from diverse sensor modalities is a prevalent approach in autonomous driving.
Recent advances in efficient point cloud Transformers have demonstrated the effectiveness of integrating information in sparse formats.
In this paper, we conduct a comprehensive exploration of design choices for Transformer-based sparse camera-LiDAR fusion.
arXiv Detail & Related papers (2024-08-13T11:46:32Z)
- Attentive Multimodal Fusion for Optical and Scene Flow [24.08052492109655]
Existing methods typically rely solely on RGB images or fuse the modalities at later stages.
We propose a novel deep neural network approach named FusionRAFT, which enables early-stage information fusion between sensor modalities.
Our approach exhibits improved robustness in the presence of noise and low-lighting conditions that affect the RGB images.
arXiv Detail & Related papers (2023-07-28T04:36:07Z)
- mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar [10.610455816814985]
Millimeter Wave (mmWave) Radar is gaining popularity as it can work in adverse environments like smoke, rain, snow, poor lighting, etc.
Prior work has explored the possibility of reconstructing 3D skeletons or meshes from the noisy and sparse mmWave Radar signals.
This dataset consists of synchronized and calibrated mmWave radar point clouds and RGB(D) images in different scenes, along with skeleton/mesh annotations for the humans in the scenes.
arXiv Detail & Related papers (2022-09-12T08:00:31Z)
- Mirror Complementary Transformer Network for RGB-thermal Salient Object Detection [16.64781797503128]
RGB-thermal salient object detection (RGB-T SOD) aims to locate the common prominent objects of an aligned visible and thermal infrared image pair.
In this paper, we propose a novel mirror complementary Transformer network (MCNet) for RGB-T SOD.
Experiments on benchmark datasets and the VT723 dataset show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-07T20:26:09Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors [52.38220261632204]
Flat facial surfaces frequently occur in the PIFu-based reconstruction results.
We propose a two-scale PIFu representation to enhance the quality of the reconstructed facial details.
Experiments demonstrate the effectiveness of our approach in vivid facial details and deforming body shapes.
arXiv Detail & Related papers (2021-12-03T18:46:49Z)
- Transformer-based Network for RGB-D Saliency Detection [82.6665619584628]
Key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities.
We show that the Transformer is a uniform operation that is highly effective for both feature fusion and feature enhancement.
Our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
arXiv Detail & Related papers (2021-12-01T15:53:58Z)
- VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.