DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- URL: http://arxiv.org/abs/2008.08136v1
- Date: Tue, 18 Aug 2020 19:51:08 GMT
- Title: DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- Authors: Rishav, Ramy Battrawy, René Schuster, Oliver Wasenmüller and Didier Stricker
- Abstract summary: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction.
DeepLiDARFlow is a novel deep learning architecture which fuses high-level RGB and LiDAR features at multiple scales.
- Score: 10.303618438296981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full
scene reconstruction. These methods depend heavily on the quality of the RGB
images and perform poorly in regions with reflective objects, shadows, or
ill-conditioned lighting. LiDAR measurements are far less sensitive to these
conditions, but LiDAR features are generally unsuitable for matching tasks due
to their sparse nature. Hence, using both
LiDAR and RGB can potentially overcome the individual disadvantages of each
sensor by mutual improvement and yield robust features which can improve the
matching process. In this paper, we present DeepLiDARFlow, a novel deep
learning architecture which fuses high-level RGB and LiDAR features at multiple
scales in a monocular setup to predict dense scene flow. Its performance is
much better in the critical regions where image-only and LiDAR-only methods are
inaccurate. We verify DeepLiDARFlow on the established datasets KITTI and
FlyingThings3D and show strong robustness compared to several state-of-the-art
methods that use other input modalities. The code of our
paper is available at https://github.com/dfki-av/DeepLiDARFlow.
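To illustrate the general idea of fusing RGB and sparse LiDAR features at multiple scales, the following is a minimal PyTorch sketch. It is an illustration under assumptions, not the authors' released implementation (see the repository above); the module name MultiScaleFusion, the channel sizes, and the concatenate-then-project fusion step are hypothetical choices.

```python
# Hypothetical sketch of multi-scale RGB + sparse LiDAR feature fusion.
# Not the DeepLiDARFlow code; names and design choices are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=2):
    """3x3 conv + LeakyReLU; stride 2 halves the resolution per scale."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
        nn.LeakyReLU(0.1, inplace=True),
    )


class MultiScaleFusion(nn.Module):
    """Encode RGB and sparse depth separately, then fuse at every scale."""

    def __init__(self, channels=(16, 32, 64)):
        super().__init__()
        self.rgb_encoder = nn.ModuleList()
        self.lidar_encoder = nn.ModuleList()
        self.fusion = nn.ModuleList()
        in_rgb, in_lidar = 3, 2  # RGB image; sparse depth + validity mask
        for ch in channels:
            self.rgb_encoder.append(conv_block(in_rgb, ch))
            self.lidar_encoder.append(conv_block(in_lidar, ch))
            # Fuse by concatenation followed by a 1x1 projection.
            self.fusion.append(nn.Conv2d(2 * ch, ch, kernel_size=1))
            in_rgb, in_lidar = ch, ch

    def forward(self, rgb, sparse_depth):
        # The validity mask marks pixels that carry a LiDAR measurement.
        mask = (sparse_depth > 0).float()
        x_rgb, x_lidar = rgb, torch.cat([sparse_depth, mask], dim=1)
        fused_pyramid = []
        for enc_rgb, enc_lidar, fuse in zip(
            self.rgb_encoder, self.lidar_encoder, self.fusion
        ):
            x_rgb = enc_rgb(x_rgb)
            x_lidar = enc_lidar(x_lidar)
            fused = fuse(torch.cat([x_rgb, x_lidar], dim=1))
            fused_pyramid.append(fused)
            # Feed fused features back so both streams improve each other.
            x_rgb, x_lidar = fused, fused
        return fused_pyramid  # coarse-to-fine features for a flow decoder


if __name__ == "__main__":
    rgb = torch.rand(1, 3, 128, 256)                      # monocular image
    keep = (torch.rand(1, 1, 128, 256) > 0.95).float()    # ~5% LiDAR hits
    depth = torch.rand(1, 1, 128, 256) * keep             # sparse depth map
    pyramid = MultiScaleFusion()(rgb, depth)
    print([f.shape for f in pyramid])
```

In this sketch the sparse depth is paired with a validity mask so the LiDAR branch can distinguish measured pixels from empty ones; the fused features at each scale would then feed a scene flow decoder in place of the print statement above.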
Related papers
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
LiDAR simulation plays a crucial role in closed-loop simulation for autonomous driving.
We present LiDAR-GS, the first LiDAR Gaussian Splatting method, for real-time high-fidelity re-simulation of LiDAR sensor scans in public urban road scenes.
Our approach succeeds in simultaneously re-simulating depth, intensity, and ray-drop channels, achieving state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation [51.443788294845845]
We present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation.
We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds.
By learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving.
arXiv Detail & Related papers (2023-11-02T17:57:03Z)
- MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features [11.28654979274464]
In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data.
We introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications.
arXiv Detail & Related papers (2023-06-12T13:01:33Z)
- Photometric LiDAR and RGB-D Bundle Adjustment [3.3948742816399697]
This paper presents a novel Bundle Adjustment (BA) photometric strategy that accounts for both RGB-D and LiDAR in the same way.
In addition, we present the benefit of jointly using RGB-D and LiDAR within our unified method.
arXiv Detail & Related papers (2023-03-29T17:35:23Z)
- BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework [20.842800465250775]
Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space.
We propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the input of LiDAR data.
We empirically show that our framework surpasses the state-of-the-art methods under the normal training settings.
arXiv Detail & Related papers (2022-05-27T06:58:30Z)
- LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet meet the needs of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated by comparing it with current state-of-the-art methods on Vari dataset and a significant improvement is observed in experiments.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments [56.306567220448684]
We propose a new learning based end-to-end depth prediction network which takes noisy raw I-ToF signals as well as an RGB image.
We show more than 40% RMSE improvement on the final depth map compared to the baseline approach.
arXiv Detail & Related papers (2021-12-07T15:04:14Z)
- RGB-D Local Implicit Function for Depth Completion of Transparent Objects [43.238923881620494]
The majority of perception methods in robotics require depth information provided by RGB-D cameras.
Standard 3D sensors fail to capture depth of transparent objects due to refraction and absorption of light.
We present a novel framework that can complete missing depth given noisy RGB-D input.
arXiv Detail & Related papers (2021-04-01T17:00:04Z)
- Memory-Augmented Reinforcement Learning for Image-Goal Navigation [67.3963444878746]
We present a novel method that leverages a cross-episode memory to learn to navigate.
In order to avoid overfitting, we propose to use data augmentation on the RGB input during training.
We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.
arXiv Detail & Related papers (2021-01-13T16:30:20Z)