DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- URL: http://arxiv.org/abs/2008.08136v1
- Date: Tue, 18 Aug 2020 19:51:08 GMT
- Title: DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- Authors: Rishav, Ramy Battrawy, René Schuster, Oliver Wasenmüller and Didier Stricker
- Abstract summary: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction.
DeepLiDARFlow is a novel deep learning architecture which fuses high-level RGB and LiDAR features at multiple scales.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full
scene reconstruction. These methods depend a lot on the quality of the RGB
images and perform poorly in regions with reflective objects, shadows,
ill-conditioned light environment and so on. LiDAR measurements are much less
sensitive to the aforementioned conditions but LiDAR features are in general
unsuitable for matching tasks due to their sparse nature. Hence, using both
LiDAR and RGB can potentially overcome the individual disadvantages of each
sensor by mutual improvement and yield robust features which can improve the
matching process. In this paper, we present DeepLiDARFlow, a novel deep
learning architecture that fuses high-level RGB and LiDAR features at multiple
scales in a monocular setup to predict dense scene flow. Its performance is
much better in the critical regions where image-only and LiDAR-only methods are
inaccurate. We verify DeepLiDARFlow on the established datasets KITTI
and FlyingThings3D and show strong robustness compared to several
state-of-the-art methods that use other input modalities. The code of our
paper is available at https://github.com/dfki-av/DeepLiDARFlow.
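The multi-scale fusion idea in the abstract can be illustrated with a minimal, hypothetical sketch (not the paper's actual architecture, which uses learned CNN features): a sparse LiDAR depth map and its validity mask are downsampled in a sparsity-aware way, so that empty pixels do not dilute the average, and then concatenated with RGB features at each scale. All function names and shapes below are illustrative assumptions.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a dense (H, W, C) feature map by an integer factor."""
    h, w, c = x.shape
    return x.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def sparse_downsample(depth, mask, factor):
    """Downsample a sparse depth map, averaging only over valid pixels."""
    h, w = depth.shape
    d = (depth * mask).reshape(h // factor, factor, w // factor, factor).sum(axis=(1, 3))
    m = mask.reshape(h // factor, factor, w // factor, factor).sum(axis=(1, 3))
    valid = m > 0
    out = np.zeros_like(d)
    out[valid] = d[valid] / m[valid]  # mean over valid LiDAR hits only
    return out, valid.astype(float)

def fuse_multiscale(rgb_feat, depth, mask, scales=(1, 2, 4)):
    """Concatenate RGB features with sparsity-aware LiDAR depth at each scale."""
    fused = []
    for s in scales:
        f = downsample(rgb_feat, s) if s > 1 else rgb_feat
        d, m = sparse_downsample(depth, mask, s) if s > 1 else (depth, mask)
        fused.append(np.concatenate([f, d[..., None], m[..., None]], axis=-1))
    return fused
```

Carrying the validity mask alongside the depth channel is one simple way to let a downstream network distinguish "no LiDAR return" from "depth zero", which matters precisely in the sparse regions the abstract discusses.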
Related papers
- DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications.
This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors.
Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
- Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB [12.38882701862349]
3D surface reconstruction is essential across applications of virtual reality, robotics, and mobile scanning.
RGB-based reconstruction often fails in low-texture, low-light, and low-albedo scenes.
We propose using an alternative class of "blurred" LiDAR that emits a diffuse flash.
arXiv Detail & Related papers (2024-11-29T05:01:23Z)
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
LiDAR simulation plays a crucial role in closed-loop simulation for autonomous driving.
We present LiDAR-GS, the first LiDAR Gaussian Splatting method, for real-time high-fidelity re-simulation of LiDAR sensor scans in public urban road scenes.
Our approach succeeds in simultaneously re-simulating depth, intensity, and ray-drop channels, achieving state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation [51.443788294845845]
We present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation.
We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds.
By learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving.
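The discrete-codebook idea in this summary can be illustrated with a minimal, hypothetical vector-quantization step (not UltraLiDAR's actual model): each feature vector is replaced by its nearest entry in a learned codebook, giving the compact discrete representation over which a generative prior can be learned. The function and variable names here are illustrative assumptions.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index and value of its nearest
    codebook entry under L2 distance (basic vector quantization)."""
    # Pairwise distances: (num_features, codebook_size)
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    idx = dists.argmin(axis=1)
    return idx, codebook[idx]
```

After quantization, the scene is described by a short sequence of integer indices, which is what makes learning a prior for diverse generation tractable.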
arXiv Detail & Related papers (2023-11-02T17:57:03Z)
- MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features [11.28654979274464]
In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data.
We introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications.
arXiv Detail & Related papers (2023-06-12T13:01:33Z)
- Photometric LiDAR and RGB-D Bundle Adjustment [3.3948742816399697]
This paper presents a novel Bundle Adjustment (BA) photometric strategy that accounts for both RGB-D and LiDAR in the same way.
In addition, we present the benefit of jointly using RGB-D and LiDAR within our unified method.
arXiv Detail & Related papers (2023-03-29T17:35:23Z)
- DARF: Depth-Aware Generalizable Neural Radiance Field [51.29437249009986]
We propose the Depth-Aware Generalizable Neural Radiance Field (DARF) with a Depth-Aware Dynamic Sampling (DADS) strategy.
Our framework infers the unseen scenes on both pixel level and geometry level with only a few input images.
Compared with state-of-the-art generalizable NeRF methods, DARF reduces samples by 50%, while improving rendering quality and depth estimation.
arXiv Detail & Related papers (2022-12-05T14:00:59Z)
- BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework [20.842800465250775]
Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space.
We propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the input of LiDAR data.
We empirically show that our framework surpasses the state-of-the-art methods under the normal training settings.
arXiv Detail & Related papers (2022-05-27T06:58:30Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated by comparing it with current state-of-the-art methods on Vari dataset and a significant improvement is observed in experiments.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Memory-Augmented Reinforcement Learning for Image-Goal Navigation [67.3963444878746]
We present a novel method that leverages a cross-episode memory to learn to navigate.
In order to avoid overfitting, we propose to use data augmentation on the RGB input during training.
We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.
arXiv Detail & Related papers (2021-01-13T16:30:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.