DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- URL: http://arxiv.org/abs/2008.08136v1
- Date: Tue, 18 Aug 2020 19:51:08 GMT
- Title: DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR
- Authors: Rishav, Ramy Battrawy, René Schuster, Oliver Wasenmüller and Didier Stricker
- Abstract summary: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction.
DeepLiDARFlow is a novel deep learning architecture which fuses high-level RGB and LiDAR features at multiple scales.
- Score: 10.303618438296981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full
scene reconstruction. These methods depend heavily on the quality of the RGB
images and perform poorly in regions with reflective objects, shadows, or
ill-conditioned lighting. LiDAR measurements are far less sensitive to these
conditions, but LiDAR features are generally unsuitable for matching tasks due
to their sparse nature. Hence, using both
LiDAR and RGB can potentially overcome the individual disadvantages of each
sensor by mutual improvement and yield robust features which can improve the
matching process. In this paper, we present DeepLiDARFlow, a novel deep
learning architecture which fuses high-level RGB and LiDAR features at multiple
scales in a monocular setup to predict dense scene flow. Its performance is
much better in the critical regions where image-only and LiDAR-only methods are
inaccurate. We verify DeepLiDARFlow on the established datasets KITTI and
FlyingThings3D and show strong robustness compared to several state-of-the-art
methods that use other input modalities. The code of our
paper is available at https://github.com/dfki-av/DeepLiDARFlow.
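To illustrate the general idea of fusing RGB and sparse LiDAR features at multiple scales, the following is a minimal PyTorch sketch. It is an illustration under assumptions, not the authors' released implementation (see the repository above); the module name MultiScaleFusion, the channel sizes, and the concatenate-then-project fusion step are hypothetical choices.

```python
# Hypothetical sketch of multi-scale RGB + sparse LiDAR feature fusion.
# Not the DeepLiDARFlow code; names and design choices are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=2):
    """3x3 conv + LeakyReLU; stride 2 halves the resolution per scale."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
        nn.LeakyReLU(0.1, inplace=True),
    )


class MultiScaleFusion(nn.Module):
    """Encode RGB and sparse depth separately, then fuse at every scale."""

    def __init__(self, channels=(16, 32, 64)):
        super().__init__()
        self.rgb_encoder = nn.ModuleList()
        self.lidar_encoder = nn.ModuleList()
        self.fusion = nn.ModuleList()
        in_rgb, in_lidar = 3, 2  # RGB image; sparse depth + validity mask
        for ch in channels:
            self.rgb_encoder.append(conv_block(in_rgb, ch))
            self.lidar_encoder.append(conv_block(in_lidar, ch))
            # Fuse by concatenation followed by a 1x1 projection.
            self.fusion.append(nn.Conv2d(2 * ch, ch, kernel_size=1))
            in_rgb, in_lidar = ch, ch

    def forward(self, rgb, sparse_depth):
        # The validity mask marks pixels that carry a LiDAR measurement.
        mask = (sparse_depth > 0).float()
        x_rgb, x_lidar = rgb, torch.cat([sparse_depth, mask], dim=1)
        fused_pyramid = []
        for enc_rgb, enc_lidar, fuse in zip(
            self.rgb_encoder, self.lidar_encoder, self.fusion
        ):
            x_rgb = enc_rgb(x_rgb)
            x_lidar = enc_lidar(x_lidar)
            fused = fuse(torch.cat([x_rgb, x_lidar], dim=1))
            fused_pyramid.append(fused)
            # Feed fused features back so both streams improve each other.
            x_rgb, x_lidar = fused, fused
        return fused_pyramid  # coarse-to-fine features for a flow decoder


if __name__ == "__main__":
    rgb = torch.rand(1, 3, 128, 256)                      # monocular image
    keep = (torch.rand(1, 1, 128, 256) > 0.95).float()    # ~5% LiDAR hits
    depth = torch.rand(1, 1, 128, 256) * keep             # sparse depth map
    pyramid = MultiScaleFusion()(rgb, depth)
    print([f.shape for f in pyramid])
```

In this sketch the sparse depth is paired with a validity mask so the LiDAR branch can distinguish measured pixels from empty ones; the fused features at each scale would then feed a scene flow decoder in place of the print statement above.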
Related papers
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
LiDAR simulation plays a crucial role in closed-loop simulation for autonomous driving.
We present LiDAR-GS, the first LiDAR Gaussian Splatting method, for real-time high-fidelity re-simulation of LiDAR sensor scans in public urban road scenes.
Our approach succeeds in simultaneously re-simulating depth, intensity, and ray-drop channels, achieving state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation [51.443788294845845]
We present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation.
We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds.
By learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving.
arXiv Detail & Related papers (2023-11-02T17:57:03Z)
- MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features [11.28654979274464]
In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data.
We introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications.
arXiv Detail & Related papers (2023-06-12T13:01:33Z)
- Photometric LiDAR and RGB-D Bundle Adjustment [3.3948742816399697]
This paper presents a novel Bundle Adjustment (BA) photometric strategy that accounts for both RGB-D and LiDAR in the same way.
In addition, we present the benefit of jointly using RGB-D and LiDAR within our unified method.
arXiv Detail & Related papers (2023-03-29T17:35:23Z)
- BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework [20.842800465250775]
Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space.
We propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the input of LiDAR data.
We empirically show that our framework surpasses the state-of-the-art methods under the normal training settings.
arXiv Detail & Related papers (2022-05-27T06:58:30Z)
- LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet meet the needs of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated by comparing it with current state-of-the-art methods on Vari dataset and a significant improvement is observed in experiments.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments [56.306567220448684]
We propose a new learning based end-to-end depth prediction network which takes noisy raw I-ToF signals as well as an RGB image.
We show more than 40% RMSE improvement on the final depth map compared to the baseline approach.
arXiv Detail & Related papers (2021-12-07T15:04:14Z)
- RGB-D Local Implicit Function for Depth Completion of Transparent Objects [43.238923881620494]
The majority of perception methods in robotics require depth information provided by RGB-D cameras.
Standard 3D sensors fail to capture depth of transparent objects due to refraction and absorption of light.
We present a novel framework that can complete missing depth given noisy RGB-D input.
arXiv Detail & Related papers (2021-04-01T17:00:04Z)
- Memory-Augmented Reinforcement Learning for Image-Goal Navigation [67.3963444878746]
We present a novel method that leverages a cross-episode memory to learn to navigate.
In order to avoid overfitting, we propose to use data augmentation on the RGB input during training.
We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.
arXiv Detail & Related papers (2021-01-13T16:30:20Z)