CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation
- URL: http://arxiv.org/abs/2111.10502v1
- Date: Sat, 20 Nov 2021 02:58:38 GMT
- Title: CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation
- Authors: Haisong Liu, Tao Lu, Yihui Xu, Jia Liu, Wenjie Li, Lijun Chen
- Abstract summary: We study the problem of jointly estimating the optical flow and scene flow from synchronized 2D and 3D data.
To address the problem, we propose a novel end-to-end framework, called CamLiFlow.
Our method ranks 1st on the KITTI Scene Flow benchmark, outperforming the prior art with only 1/7 of the parameters.
- Score: 15.98323974821097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of jointly estimating the optical flow
and scene flow from synchronized 2D and 3D data. Previous methods either employ
a complex pipeline which splits the joint task into independent stages, or fuse
2D and 3D information in an "early-fusion" or "late-fusion" manner. Such
one-size-fits-all approaches face a dilemma: they fail either to fully
exploit the characteristics of each modality or to maximize the inter-modality
complementarity. To address the problem, we propose a novel end-to-end
framework, called CamLiFlow. It consists of 2D and 3D branches with multiple
bidirectional connections between them in specific layers. Different from
previous work, we apply a point-based 3D branch to better extract the geometric
features and design a symmetric learnable operator to fuse dense image features
and sparse point features. We also propose a transformation for point clouds to
address the non-linearity of the 3D-2D projection. Experiments show that CamLiFlow
achieves better performance with fewer parameters. Our method ranks 1st on the
KITTI Scene Flow benchmark, outperforming the prior art with only 1/7 of the parameters.
Code will be made available.
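To make the bidirectional fusion idea concrete, here is a minimal sketch of the two fusion directions in their simplest form: per-point image features gathered at projected point locations (camera to LiDAR), and sparse point features scatter-averaged back onto the image grid (LiDAR to camera). This is not the paper's learnable symmetric operator, which the abstract describes only at a high level; all function names, the pinhole projection, and the nearest-neighbor sampling are illustrative assumptions.

import numpy as np

def project_points(points, K):
    # Project N x 3 camera-frame points to 2D pixel coordinates via intrinsics K (3 x 3).
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def fuse_cam_to_lidar(img_feat, points, K):
    # Camera -> LiDAR: gather an image feature for each point at its projection
    # (nearest-neighbor here; the paper uses a learnable fusion operator instead).
    C, H, W = img_feat.shape
    uv = np.round(project_points(points, K)).astype(int)
    u = np.clip(uv[:, 0], 0, W - 1)
    v = np.clip(uv[:, 1], 0, H - 1)
    return img_feat[:, v, u].T                      # (N, C) image feature per point

def fuse_lidar_to_cam(pt_feat, points, K, H, W):
    # LiDAR -> camera: scatter sparse point features onto the dense image grid,
    # averaging where several points land on the same pixel.
    N, C = pt_feat.shape
    uv = np.round(project_points(points, K)).astype(int)
    u = np.clip(uv[:, 0], 0, W - 1)
    v = np.clip(uv[:, 1], 0, H - 1)
    flat = v * W + u                                # linear pixel index per point
    grid = np.zeros((C, H * W))
    count = np.zeros(H * W)
    np.add.at(count, flat, 1.0)
    for c in range(C):
        np.add.at(grid[c], flat, pt_feat[:, c])
    grid /= np.maximum(count, 1.0)                  # average over co-located points
    return grid.reshape(C, H, W)

# Tiny usage example with random features and a synthetic camera.
rng = np.random.default_rng(0)
K = np.array([[500.0, 0.0, 64.0], [0.0, 500.0, 48.0], [0.0, 0.0, 1.0]])
points = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 10.0], size=(256, 3))
img_feat = rng.standard_normal((16, 96, 128))
per_point = fuse_cam_to_lidar(img_feat, points, K)                 # (256, 16)
back_to_image = fuse_lidar_to_cam(per_point, points, K, 96, 128)   # (16, 96, 128)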
Related papers
- Occupancy-Based Dual Contouring [12.944046673902415]
We introduce a dual contouring method that provides state-of-the-art performance for occupancy functions.
Our method is learning-free and carefully designed to maximize the use of GPU parallelization.
arXiv Detail & Related papers (2024-09-20T11:32:21Z)
- ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds [52.03819676074455]
ParaPoint is an unsupervised neural learning pipeline for achieving global free-boundary surface parameterization.
This work makes the first attempt to investigate neural point cloud parameterization that pursues both global mappings and free boundaries.
arXiv Detail & Related papers (2024-03-15T14:35:05Z)
- Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion [21.421913505496846]
We study the problem of jointly estimating the optical flow and scene flow from synchronized 2D and 3D data.
Previous methods either employ a complex pipeline that splits the joint task into independent stages, or fuse 2D and 3D information in an "early-fusion" or "late-fusion" manner.
We propose a novel end-to-end framework, which consists of 2D and 3D branches with multiple bidirectional fusion connections between them in specific layers.
arXiv Detail & Related papers (2023-03-21T16:54:01Z)
- FFPA-Net: Efficient Feature Fusion with Projection Awareness for 3D Object Detection [19.419030878019974]
Unstructured 3D point clouds are filled into the 2D plane, and 3D point cloud features are extracted faster using projection-aware convolution layers.
The corresponding indexes between different sensor signals are established in advance during data preprocessing (see the sketch after this entry).
Two new plug-and-play fusion modules, LiCamFuse and BiLiCamFuse, are proposed.
arXiv Detail & Related papers (2022-09-15T16:13:19Z)
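The FFPA-Net entry above notes that correspondences between sensor signals are established in advance during preprocessing. Below is a hedged sketch of what such index precomputation could look like, assuming a pinhole camera with intrinsics K; the function name and validity handling are illustrative, not FFPA-Net's actual code.

import numpy as np

def precompute_projection_indices(points, K, H, W):
    # Compute, once per frame, the linear pixel index that each LiDAR point
    # projects to, so later fusion layers can gather/scatter by table lookup
    # instead of reprojecting at every layer. -1 marks points with no valid pixel.
    uvw = points @ K.T
    valid = uvw[:, 2] > 0                        # keep points in front of the camera
    uv = np.zeros((points.shape[0], 2))
    uv[valid] = uvw[valid, :2] / uvw[valid, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    return np.where(valid, v * W + u, -1)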
- What Matters for 3D Scene Flow Network [44.02710380584977]
3D scene flow estimation from point clouds is a low-level 3D motion perception task in computer vision.
We propose a novel all-to-all flow embedding layer with backward reliability validation during the initial scene flow estimation.
Our proposed model surpasses all existing methods by at least 38.2% on the FlyingThings3D dataset and 24.7% on the KITTI Scene Flow dataset on the EPE3D metric (defined in the sketch after this entry).
arXiv Detail & Related papers (2022-07-19T09:27:05Z)
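For reference, EPE3D, the metric cited in the entry above, is conventionally the mean 3D end-point error: the average Euclidean distance between predicted and ground-truth scene flow vectors. A minimal sketch:

import numpy as np

def epe3d(pred_flow, gt_flow):
    # Mean 3D end-point error: average Euclidean distance between predicted
    # and ground-truth per-point scene flow vectors (both N x 3, in meters).
    return np.linalg.norm(pred_flow - gt_flow, axis=1).mean()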
- IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment [58.8330387551499]
We formulate the problem as the estimation of point-wise trajectories (i.e., smooth curves).
We propose IDEA-Net, an end-to-end deep learning framework, which disentangles the problem under the assistance of the explicitly learned temporal consistency.
We demonstrate the effectiveness of our method on various point cloud sequences and observe large improvement over state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-03-22T10:14:08Z)
- DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection [83.18142309597984]
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.
We develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods.
arXiv Detail & Related papers (2022-03-15T18:46:06Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern (a coordinate-transform sketch follows this entry).
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
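The cylindrical partition referenced in the entry above replaces a uniform Cartesian grid with voxels indexed by radius, azimuth, and height, which better matches the density pattern of outdoor LiDAR scans. The sketch below shows only the coordinate transform and voxel indexing; the bin counts and ranges are illustrative assumptions, not the paper's settings.

import numpy as np

def cylindrical_voxel_indices(points, r_bins=480, a_bins=360, z_bins=32,
                              r_max=50.0, z_min=-3.0, z_max=1.0):
    # Map N x 3 Cartesian LiDAR points to (radius, azimuth, height) voxel indices.
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2)                 # radial distance from the sensor
    a = np.arctan2(y, x)                         # azimuth angle in [-pi, pi]
    ri = np.clip((r / r_max * r_bins).astype(int), 0, r_bins - 1)
    ai = np.clip(((a + np.pi) / (2 * np.pi) * a_bins).astype(int), 0, a_bins - 1)
    zi = np.clip(((z - z_min) / (z_max - z_min) * z_bins).astype(int), 0, z_bins - 1)
    return np.stack([ri, ai, zi], axis=1)        # (N, 3) voxel index per point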
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem [98.92148855291363]
This paper proposes a deep CNN model that simultaneously solves for both the 6-DoF absolute camera pose and 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)