BiFuse++: Self-supervised and Efficient Bi-projection Fusion for 360 Depth Estimation
- URL: http://arxiv.org/abs/2209.02952v1
- Date: Wed, 7 Sep 2022 06:24:21 GMT
- Title: BiFuse++: Self-supervised and Efficient Bi-projection Fusion for 360 Depth Estimation
- Authors: Fu-En Wang, Yu-Hsuan Yeh, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
- Abstract summary: We propose BiFuse++ to explore the combination of bi-projection fusion and the self-training scenario.
We propose a new fusion module and Contrast-Aware Photometric Loss to improve the performance of BiFuse.
- Score: 59.11106101006008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the rise of spherical cameras, monocular 360 depth estimation
has become an important technique for many applications (e.g., autonomous
systems). Accordingly, state-of-the-art frameworks for monocular 360 depth
estimation, such as the bi-projection fusion in BiFuse, have been proposed.
Training such a framework requires a large number of panoramas along with the
corresponding depth ground truth captured by laser sensors, which greatly
increases the cost of data collection. Moreover, since this data collection
procedure is time-consuming, scaling these methods to different scenes becomes
a challenge. Self-training a network for monocular depth estimation from 360
videos is one way to alleviate this issue. However, no existing framework
incorporates bi-projection fusion into the self-training scheme, which limits
self-supervised performance, since bi-projection fusion can leverage
information from different projection types. In this paper, we propose
BiFuse++ to explore the combination of bi-projection fusion and the
self-training scenario. Specifically, we propose a new fusion module and a
Contrast-Aware Photometric Loss to improve the performance of BiFuse and
increase the stability of self-training on real-world videos. We conduct both
supervised and self-supervised experiments on benchmark datasets and achieve
state-of-the-art performance.
Related papers
- Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation [6.832852988957967]
We propose a new depth estimation framework that utilizes unlabeled 360-degree data effectively.
Our approach uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels.
We tested our approach on benchmark datasets such as Matterport3D and Stanford2D3D, showing significant improvements in depth estimation accuracy.
arXiv Detail & Related papers (2024-06-18T17:59:31Z)
- Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers [39.14931758754381]
We introduce a novel fusion method that bypasses monocular depth estimation altogether.
We show that our model can modulate its use of camera features based on the availability of lidar features.
arXiv Detail & Related papers (2023-12-22T18:51:50Z)
- Robust Self-Supervised Extrinsic Self-Calibration [25.727912226753247]
Multi-camera self-supervised monocular depth estimation from videos is a promising way to reason about the environment.
We introduce a novel method for extrinsic calibration that builds upon the principles of self-supervised monocular depth and ego-motion learning.
arXiv Detail & Related papers (2023-08-04T06:20:20Z)
- EGA-Depth: Efficient Guided Attention for Self-Supervised Multi-Camera Depth Estimation [45.59727643007449]
We propose a novel guided attention architecture, EGA-Depth, which can improve the efficiency and accuracy of self-supervised multi-camera depth estimation.
For each camera, we use its perspective view as the query to cross-reference its neighboring views to derive informative features for this camera view.
arXiv Detail & Related papers (2023-04-06T20:50:28Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [116.6111047218081]
We introduce BEVFusion, a generic multi-task multi-sensor fusion framework.
It unifies multi-modal features in the shared bird's-eye view representation space.
It achieves 1.3% higher mAP and NDS on 3D object detection and 13.6% higher mIoU on BEV map segmentation, with 1.9x lower cost.
arXiv Detail & Related papers (2022-05-26T17:59:35Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.