BEVScope: Enhancing Self-Supervised Depth Estimation Leveraging
Bird's-Eye-View in Dynamic Scenarios
- URL: http://arxiv.org/abs/2306.11598v1
- Date: Tue, 20 Jun 2023 15:16:35 GMT
- Title: BEVScope: Enhancing Self-Supervised Depth Estimation Leveraging
Bird's-Eye-View in Dynamic Scenarios
- Authors: Yucheng Mao, Ruowen Zhao, Tianbao Zhang and Hang Zhao
- Abstract summary: Current self-supervised depth estimation methods grapple with several limitations.
We present BEVScope, an innovative approach to self-supervised depth estimation.
We propose an adaptive loss function, specifically designed to mitigate the complexities associated with moving objects.
- Score: 12.079195812249747
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Depth estimation is a cornerstone of perception in autonomous driving and
robotic systems. The considerable cost and relatively sparse data acquisition
of LiDAR systems have led to the exploration of cost-effective alternatives,
notably, self-supervised depth estimation. Nevertheless, current
self-supervised depth estimation methods grapple with several limitations: (1)
the failure to adequately leverage informative multi-camera views, and (2) the
limited capacity to handle dynamic objects effectively. To address these
challenges, we present BEVScope, an innovative approach to self-supervised
depth estimation that harnesses Bird's-Eye-View (BEV) features. Concurrently,
we propose an adaptive loss function, specifically designed to mitigate the
complexities associated with moving objects. Empirical evaluations conducted on
the nuScenes dataset validate our approach, demonstrating competitive
performance. Code will be released at https://github.com/myc634/BEVScope.
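The abstract does not spell out the loss, but most self-supervised depth pipelines train on a photometric reprojection error, and an adaptive variant can down-weight pixels that reprojection cannot explain, typically moving objects. The sketch below is a minimal PyTorch illustration of that idea; the SSIM/L1 mix, the `alpha = 0.85` weight, and the residual-based weighting in `adaptive_photometric_loss` are common-practice assumptions, not BEVScope's actual formulation.

```python
import torch
import torch.nn.functional as F

def photometric_error(pred, target, alpha=0.85):
    # SSIM + L1 photometric error, the usual self-supervised depth signal.
    # pred, target: (B, 3, H, W) images in [0, 1].
    l1 = (pred - target).abs().mean(1, keepdim=True)
    mu_p = F.avg_pool2d(pred, 3, 1, 1)
    mu_t = F.avg_pool2d(target, 3, 1, 1)
    sig_p = F.avg_pool2d(pred * pred, 3, 1, 1) - mu_p * mu_p
    sig_t = F.avg_pool2d(target * target, 3, 1, 1) - mu_t * mu_t
    sig_pt = F.avg_pool2d(pred * target, 3, 1, 1) - mu_p * mu_t
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_p * mu_t + c1) * (2 * sig_pt + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (sig_p + sig_t + c2))
    dssim = ((1 - ssim) / 2).clamp(0, 1).mean(1, keepdim=True)
    return alpha * dssim + (1 - alpha) * l1

def adaptive_photometric_loss(warped, target):
    # Hypothetical adaptive weighting: pixels with unusually large residuals
    # are treated as likely moving objects and down-weighted, so they do not
    # corrupt the static-scene reprojection signal. Illustrative only.
    err = photometric_error(warped, target)
    with torch.no_grad():
        weight = torch.exp(-err / (err.mean() + 1e-7))
    return (weight * err).mean()
```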
Related papers
- D$^3$epth: Self-Supervised Depth Estimation with Dynamic Mask in Dynamic Scenes [23.731667977542454]
D$^3$epth is a novel method for self-supervised depth estimation in dynamic scenes.
It tackles the challenge of dynamic objects from two key perspectives.
It consistently outperforms existing self-supervised monocular depth estimation baselines.
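The summary names a dynamic mask but not its construction. One simple heuristic, sketched below, masks out pixels whose reprojection error exceeds a per-image threshold, on the view that such pixels likely belong to moving objects. This is an assumed illustration, not D$^3$epth's actual mechanism.

```python
import torch

def dynamic_region_mask(reproj_err, scale=1.0):
    # reproj_err: (B, 1, H, W) per-pixel reprojection error.
    # Pixels whose error exceeds a per-image mean threshold are flagged as
    # likely dynamic and excluded (mask = 0); the rest are kept (mask = 1).
    thresh = scale * reproj_err.flatten(1).mean(1).view(-1, 1, 1, 1)
    return (reproj_err < thresh).float()
```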
arXiv Detail & Related papers (2024-11-07T16:07:00Z)
- A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts [6.260553883409459]
This paper introduces a novel dataset and evaluation methodology to quantify the impact of different camera positions and orientations on monocular depth estimation performance.
We propose a ground truth strategy based on homography estimation and object detection, eliminating the need for expensive lidar sensors.
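One plausible reading of a homography-based ground-truth strategy: calibrate a homography from image pixels to metric road-plane coordinates, then map each detected object's foot point through it to obtain distance. The calibration points and helper below are entirely hypothetical values for illustration.

```python
import cv2
import numpy as np

# Hypothetical calibration: four ground markers seen in the image and their
# known positions on the road plane in metres (all values assumed).
img_pts = np.float32([[420, 710], [860, 705], [615, 540], [700, 538]])
plane_pts = np.float32([[-2.0, 5.0], [2.0, 5.0], [-2.0, 20.0], [2.0, 20.0]])
H, _ = cv2.findHomography(img_pts, plane_pts)

def foot_point_distance(u, v):
    # Map the bottom-centre pixel of a detection box through the ground-plane
    # homography and read off the metric distance to the object.
    x, y, w = H @ np.array([u, v, 1.0])
    return float(np.hypot(x / w, y / w))
```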
arXiv Detail & Related papers (2024-09-26T13:57:05Z)
- OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
- Manydepth2: Motion-Aware Self-Supervised Multi-Frame Monocular Depth Estimation in Dynamic Scenes [45.092076587934464]
We present Manydepth2, which achieves precise depth estimation for both dynamic objects and static backgrounds.
To tackle the challenges posed by dynamic content, we incorporate optical flow and coarse monocular depth to create a pseudo-static reference frame.
This frame is then utilized to build a motion-aware cost volume in collaboration with the vanilla target frame.
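A minimal sketch of the pseudo-static idea: resampling the reference frame along the total optical flow aligns independently moving objects with the target frame, leaving only camera-motion disparity for the cost volume to explain. The `warp_by_flow` helper below is an illustration under that assumption, not Manydepth2's implementation.

```python
import torch
import torch.nn.functional as F

def warp_by_flow(ref, flow):
    # ref:  (B, 3, H, W) reference frame; flow: (B, 2, H, W) total optical
    # flow from the target frame to the reference frame, in pixels.
    # Resampling along the flow aligns moving objects with the target frame,
    # yielding a "pseudo-static" reference for the cost volume.
    _, _, h, w = ref.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=ref.device, dtype=ref.dtype),
        torch.arange(w, device=ref.device, dtype=ref.dtype),
        indexing="ij")
    coords = torch.stack((xs, ys), 0).unsqueeze(0) + flow  # (B, 2, H, W)
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0   # normalise to [-1, 1]
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)      # (B, H, W, 2) for grid_sample
    return F.grid_sample(ref, grid, align_corners=True)
```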
arXiv Detail & Related papers (2023-12-23T14:36:27Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose OccNeRF, a method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
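Aligning a finite occupancy grid with an unbounded perceptive range is commonly done with a coordinate contraction; the mip-NeRF-360-style function below is one such parameterization, given as an assumption rather than OccNeRF's exact scheme.

```python
import torch

def contract(x, inner=1.0):
    # Map unbounded 3D points into a bounded domain so an occupancy grid of
    # finite resolution can cover the cameras' unbounded perceptive range.
    # Points within radius `inner` are kept; points outside are squashed
    # toward a sphere of radius 2 (mip-NeRF 360 style). Illustrative only.
    n = x.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    return torch.where(n <= inner, x, (2.0 - inner / n) * (x / n))
```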
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [93.71280187657831]
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- Long Range Object-Level Monocular Depth Estimation for UAVs [0.0]
We propose several novel extensions to state-of-the-art methods for monocular object detection from images at long range.
Firstly, we propose Sigmoid and ReLU-like encodings when modeling depth estimation as a regression task.
Secondly, we frame the depth estimation as a classification problem and introduce a Soft-Argmax function in the calculation of the training loss.
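The classification framing is concrete enough to sketch: predict logits over discrete depth bins and take the softmax expectation (a Soft-Argmax) as the depth value, which stays differentiable for the training loss. The linear 1-200 m bin layout below is an assumed configuration, not necessarily the paper's.

```python
import torch

def soft_argmax_depth(logits, d_min=1.0, d_max=200.0):
    # logits: (..., K) scores over K depth bins. The softmax expectation
    # yields a differentiable, sub-bin-accurate depth estimate.
    k = logits.shape[-1]
    bins = torch.linspace(d_min, d_max, k, device=logits.device)
    probs = torch.softmax(logits, dim=-1)
    return (probs * bins).sum(-1)
```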
arXiv Detail & Related papers (2023-02-17T15:26:04Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption to train networks; however, this assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly dynamic scenes.
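A plausible way to use such a prior, sketched below: align the scale-ambiguous single-image depth to the network's prediction and penalize disagreement inside dynamic regions, where multi-view consistency fails. The median scaling and mask-restricted loss are assumptions, not SC-DepthV3's exact loss.

```python
import torch

def prior_consistency_loss(pred, prior, dyn_mask):
    # pred, prior: (B, 1, H, W) depths; dyn_mask: boolean mask of dynamic
    # regions. Median-align the scale-ambiguous prior to the prediction,
    # then penalise disagreement only where the photometric signal fails.
    m = dyn_mask.float()
    with torch.no_grad():
        scale = pred[dyn_mask].median() / prior[dyn_mask].median().clamp(min=1e-6)
    return ((pred - scale * prior).abs() * m).sum() / m.sum().clamp(min=1.0)
```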
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Occlusion-Aware Self-Supervised Monocular 6D Object Pose Estimation [88.8963330073454]
We propose a novel monocular 6D pose estimation approach by means of self-supervised learning.
We leverage current trends in noisy student training and differentiable rendering to further self-supervise the model.
Our proposed self-supervision outperforms all other methods relying on synthetic data.
arXiv Detail & Related papers (2022-03-19T15:12:06Z)
- R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of Dynamic Scenes [69.6715406227469]
Self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches.
We present R4Dyn, a novel set of techniques to use cost-efficient radar data on top of a self-supervised depth estimation framework.
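Radar returns projected into the image give sparse but metric depth, which suggests a simple weak-supervision term: penalize the prediction only at pixels with a valid return. The sketch below illustrates that idea under this assumption; R4Dyn's actual techniques are more involved.

```python
import torch

def radar_supervision_loss(pred_depth, radar_depth):
    # radar_depth: per-pixel depth from radar returns projected into the
    # image, 0 where no return exists. Supervise only at valid pixels.
    valid = radar_depth > 0
    if valid.sum() == 0:
        return pred_depth.new_zeros(())
    return (pred_depth[valid] - radar_depth[valid]).abs().mean()
```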
arXiv Detail & Related papers (2021-08-10T17:57:03Z)