Masked GANs for Unsupervised Depth and Pose Prediction with Scale Consistency
- URL: http://arxiv.org/abs/2004.04345v3
- Date: Tue, 13 Apr 2021 14:05:30 GMT
- Title: Masked GANs for Unsupervised Depth and Pose Prediction with Scale Consistency
- Authors: Chaoqiang Zhao, Gary G. Yen, Qiyu Sun, Chongzhen Zhang and Yang Tang
- Abstract summary: This paper proposes a masked generative adversarial network (GAN) for unsupervised monocular depth and ego-motion estimation.
The MaskNet and Boolean mask scheme are designed in this framework to eliminate the effects of occlusions on the reconstruction loss and of visual-field changes on the adversarial loss, respectively.
- Score: 18.10657948047875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has shown that adversarial learning can be used for
unsupervised monocular depth and visual odometry (VO) estimation, in which the
adversarial loss and the geometric image reconstruction loss are utilized as
the main supervisory signals to train the whole unsupervised framework.
However, the performance of the adversarial framework and image reconstruction
is usually limited by occlusions and the visual field changes between frames.
This paper proposes a masked generative adversarial network (GAN) for
unsupervised monocular depth and ego-motion estimation. The MaskNet and Boolean
mask scheme are designed in this framework to eliminate the effects of
occlusions on the reconstruction loss and the impact of visual-field changes on
the adversarial loss, respectively. Furthermore, we also consider the scale
consistency of our pose network by utilizing a new scale-consistency loss, and
therefore, our pose network is capable of providing the full camera trajectory
over a long monocular sequence. Extensive experiments on the KITTI dataset show
that each component proposed in this paper contributes to the performance, and
both our depth and trajectory predictions achieve competitive performance on
the KITTI and Make3D datasets.
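To make the masking idea concrete, here is a minimal PyTorch-style sketch of how a MaskNet-like soft mask could gate the reconstruction loss while a Boolean validity mask gates the adversarial loss. All names (`masked_losses`, `soft_mask`, `valid_mask`, `discriminator`) are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn.functional as F

def masked_losses(warped, target, soft_mask, valid_mask, discriminator):
    """Sketch: mask-gated reconstruction and adversarial losses.

    warped:     source frame warped into the target view, (B, 3, H, W)
    target:     target frame, (B, 3, H, W)
    soft_mask:  MaskNet-style occlusion weights in [0, 1], (B, 1, H, W)
    valid_mask: Boolean mask of pixels that remain inside the image
                after warping, (B, 1, H, W)
    """
    # Reconstruction: down-weight occluded pixels with the soft mask and
    # drop out-of-view pixels entirely with the Boolean mask.
    photometric = (warped - target).abs().mean(dim=1, keepdim=True)
    recon_loss = (soft_mask * photometric)[valid_mask].mean()

    # Adversarial: only in-view content of the synthesized image is shown
    # to the discriminator, so field-of-view changes are not penalized as
    # "fake" evidence (non-saturating GAN loss used as an example).
    fake_score = discriminator(warped * valid_mask.float())
    adv_loss = F.binary_cross_entropy_with_logits(
        fake_score, torch.ones_like(fake_score))

    return recon_loss, adv_loss
```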
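The scale-consistency idea can likewise be sketched as a normalized depth-agreement term in the style of SC-Depth (cited in the related papers below): depths predicted for adjacent frames are compared after being brought into a common view, so the pose network cannot drift in scale over a long sequence. The function below and its inputs are assumptions for illustration only.

```python
import torch

def scale_consistency_loss(depth_a_in_b, depth_b_sampled, eps=1e-7):
    """Sketch: normalized depth difference between adjacent frames.

    depth_a_in_b:    depth of frame A transformed into frame B's view
                     using the predicted relative pose, (B, 1, H, W)
    depth_b_sampled: frame B's predicted depth sampled at the projected
                     pixel locations, (B, 1, H, W)

    The normalized difference lies in [0, 1) and pushes the two depth
    maps, and therefore the pose scale, to agree across frames.
    """
    diff = (depth_a_in_b - depth_b_sampled).abs()
    return (diff / (depth_a_in_b + depth_b_sampled + eps)).mean()
```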
Related papers
- Towards Evaluating the Robustness of Visual State Space Models [63.14954591606638]
Vision State Space Models (VSSMs) have demonstrated remarkable performance in visual perception tasks.
However, their robustness under natural and adversarial perturbations remains a critical concern.
We present a comprehensive evaluation of VSSMs' robustness under various perturbation scenarios.
arXiv Detail & Related papers (2024-06-13T17:59:44Z)
- W-Net: A Facial Feature-Guided Face Super-Resolution Network [8.037821981254389]
Face Super-Resolution aims to recover high-resolution (HR) face images from low-resolution (LR) ones.
Existing approaches are not ideal due to their low reconstruction efficiency and insufficient utilization of prior information.
This paper proposes a novel network architecture called W-Net to address this challenge.
arXiv Detail & Related papers (2024-06-02T09:05:40Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset while requiring the lowest image resolution and the lightest-weight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation [17.572459787107427]
We propose a flow distillation loss to replace the typical photometric loss and a prior flow based mask to remove invalid pixels.
Our approach achieves state-of-the-art results on both KITTI and NYU-Depth-v2 datasets.
arXiv Detail & Related papers (2023-01-20T04:02:13Z)
- Adversarial Attacks on Monocular Pose Estimation [13.7258515433446]
We study the relation between adversarial perturbations targeting monocular depth and pose estimation networks.
Our experiments show how the generated perturbations lead to notable errors in relative rotation and translation predictions.
arXiv Detail & Related papers (2022-07-14T16:12:31Z)
- Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training.
Thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
arXiv Detail & Related papers (2021-05-25T02:17:56Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z)
- FIS-Nets: Full-image Supervised Networks for Monocular Depth Estimation [14.454378082294852]
We propose a semi-supervised architecture that combines an unsupervised framework based on image consistency with a supervised framework of dense depth completion.
In the evaluation, we show that our proposed model outperforms other approaches on depth estimation.
arXiv Detail & Related papers (2020-01-19T06:04:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.