Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks
- URL: http://arxiv.org/abs/2104.00431v1
- Date: Thu, 1 Apr 2021 12:29:23 GMT
- Title: Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks
- Authors: Guangming Wang, Hesheng Wang, Yiling Liu and Weidong Chen
- Abstract summary: A new unsupervised learning method for depth and ego-motion estimation from monocular video using multiple masks is proposed in this paper.
The depth estimation network and the ego-motion estimation network are trained under depth and ego-motion constraints without ground-truth labels.
Experiments on the KITTI dataset show that our method achieves good performance in depth and ego-motion estimation.
- Score: 14.82498499423046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new unsupervised learning method for depth and ego-motion
estimation using multiple masks from monocular video is proposed in this paper.
The depth estimation network and the ego-motion estimation network are trained
under depth and ego-motion constraints without ground-truth labels. The main
contribution of our method is to carefully account for the occlusion of pixels
that arises when adjacent frames are projected onto each other, and for the
blank regions that appear in the target imaging plane after projection. Two
fine masks are designed to resolve most of the pixel mismatches caused by
camera movement. In addition, some relatively rare circumstances are
considered, and repeated masking is proposed. In essence, the method uses
geometric relationships to filter out mismatched pixels during training, making
unsupervised learning more efficient and accurate. Experiments on the KITTI
dataset show that our method achieves good performance in depth and ego-motion
estimation. The generalization capability of our method is demonstrated by
training on a low-quality, uncalibrated bike video dataset and evaluating on
the KITTI dataset, where the results remain good.
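The blank-region masking described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: it projects every pixel of a source frame into a target frame using a known intrinsic matrix `K` and a relative pose `T` (both assumed), marks pixels that leave the target image plane as invalid, and averages the photometric error over valid pixels only. The function names and the L1 error are hypothetical choices for illustration.

```python
import numpy as np

def project_and_mask(depth, K, T, h, w):
    """Project every source-frame pixel into the target frame.

    Returns projected pixel coordinates and a validity mask that is
    False where a pixel lands outside the target image bounds (the
    'blank' regions mentioned in the abstract).
    """
    # Pixel grid in homogeneous coordinates, shape (3, h*w).
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1).astype(np.float64)
    # Back-project to 3D camera coordinates and apply the 4x4 relative pose T.
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    cam2 = (T @ cam_h)[:3]
    # Re-project into the target camera and normalize by depth.
    proj = K @ cam2
    uv = proj[:2] / np.clip(proj[2:], 1e-6, None)
    # Blank mask: pixels projected outside the target image are invalid.
    valid = (uv[0] >= 0) & (uv[0] <= w - 1) & (uv[1] >= 0) & (uv[1] <= h - 1)
    return uv.reshape(2, h, w), valid.reshape(h, w)

def masked_photometric_loss(target, warped, valid):
    """Average L1 photometric error over valid (non-blank) pixels only."""
    err = np.abs(target - warped)
    return err[valid].mean() if valid.any() else 0.0
```

With an identity pose each pixel projects onto itself, so every pixel is valid and two identical frames give zero loss; a large translation pushes all pixels out of bounds, and the mask excludes them from training.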
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z) - Learning depth from monocular video sequences [0.0]
We propose a novel training loss which enables us to include more images for supervision during the training process.
We also design a novel network architecture for single image estimation.
arXiv Detail & Related papers (2023-10-26T05:00:41Z) - Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z) - Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z) - Unsupervised Monocular Depth Perception: Focusing on Moving Objects [5.489557739480878]
In this paper, we show that deliberately manipulating photometric errors can better handle these difficulties.
We first propose an outlier masking technique that considers the occluded or dynamic pixels as statistical outliers in the photometric error map.
With the outlier masking, the network learns the depth of objects that move in the opposite direction to the camera more accurately.
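The outlier-masking idea above can be sketched as treating the largest photometric errors as statistical outliers. The following minimal NumPy sketch masks out pixels whose error exceeds a percentile threshold; the 95th percentile is an illustrative assumption, not the exact statistic used in the paper.

```python
import numpy as np

def outlier_mask(photometric_error, q=95):
    """Keep pixels whose photometric error is below the q-th percentile.

    Pixels with the largest errors are assumed to be occluded or to
    belong to dynamic objects and are excluded from the training loss.
    """
    thresh = np.percentile(photometric_error, q)
    return photometric_error <= thresh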
arXiv Detail & Related papers (2021-08-30T08:45:02Z) - Stereo Matching by Self-supervision of Multiscopic Vision [65.38359887232025]
We propose a new self-supervised framework for stereo matching utilizing multiple images captured at aligned camera positions.
A cross photometric loss, an uncertainty-aware mutual-supervision loss, and a new smoothness loss are introduced to optimize the network.
Our model obtains better disparity maps than previous unsupervised methods on the KITTI dataset.
arXiv Detail & Related papers (2021-04-09T02:58:59Z) - DiPE: Deeper into Photometric Errors for Unsupervised Learning of Depth
and Ego-motion from Monocular Videos [9.255509741319583]
This paper shows that carefully manipulating photometric errors can tackle these difficulties better.
The primary improvement is achieved by a statistical technique that can mask out the invisible or nonstationary pixels in the photometric error map.
We also propose an efficient weighted multi-scale scheme to reduce the artifacts in the predicted depth maps.
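A weighted multi-scale scheme of the kind mentioned above can be sketched as combining per-scale photometric errors with weights that favor finer scales. This halving weight schedule is an illustrative assumption, not DiPE's exact scheme.

```python
import numpy as np

def weighted_multiscale_loss(errors):
    """Combine per-scale error maps into one scalar loss.

    `errors` is a list of error maps from fine to coarse; each coarser
    scale receives half the weight of the previous one, and the weights
    are normalized to sum to one.
    """
    weights = np.array([1.0 / (2 ** i) for i in range(len(errors))])
    weights /= weights.sum()
    return float(sum(w * e.mean() for w, e in zip(weights, errors)))
```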
arXiv Detail & Related papers (2020-03-03T07:05:15Z) - Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single-image deblurring is feasible in practice.
arXiv Detail & Related papers (2020-02-10T20:15:21Z) - Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely, instead of different views, on depth from focus cues.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.