The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
- URL: http://arxiv.org/abs/2104.14540v1
- Date: Thu, 29 Apr 2021 17:53:42 GMT
- Title: The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
- Authors: Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel Brostow,
Michael Firman
- Abstract summary: ManyDepth is an adaptive approach to dense depth estimation.
We present a novel consistency loss that encourages the network to ignore the cost volume when it is deemed unreliable.
- Score: 28.06671063873351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised monocular depth estimation networks are trained to predict
scene depth using nearby frames as a supervision signal during training.
However, for many applications, sequence information in the form of video
frames is also available at test time. The vast majority of monocular networks
do not make use of this extra signal, thus ignoring valuable information that
could be used to improve the predicted depth. Those that do either use
computationally expensive test-time refinement techniques or off-the-shelf
recurrent networks, which only indirectly make use of the geometric information
that is inherently available.
We propose ManyDepth, an adaptive approach to dense depth estimation that can
make use of sequence information at test time, when it is available. Taking
inspiration from multi-view stereo, we propose a deep end-to-end cost volume
based approach that is trained using self-supervision only. We present a novel
consistency loss that encourages the network to ignore the cost volume when it
is deemed unreliable, e.g. in the case of moving objects, and an augmentation
scheme to cope with static cameras. Our detailed experiments on both KITTI and
Cityscapes show that we outperform all published self-supervised baselines,
including those that use single or multiple frames at test time.
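The abstract describes the two key ingredients at a high level: a cost volume built from a neighbouring frame, in the spirit of multi-view stereo, and a consistency signal used to decide when that cost volume should be ignored. The sketch below illustrates these ideas in PyTorch. It is a minimal illustration only: the fixed list of depth bins, the L1 matching cost, the relative-error threshold, and all function names are assumptions made for clarity, not the authors' released implementation.

```python
# Minimal sketch of a plane-sweep cost volume and a consistency mask,
# in the spirit of the abstract above. Shapes, names, and thresholds are
# illustrative assumptions, not the authors' code.
import torch
import torch.nn.functional as F


def build_cost_volume(feat_t, feat_prev, K, K_inv, T_t_to_prev, depth_bins):
    """Score each candidate depth by warping previous-frame features onto
    the current frame and measuring how well they match.

    feat_t, feat_prev : [B, C, H, W] feature maps for frames t and t-1
    K, K_inv          : [B, 3, 3] camera intrinsics and their inverse
    T_t_to_prev       : [B, 4, 4] relative pose taking frame-t points to frame t-1
    depth_bins        : [D] tensor of candidate depths
    returns           : [B, D, H, W] matching cost (lower = better match)
    """
    B, C, H, W = feat_t.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, device=feat_t.device, dtype=feat_t.dtype),
        torch.arange(W, device=feat_t.device, dtype=feat_t.dtype),
        indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0)           # [3, H, W]
    pix = pix.view(1, 3, -1).expand(B, -1, -1)                        # [B, 3, H*W]
    cam_rays = K_inv @ pix                                            # rays at unit depth

    costs = []
    for d in depth_bins:
        # Back-project to 3D at depth d, move to the previous camera, re-project.
        pts = cam_rays * d
        pts_h = torch.cat([pts, torch.ones_like(pts[:, :1])], dim=1)  # homogeneous
        pts_prev = (T_t_to_prev @ pts_h)[:, :3]
        proj = K @ pts_prev
        uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)

        # Normalise pixel coordinates to [-1, 1] and warp the previous features.
        u = 2.0 * uv[:, 0] / (W - 1) - 1.0
        v = 2.0 * uv[:, 1] / (H - 1) - 1.0
        grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
        warped = F.grid_sample(feat_prev, grid, padding_mode="zeros",
                               align_corners=True)

        # Per-pixel L1 matching cost for this depth hypothesis.
        costs.append((feat_t - warped).abs().mean(1))                 # [B, H, W]
    return torch.stack(costs, dim=1)                                  # [B, D, H, W]


def unreliable_mask(cost_volume, depth_bins, single_frame_depth, rel_tol=0.25):
    """Flag pixels where the cost-volume argmin depth disagrees strongly with a
    single-frame prediction, e.g. on moving objects; the consistency loss in the
    abstract encourages the network to ignore the cost volume at such pixels."""
    argmin_depth = depth_bins[cost_volume.argmin(dim=1)]              # [B, H, W]
    rel_err = (argmin_depth - single_frame_depth).abs() / single_frame_depth.clamp(min=1e-6)
    return rel_err > rel_tol                                          # True = unreliable
```

The paper's full method also handles static cameras with an augmentation scheme and combines this masking with the usual photometric self-supervision; the sketch only shows where such a mask could come from.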
Related papers
- FusionDepth: Complement Self-Supervised Monocular Depth Estimation with Cost Volume [9.912304015239313]
We propose a multi-frame depth estimation framework in which monocular depth can be refined continuously using multi-frame sequential constraints.
Our method also enhances the interpretability when combining monocular estimation with multi-view cost volume.
arXiv Detail & Related papers (2023-05-10T10:38:38Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption for training networks; however, this assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model to generate a single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, while still using the self-supervised formulation and not relying on any additional sensors (one common recipe for this kind of scale recovery is sketched after this list).
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation [51.714092199995044]
In many fields, self-supervised learning solutions are rapidly evolving and closing the gap with supervised approaches.
We propose a novel self-supervised paradigm reversing the link between the two.
In order to train deep stereo networks, we distill knowledge through a monocular completion network.
arXiv Detail & Related papers (2020-08-17T07:40:22Z)
- MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation [22.495019810166397]
We propose a powerful new network with a recurrent module that achieves the capability of a deep network, while maintaining an extremely lightweight size for real-time, high-performance unsupervised monocular depth prediction from video sequences.
Our new model can run at a speed of about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3.
arXiv Detail & Related papers (2020-06-27T12:13:22Z)
- Self-Supervised Joint Learning Framework of Depth Estimation via Implicit Cues [24.743099160992937]
We propose a novel self-supervised joint learning framework for depth estimation.
The proposed framework outperforms the state-of-the-art (SOTA) on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2020-06-17T13:56:59Z)
- Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [92.84498980104424]
We put three different types of depth estimation into a common framework.
Our method produces a time series of depth maps.
It can be applied to monocular videos only or be combined with different types of sparse depth patterns.
arXiv Detail & Related papers (2020-01-08T16:50:51Z)
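For the "Calibrating Self-supervised Monocular Depth Estimation" entry above, one common recipe for turning prior information about the camera configuration into metric scale is to fit a ground plane to the predicted depth and rescale so that the camera's height above that plane matches its known mounting height. The sketch below illustrates that recipe; the ground-plane fitting, the road mask, and the function name are hypothetical choices for illustration, not necessarily the mechanism used in that paper.

```python
# Illustrative sketch: recovering metric scale from a known camera mounting
# height. One common recipe, not necessarily the mechanism of the paper above.
import torch


def scale_from_camera_height(depth, K_inv, road_mask, true_cam_height_m):
    """Estimate a global scale factor for an up-to-scale depth map.

    depth             : [H, W] predicted (up-to-scale) depth
    K_inv             : [3, 3] inverse camera intrinsics
    road_mask         : [H, W] bool mask of ground-plane pixels (hypothetical,
                        e.g. from a road segmentation)
    true_cam_height_m : known camera mounting height in metres
    """
    H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=depth.dtype),
                            torch.arange(W, dtype=depth.dtype), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(3, -1)   # [3, H*W]

    # Back-project road pixels into the camera frame (still up-to-scale).
    pts = (K_inv @ pix) * depth.view(1, -1)                               # [3, H*W]
    road_pts = pts[:, road_mask.view(-1)].T                               # [N, 3]

    # Fit a plane to the road points; the last right-singular vector is its normal.
    centroid = road_pts.mean(dim=0)
    _, _, Vh = torch.linalg.svd(road_pts - centroid, full_matrices=False)
    normal = Vh[-1]

    # Distance from the camera (the origin) to the fitted plane.
    pred_cam_height = (centroid @ normal).abs()

    # Rescale so the predicted height matches the known mounting height.
    return true_cam_height_m / pred_cam_height.clamp(min=1e-6)
```

An up-to-scale depth map would then be multiplied by this factor, e.g. depth * scale_from_camera_height(depth, K_inv, road_mask, 1.65) for a camera mounted roughly 1.65 m above the road, approximately the KITTI mounting height.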
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.