DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised
Representation Learning
- URL: http://arxiv.org/abs/2003.13446v1
- Date: Mon, 30 Mar 2020 13:10:32 GMT
- Title: DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised
Representation Learning
- Authors: Jaime Spencer, Richard Bowden, Simon Hadfield
- Abstract summary: DeFeat-Net is an approach to simultaneously learn a cross-domain dense feature representation.
Our technique is able to outperform the current state-of-the-art with around 10% reduction in all error measures.
- Score: 65.94499390875046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In current monocular depth research, the dominant approach is to employ
unsupervised training on large datasets, driven by warped photometric
consistency. Such approaches lack robustness and are unable to generalize to
challenging domains such as nighttime scenes or adverse weather conditions
where assumptions about photometric consistency break down.
We propose DeFeat-Net (Depth & Feature network), an approach to
simultaneously learn a cross-domain dense feature representation, alongside a
robust depth-estimation framework based on warped feature consistency. The
resulting feature representation is learned in an unsupervised manner with no
explicit ground-truth correspondences required.
We show that within a single domain, our technique is comparable to both the
current state of the art in monocular depth estimation and supervised feature
representation learning. However, by simultaneously learning features, depth
and motion, our technique is able to generalize to challenging domains,
allowing DeFeat-Net to outperform the current state-of-the-art with around 10%
reduction in all error measures on more challenging sequences such as nighttime
driving.
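
To make the contrast between warped photometric consistency and warped feature consistency concrete, here is a minimal PyTorch-style sketch. It is an illustration under stated assumptions, not the authors' implementation: the sampling grid (which in practice comes from predicted depth, relative pose and camera intrinsics), the function names (warp, photometric_loss, feature_consistency_loss) and the tensor shapes are all hypothetical, and the joint learning of features, depth and motion is omitted.

# Minimal sketch (not the DeFeat-Net code): pixel-level photometric loss
# vs. a feature-level consistency loss on warped dense feature maps.
import torch
import torch.nn.functional as F

def warp(source, grid):
    # Resample `source` (B, C, H, W) at the normalized sampling grid (B, H, W, 2).
    return F.grid_sample(source, grid, mode="bilinear",
                         padding_mode="border", align_corners=False)

def photometric_loss(target_img, source_img, grid):
    # Standard unsupervised-depth objective: the warped source image should
    # match the target image pixel by pixel. This assumption breaks down under
    # brightness changes (night, rain, auto-exposure).
    return (warp(source_img, grid) - target_img).abs().mean()

def feature_consistency_loss(target_feat, source_feat, grid):
    # Feature-consistency idea: compare learned dense features instead of raw
    # pixels, so the supervision signal is more robust to photometric changes.
    return (warp(source_feat, grid) - target_feat).abs().mean()

if __name__ == "__main__":
    B, H, W = 2, 64, 96
    # Identity sampling grid, purely for illustration.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).expand(B, H, W, 2)

    img_t, img_s = torch.rand(B, 3, H, W), torch.rand(B, 3, H, W)
    feat_t, feat_s = torch.rand(B, 32, H, W), torch.rand(B, 32, H, W)

    print(photometric_loss(img_t, img_s, grid).item())
    print(feature_consistency_loss(feat_t, feat_s, grid).item())

The design choice the sketch highlights is simply where the consistency is measured: on raw pixel intensities, or on a learned feature space that can remain stable across lighting and weather changes.
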
Related papers
- Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging
Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss that steers a confidence predictor network to yield a confidence map specifying latent potential depth areas.
With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner.
arXiv Detail & Related papers (2024-02-19T04:39:16Z)
- Unearthing Common Inconsistency for Generalisable Deepfake Detection [8.327980745153216]
Video-level detection shows potential for both generalization across multiple domains and robustness to compression.
We propose a detection approach by capturing frame inconsistency that broadly exists in different forgery techniques.
We introduce a temporally-preserved module to apply spatial noise perturbations, directing the model's attention towards temporal information.
arXiv Detail & Related papers (2023-11-20T06:04:09Z)
- Fine-grained Semantics-aware Representation Enhancement for Self-supervised Monocular Depth Estimation [16.092527463250708]
We propose novel ideas to improve self-supervised monocular depth estimation.
We focus on incorporating implicit semantic knowledge into geometric representation enhancement.
We evaluate our methods on the KITTI dataset and demonstrate that our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-19T17:50:51Z)
- Causal Navigation by Continuous-time Neural Networks [108.84958284162857]
We propose a theoretical and experimental framework for learning causal representations using continuous-time neural networks.
We evaluate our method in the context of visual-control learning of drones over a series of complex tasks.
arXiv Detail & Related papers (2021-06-15T17:45:32Z)
- Learning a Domain-Agnostic Visual Representation for Autonomous Driving via Contrastive Loss [25.798361683744684]
Domain-Agnostic Contrastive Learning (DACL) is a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss.
Our proposed approach achieves better performance in the monocular depth estimation task compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-10T07:06:03Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction [27.750031877854717]
We propose SAFENet, which is designed to leverage semantic information to overcome the limitations of the photometric loss.
Our key idea is to exploit semantic-aware depth features that integrate the semantic and geometric knowledge.
Experiments on the KITTI dataset demonstrate that our methods compete with or even outperform state-of-the-art methods.
arXiv Detail & Related papers (2020-10-06T17:22:25Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, while still using the self-supervised formulation and not relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- DeepCap: Monocular Human Performance Capture Using Weak Supervision [106.50649929342576]
We propose a novel deep learning approach for monocular dense human performance capture.
Our method is trained in a weakly supervised manner based on multi-view supervision.
Our approach outperforms the state of the art in terms of quality and robustness.
arXiv Detail & Related papers (2020-03-18T16:39:56Z)