Self-Supervised Monocular Depth Estimation: Solving the Edge-Fattening
  Problem
- URL: http://arxiv.org/abs/2210.00411v2
- Date: Tue, 4 Oct 2022 03:44:38 GMT
- Title: Self-Supervised Monocular Depth Estimation: Solving the Edge-Fattening
  Problem
- Authors: Xingyu Chen, Ruonan Zhang, Ji Jiang, Yan Wang, Ge Li, Thomas H. Li
- Abstract summary: Triplet loss, popular in metric learning, has achieved great success in many computer vision tasks.
We show two drawbacks of the raw triplet loss in monocular depth estimation (MDE) and demonstrate our problem-driven redesigns.
- Score: 39.82550656611876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Self-supervised monocular depth estimation (MDE) models universally suffer
from the notorious edge-fattening issue. Triplet loss, popular in metric
learning, has achieved great success in many computer vision tasks. In this
paper, we redesign the patch-based triplet loss in MDE to alleviate the
ubiquitous edge-fattening issue. We show two drawbacks of the raw triplet loss
in MDE and demonstrate our problem-driven redesigns. First, we present a
min-operator-based strategy applied to all negative samples, to prevent
well-performing negatives from masking the error of edge-fattening negatives.
Second, we split the anchor-positive distance and anchor-negative distance from
within the original triplet, which directly optimizes the positives without any
mutual effect with the negatives. Extensive experiments show the combination of
these two small redesigns can achieve unprecedented results: Our powerful and
versatile triplet loss not only makes our model outperform all previous SoTA by
a large margin, but also provides substantial performance boosts to a large
number of existing models, while introducing no extra inference computation at
all.
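
The two redesigns described in the abstract can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation: the function names, the use of plain L2 distance between feature vectors, the averaging over positives, and the margin value are all assumptions made for illustration, and the exact formulation in the paper may differ.

```python
import math


def l2(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def raw_triplet_loss(anchor, positives, negatives, margin=0.3):
    """Baseline: average both distances, then apply a single hinge."""
    d_pos = sum(l2(anchor, p) for p in positives) / len(positives)
    d_neg = sum(l2(anchor, n) for n in negatives) / len(negatives)
    return max(0.0, d_pos - d_neg + margin)


def redesigned_triplet_loss(anchor, positives, negatives, margin=0.3):
    """Hypothetical sketch of the two redesigns named in the abstract."""
    d_pos = sum(l2(anchor, p) for p in positives) / len(positives)
    # Redesign 1: take the minimum over negatives, so one distant
    # (well-performing) negative cannot hide a negative that sits too close.
    d_neg = min(l2(anchor, n) for n in negatives)
    # Redesign 2: split the triplet, so the positive term is always
    # optimized and only the negative term is hinged against the margin.
    return d_pos + max(0.0, margin - d_neg)


# Toy 1-D features: one negative is far from the anchor, one is very close.
anchor = [0.0]
positives = [[0.1]]
negatives = [[0.05], [2.0]]

print("raw:", raw_triplet_loss(anchor, positives, negatives))
print("redesigned:", redesigned_triplet_loss(anchor, positives, negatives))
```

In this toy case the raw loss is zero because the distant negative dominates the average, while the redesigned loss stays positive and still penalizes the negative that lies close to the anchor.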
 
      
Related papers
- UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler [62.06785782635153]
 We propose a new model, UniDepthV2, capable of reconstructing metric 3D scenes from solely single images across domains.
UniDepthV2 directly predicts metric 3D points from the input image at inference time without any additional information.
Our model exploits a pseudo-spherical output representation, which disentangles the camera and depth representations.
 arXiv  Detail & Related papers  (2025-02-27T14:03:15Z)
- PHUDGE: Phi-3 as Scalable Judge [1.7495213911983414]
 We present a Phi3 model that achieves SOTA results on four tasks: Feedback Test, Feedback OOD, MT Human, and Preference Test.
It correlates strongly not only with GPT4 but also with human annotators on unseen data, in both absolute and relative grading tasks.
We show that by following systematic ML experimentation, thoughtful data augmentation and reposing the problem itself, we can beat models 10x larger, even with less training data.
 arXiv  Detail & Related papers  (2024-05-12T18:22:16Z)
- Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval [12.007316506425079]
 Adversarial training has achieved substantial performance in defending image retrieval against adversarial examples.
Existing studies in deep metric learning (DML) still suffer from two major limitations: weak adversary and model collapse.
We propose Collapse-Aware TRIplet DEcoupling (CA-TRIDE) to yield a stronger adversary.
 arXiv  Detail & Related papers  (2023-12-12T15:33:08Z)
- When hard negative sampling meets supervised contrastive learning [17.173114048398947]
 We introduce a new supervised contrastive learning objective, SCHaNe, which incorporates hard negative sampling during the fine-tuning phase.
SCHaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks.
Our proposed objective sets a new state-of-the-art for base models on ImageNet-1k, achieving an 86.14% accuracy.
 arXiv  Detail & Related papers  (2023-08-28T20:30:10Z)
- Towards Regression-Free Neural Networks for Diverse Compute Platforms [50.64489250972764]
 We introduce REGression constrained Neural Architecture Search (REG-NAS) to design a family of highly accurate models that engender fewer negative flips.
REG-NAS consists of two components: (1) A novel architecture constraint that enables a larger model to contain all the weights of the smaller one thus maximizing weight sharing.
We demonstrate that REG-NAS can successfully find desirable architectures with few negative flips in three popular architecture search spaces.
 arXiv  Detail & Related papers  (2022-09-27T23:19:16Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose
  Estimation [70.32536356351706]
 We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
 arXiv  Detail & Related papers  (2022-03-29T07:14:58Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
 This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
 arXiv  Detail & Related papers  (2021-08-26T17:55:11Z)
- Are Negative Samples Necessary in Entity Alignment? An Approach with
  High Performance, Scalability and Robustness [26.04006507181558]
 We propose a novel EA method with three new components to enable high Performance, high Scalability, and high Robustness.
We conduct detailed experiments on several public datasets to examine the effectiveness and efficiency of our proposed method.
 arXiv  Detail & Related papers  (2021-08-11T15:20:41Z)
- Trip-ROMA: Self-Supervised Learning with Triplets and Random Mappings [59.32440962369532]
 We show that a simple Triplet-based loss can achieve surprisingly good performance without requiring large batches or asymmetry designs.
To alleviate the over-fitting problem in small data regimes, we propose a simple plug-and-play RandOm MApping (ROMA) strategy.
 arXiv  Detail & Related papers  (2021-07-22T02:06:38Z)
- Deep Ranking with Adaptive Margin Triplet Loss [5.220120772989114]
 We propose a simple modification from a fixed margin triplet loss to an adaptive margin triplet loss.
Our proposed loss is well suited for rating datasets in which the ratings are continuous values.
 arXiv  Detail & Related papers  (2021-07-13T15:37:20Z)
- Weakly Supervised Generative Network for Multiple 3D Human Pose
  Hypotheses [74.48263583706712]
 3D human pose estimation from a single image is an inverse problem due to the inherent ambiguity of the missing depth.
We propose a weakly supervised deep generative network to address the inverse problem.
 arXiv  Detail & Related papers  (2020-08-13T09:26:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     