Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation
- URL: http://arxiv.org/abs/2506.04758v1
- Date: Thu, 05 Jun 2025 08:43:24 GMT
- Title: Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation
- Authors: Yijun Cao, Fuya Luo, Yongjie Li
- Abstract summary: This work proposes a new form of structure similarity index measure (SSIM). Compared with the original SSIM function, the proposed form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity components of SSIM. The loss function constructed with this scheme yields smoother gradients and achieves higher performance on unsupervised depth estimation.
- Score: 14.89929051723735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised monocular depth learning generally relies on the photometric relation among temporally adjacent images. Most previous works use both mean absolute error (MAE) and the structure similarity index measure (SSIM) in its conventional form as the training loss. However, they ignore the effect that the different components of the SSIM function, and the corresponding hyperparameters, have on training. To address these issues, this work proposes a new form of SSIM. Compared with the original SSIM function, the proposed form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity components of SSIM. The loss function constructed with this scheme yields smoother gradients and achieves higher performance on unsupervised depth estimation. We conduct extensive experiments to determine a relatively optimal combination of parameters for our new SSIM. Based on the popular MonoDepth approach, the optimized SSIM loss function can remarkably outperform the baseline on the KITTI-2015 outdoor dataset.
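The difference between combining the SSIM components by multiplication and by addition can be sketched as follows. This is a minimal illustration, not the paper's formula: the components are the standard luminance, contrast, and structure terms computed over a whole patch rather than a sliding window, and the equal weights in the hypothetical `ssim_additive` are an assumption, since the abstract does not state the weights or hyperparameters the authors settle on.

```python
import numpy as np

def ssim_components(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Standard luminance, contrast, and structure terms over a whole patch."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2   # stabilizer for the luminance term
    c2 = (k2 * data_range) ** 2   # stabilizer for the contrast term
    c3 = c2 / 2.0                 # conventional choice for the structure term
    mu_x, mu_y = x.mean(), y.mean()
    sd_x, sd_y = x.std(), y.std()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * sd_x * sd_y + c2) / (sd_x ** 2 + sd_y ** 2 + c2)
    struct = (cov + c3) / (sd_x * sd_y + c3)
    return lum, con, struct

def ssim_multiplicative(x, y):
    """Conventional SSIM: the product of the three components."""
    lum, con, struct = ssim_components(x, y)
    return lum * con * struct

def ssim_additive(x, y, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical additive variant: a weighted sum of the components.

    The equal weights are an assumption made for illustration only.
    """
    lum, con, struct = ssim_components(x, y)
    w_l, w_c, w_s = weights
    return w_l * lum + w_c * con + w_s * struct

rng = np.random.default_rng(0)
patch = rng.random((8, 8))
print(ssim_multiplicative(patch, patch), ssim_additive(patch, patch))
print(ssim_multiplicative(patch, 1.0 - patch), ssim_additive(patch, 1.0 - patch))
```

In the product form, the gradient with respect to one component is scaled by the other two, so a small or negative term changes the gradient of all of them. In a sum, each component contributes its gradient independently. This is one plausible reading of why the abstract reports smoother gradients.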
Related papers
- Towards Generating Realistic Underwater Images [0.0]
We investigate the performance of image translation models for generating realistic underwater images using the VAROS dataset. For paired image translation, pix2pix achieves the best FID scores due to its paired supervision and PatchGAN discriminator. For unpaired methods, CycleGAN achieves a competitive FID score by leveraging cycle-consistency loss, whereas CUT, which replaces cycle-consistency with contrastive learning, attains higher SSIM.
arXiv Detail & Related papers (2025-05-20T12:44:19Z)
- SimCE: Simplifying Cross-Entropy Loss for Collaborative Filtering [47.81610130269399]
We propose a Sampled Softmax Cross-Entropy (SSM) that compares one positive sample with multiple negative samples, leading to better performance.
We also introduce a Simplified Sampled Softmax Cross-Entropy Loss (SimCE) which simplifies the SSM using its upper bound.
Our validation on 12 benchmark datasets, using both MF and LightGCN backbones, shows that SimCE significantly outperforms both BPR and SSM.
arXiv Detail & Related papers (2024-06-23T17:24:07Z)
- Bridging the Sim-to-Real Gap with Bayesian Inference [53.61496586090384]
We present SIM-FSVGD for learning robot dynamics from data.
We use low-fidelity physical priors to regularize the training of neural network models.
We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system.
arXiv Detail & Related papers (2024-03-25T11:29:32Z)
- Neural Posterior Estimation with Differentiable Simulators [58.720142291102135]
We present a new method to perform Neural Posterior Estimation (NPE) with a differentiable simulator.
We demonstrate how gradient information helps constrain the shape of the posterior and improves sample-efficiency.
arXiv Detail & Related papers (2022-07-12T16:08:04Z)
- DSSIM: a structural similarity index for floating-point data [68.8204255655161]
We propose an alternative to the popular SSIM that can be applied directly to the floating point data, which we refer to as the Data SSIM (DSSIM).
While we demonstrate the usefulness of the DSSIM in the context of evaluating differences due to lossy compression on large volumes of simulation data, the DSSIM may prove useful for many other applications involving simulation or image data.
arXiv Detail & Related papers (2022-02-05T19:18:33Z)
- A Hitchhiker's Guide to Structural Similarity [40.567747702628076]
The Structural Similarity (SSIM) Index is a very widely used image/video quality model.
We studied and compared the functions and performances of popular and widely used implementations of SSIM.
We have arrived at a collection of recommendations on how to use SSIM most effectively.
arXiv Detail & Related papers (2021-01-16T02:51:06Z)
- Sinkhorn Natural Gradient for Generative Models [125.89871274202439]
We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
arXiv Detail & Related papers (2020-11-09T02:51:17Z)
- Revisiting Robust Model Fitting Using Truncated Loss [19.137291311347788]
New algorithms are applied to various 2D/3D registration problems.
They outperform RANSAC and approximate MC methods at high outlier ratios.
New algorithms also compare favorably with state-of-the-art registration methods, especially in high noise and outliers.
arXiv Detail & Related papers (2020-08-04T14:10:41Z)
- ML-SIM: A deep neural network for reconstruction of structured illumination microscopy images [0.0]
Structured illumination microscopy (SIM) has become an important technique for optical super-resolution imaging.
Here we propose a versatile reconstruction method, ML-SIM, which makes use of machine learning.
ML-SIM is thus robust to noise and irregularities in the illumination patterns of the raw SIM input frames.
arXiv Detail & Related papers (2020-03-24T18:42:23Z)
- Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation [90.28365183660438]
This paper proposes an augmented parallel-pyramid net with attention partial module and differentiable auto-data augmentation.
We define a new pose search space where the sequences of data augmentations are formulated as a trainable and operational CNN component.
Notably, our method achieves the top-1 accuracy on the challenging COCO keypoint benchmark and the state-of-the-art results on the MPII datasets.
arXiv Detail & Related papers (2020-03-17T03:52:17Z)