Exploiting Diffusion Prior for Generalizable Dense Prediction
- URL: http://arxiv.org/abs/2311.18832v2
- Date: Tue, 2 Apr 2024 17:59:33 GMT
- Title: Exploiting Diffusion Prior for Generalizable Dense Prediction
- Authors: Hsin-Ying Lee, Hung-Yu Tseng, Hsin-Ying Lee, Ming-Hsuan Yang
- Abstract summary: Recent advanced Text-to-Image (T2I) diffusion models generate content that is sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
- Score: 85.4563592053464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contents generated by recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate due to the immitigable domain gap. We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks. To address the misalignment between deterministic prediction tasks and stochastic T2I models, we reformulate the diffusion process through a sequence of interpolations, establishing a deterministic mapping between input RGB images and output prediction distributions. To preserve generalizability, we use low-rank adaptation to fine-tune pre-trained models. Extensive experiments across five tasks, including 3D property estimation, semantic segmentation, and intrinsic image decomposition, showcase the efficacy of the proposed method. Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
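The abstract's key idea, reformulating the diffusion process as a sequence of interpolations so that the input RGB image, rather than Gaussian noise, determines the trajectory, can be illustrated with a minimal NumPy sketch. This is a simplified illustration under assumed conventions (DDIM-style update, a hypothetical alpha schedule, array-valued images), not the authors' implementation:

```python
import numpy as np

def forward_interpolate(pred, rgb, alphas):
    # Forward "diffusion" as a deterministic interpolation sequence:
    # the RGB image takes the place of Gaussian noise, so each x_t is a
    # fixed blend of the target map (pred) and the input image (rgb).
    return [np.sqrt(a) * pred + np.sqrt(1.0 - a) * rgb for a in alphas]

def reverse_step(x_t, eps_hat, a_t, a_prev):
    # One deterministic (DDIM-style) reverse step. The network output
    # eps_hat plays the role of the "noise"; in this reformulation that
    # role is filled by the RGB component, so a perfect predictor
    # reproduces the input image here.
    x0_hat = (x_t - np.sqrt(1.0 - a_t) * eps_hat) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
```

Starting from the RGB-dominated end of the schedule and repeatedly applying `reverse_step` with an exact `eps_hat` recovers the prediction map, which is what makes the input-to-output mapping deterministic rather than stochastic.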
Related papers
- Discrete Modeling via Boundary Conditional Diffusion Processes [29.95155303262501]
Previous approaches have suffered from the discrepancy between discrete data and continuous modeling.
We propose a two-step forward process that first estimates the boundary as a prior distribution.
We then rescale the forward trajectory to construct a boundary conditional diffusion model.
arXiv Detail & Related papers (2024-10-29T09:42:42Z) - Empirical Bayesian image restoration by Langevin sampling with a denoising diffusion implicit prior [0.18434042562191813]
This paper presents a novel and highly computationally efficient image restoration method.
It embeds a DDPM denoiser within an empirical Bayesian Langevin algorithm.
It improves on state-of-the-art strategies both in image estimation accuracy and computing time.
arXiv Detail & Related papers (2024-09-06T16:20:24Z) - Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image Reconstruction [31.503662384666274]
In science and engineering, the goal is to infer an unknown image from a small number of measurements collected through a known forward model describing a certain imaging modality.
Motivated by their empirical success, score-based diffusion models have emerged as an impressive candidate for an image prior in image reconstruction.
arXiv Detail & Related papers (2024-03-25T15:58:26Z) - Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation [2.5556910002263984]
Score-based diffusion models (SBDM) have emerged as state-of-the-art approaches for image generation.
This paper develops SBDMs in the infinite-dimensional setting, that is, we model the training data as functions supported on a rectangular domain.
We demonstrate how to overcome two shortcomings of current SBDM approaches in the infinite-dimensional setting.
arXiv Detail & Related papers (2023-03-08T18:10:10Z) - Patch-level Gaze Distribution Prediction for Gaze Following [49.93340533068501]
We introduce the patch distribution prediction (PDP) method for gaze following training.
We show that our model regularizes the MSE loss by predicting better heatmap distributions on images with larger annotation variances.
Experiments show that our model bridges the gap between the target prediction and in/out prediction subtasks, yielding significant improvements on both subtasks on public gaze following datasets.
arXiv Detail & Related papers (2022-11-20T19:25:15Z) - A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (no matter in the prediction or observation) are treated as noise and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We investigate our findings on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z) - Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z) - PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
We propose the Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z) - Learning Accurate Dense Correspondences and When to Trust Them [161.76275845530964]
We aim to estimate a dense flow field relating two images, coupled with a robust pixel-wise confidence map.
We develop a flexible probabilistic approach that jointly learns the flow prediction and its uncertainty.
Our approach obtains state-of-the-art results on challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-01-05T18:54:11Z) - Calibrated Adversarial Refinement for Stochastic Semantic Segmentation [5.849736173068868]
We present a strategy for learning a calibrated predictive distribution over semantic maps, where the probability associated with each prediction reflects its ground truth correctness likelihood.
We demonstrate the versatility and robustness of the approach by achieving state-of-the-art results on the multigrader LIDC dataset and on a modified Cityscapes dataset with injected ambiguities.
We show that the core design can be adapted to other tasks requiring learning a calibrated predictive distribution by experimenting on a toy regression dataset.
arXiv Detail & Related papers (2020-06-23T16:39:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.