DiffSF: Diffusion Models for Scene Flow Estimation
- URL: http://arxiv.org/abs/2403.05327v3
- Date: Fri, 04 Oct 2024 13:37:08 GMT
- Title: DiffSF: Diffusion Models for Scene Flow Estimation
- Authors: Yushan Zhang, Bastian Wandt, Maria Magnusson, Michael Felsberg
- Abstract summary: We propose DiffSF that combines transformer-based scene flow estimation with denoising diffusion models.
We show that the diffusion process greatly increases the robustness of predictions compared to prior approaches.
By sampling multiple times with different initial states, the denoising process predicts multiple hypotheses, which enables measuring the output uncertainty.
- Score: 17.512660491303684
- License:
- Abstract: Scene flow estimation is an essential ingredient for a variety of real-world applications, especially for autonomous agents, such as self-driving cars and robots. While recent scene flow estimation approaches achieve a reasonable accuracy, their applicability to real-world systems additionally benefits from a reliability measure. Aiming at improving accuracy while additionally providing an estimate for uncertainty, we propose DiffSF that combines transformer-based scene flow estimation with denoising diffusion models. In the diffusion process, the ground truth scene flow vector field is gradually perturbed by adding Gaussian noise. In the reverse process, starting from randomly sampled Gaussian noise, the scene flow vector field prediction is recovered by conditioning on a source and a target point cloud. We show that the diffusion process greatly increases the robustness of predictions compared to prior approaches resulting in state-of-the-art performance on standard scene flow estimation benchmarks. Moreover, by sampling multiple times with different initial states, the denoising process predicts multiple hypotheses, which enables measuring the output uncertainty, allowing our approach to detect a majority of the inaccurate predictions. The code is available at https://github.com/ZhangYushan3/DiffSF.
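The abstract describes two mechanisms that a small sketch can make concrete: perturbing the ground-truth flow field with Gaussian noise, and measuring uncertainty from several hypotheses sampled with different initial states. The code below is only an illustration under stated assumptions, not the authors' implementation. The noising rule is the standard DDPM closed form, `toy_sampler` is a hypothetical stand-in for a trained denoiser conditioned on the source and target point clouds, and the per-point standard deviation is one plausible uncertainty measure, not necessarily the one used in the paper.

```python
import math
import random


def forward_diffuse(flow, alpha_bar, rng):
    """Perturb a flow field with Gaussian noise (standard DDPM closed form).

    flow: list of (dx, dy, dz) tuples, one per point.
    alpha_bar: fraction of signal kept, in (0, 1]; 1.0 means no noise.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * c + noise * rng.gauss(0.0, 1.0) for c in vec)
        for vec in flow
    ]


def per_point_uncertainty(hypotheses):
    """Mean per-axis standard deviation across hypotheses, for each point."""
    result = []
    for i in range(len(hypotheses[0])):
        stds = []
        for axis in range(3):
            vals = [h[i][axis] for h in hypotheses]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            stds.append(math.sqrt(var))
        result.append(sum(stds) / 3.0)
    return result


def toy_sampler(seed):
    """Hypothetical stand-in for one reverse-diffusion run.

    A real model would start from Gaussian noise and denoise it while
    conditioning on both point clouds. Here point 0 is "easy" (hypotheses
    agree) and point 1 is "hard" (hypotheses disagree).
    """
    rng = random.Random(seed)
    return [
        (1.0 + rng.gauss(0.0, 0.01), 0.0, 0.0),
        (0.0, 2.0 + rng.gauss(0.0, 0.5), 0.0),
    ]


# Different seeds play the role of different initial states.
hypotheses = [toy_sampler(seed) for seed in range(20)]
uncertainty = per_point_uncertainty(hypotheses)
```

In this toy setup the point whose hypotheses disagree receives the larger uncertainty value, which is the property that would let a threshold flag likely inaccurate predictions.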
Related papers
- Diffusion Priors for Variational Likelihood Estimation and Image Denoising [10.548018200066858]
We propose adaptive likelihood estimation and MAP inference during the reverse diffusion process to tackle real-world noise.
Experiments and analyses on diverse real-world datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-10-23T02:52:53Z)
- Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting [19.383395337330082]
We propose a generic channel-aware Contrastive Conditional Diffusion model entitled CCDM.
The proposed CCDM can exhibit superior forecasting capability compared to current state-of-the-art diffusion forecasters.
arXiv Detail & Related papers (2024-10-03T03:13:15Z)
- Modeling State Shifting via Local-Global Distillation for Event-Frame Gaze Tracking [61.44701715285463]
This paper tackles the problem of passive gaze estimation using both event and frame data.
We reformulate gaze estimation as the quantification of the state shifting from the current state to several prior registered anchor states.
To improve the generalization ability, instead of learning a large gaze estimation network directly, we align a group of local experts with a student network.
arXiv Detail & Related papers (2024-03-31T03:30:37Z)
- Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
- DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Diffusion Model [20.15214479105187]
We propose a novel uncertainty-aware scene flow estimation network (DifFlow3D) with the diffusion probabilistic model.
Our method achieves an unprecedented millimeter-level accuracy (0.0078m in EPE3D) on the KITTI dataset.
arXiv Detail & Related papers (2023-11-29T08:56:24Z)
- Direct Unsupervised Denoising [60.71146161035649]
Unsupervised denoisers do not directly produce a single prediction, such as the MMSE estimate.
We present an alternative approach that trains a deterministic network alongside the VAE to directly predict a central tendency.
arXiv Detail & Related papers (2023-10-27T13:02:12Z)
- Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation [29.806100463356906]
We analyze if fully data-driven fluid solvers that utilize an autoregressive rollout based on conditional diffusion models are a viable option.
We investigate accuracy, posterior sampling, spectral behavior, and temporal stability, while requiring that methods generalize to flow parameters beyond the training regime.
We find that even simple diffusion-based approaches can outperform multiple established flow prediction methods in terms of accuracy and temporal stability, while being on par with state-of-the-art stabilization techniques like unrolling at training time.
arXiv Detail & Related papers (2023-09-04T18:01:42Z)
- Pedestrian Trajectory Forecasting Using Deep Ensembles Under Sensing Uncertainty [125.41260574344933]
We consider an encoder-decoder based deep ensemble network for capturing both perception and predictive uncertainty simultaneously.
Overall, deep ensembles provided more robust predictions and the consideration of upstream uncertainty further increased the estimation accuracy for the model.
arXiv Detail & Related papers (2023-05-26T04:27:48Z)
- DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion [137.8749239614528]
We propose a new formulation of temporal action detection (TAD) with denoising diffusion, DiffTAD.
Taking as input random temporal proposals, it can yield action proposals accurately given an untrimmed long video.
arXiv Detail & Related papers (2023-03-27T00:40:52Z)
- Bayesian Sparse Regression for Mixed Multi-Responses with Application to Runtime Metrics Prediction in Fog Manufacturing [6.288767115532775]
Fog manufacturing can greatly enhance traditional manufacturing systems through distributed computation Fog units.
It is known that the predictive offloading methods highly depend on accurate prediction and uncertainty quantification of runtime performance metrics.
We propose a Bayesian sparse regression for multivariate mixed responses to enhance the prediction of runtime performance metrics.
arXiv Detail & Related papers (2022-10-10T16:14:08Z)
- Quantifying Uncertainty in Deep Spatiotemporal Forecasting [67.77102283276409]
We describe two types of forecasting problems: regular grid-based and graph-based.
We analyze UQ methods from both the Bayesian and the frequentist points of view, casting them in a unified framework via statistical decision theory.
Through extensive experiments on real-world road network traffic, epidemics, and air quality forecasting tasks, we reveal the statistical and computational trade-offs for different UQ methods.
arXiv Detail & Related papers (2021-05-25T14:35:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.