DPHMs: Diffusion Parametric Head Models for Depth-based Tracking
- URL: http://arxiv.org/abs/2312.01068v2
- Date: Mon, 8 Apr 2024 14:33:12 GMT
- Title: DPHMs: Diffusion Parametric Head Models for Depth-based Tracking
- Authors: Jiapeng Tang, Angela Dai, Yinyu Nie, Lev Markhasin, Justus Thies, Matthias Niessner
- Abstract summary: We introduce Diffusion Parametric Head Models (DPHMs)
DPHMs are a generative model that enables robust volumetric head reconstruction and tracking from monocular depth sequences.
We propose a latent diffusion-based prior to regularize volumetric head reconstruction and tracking.
- Score: 42.016598097736626
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce Diffusion Parametric Head Models (DPHMs), a generative model that enables robust volumetric head reconstruction and tracking from monocular depth sequences. While recent volumetric head models, such as NPHMs, can now excel in representing high-fidelity head geometries, tracking and reconstructing heads from real-world single-view depth sequences remains very challenging, as the fitting to partial and noisy observations is underconstrained. To tackle these challenges, we propose a latent diffusion-based prior to regularize volumetric head reconstruction and tracking. This prior-based regularizer effectively constrains the identity and expression codes to lie on the underlying latent manifold which represents plausible head shapes. To evaluate the effectiveness of the diffusion-based prior, we collect a dataset of monocular Kinect sequences consisting of various complex facial expression motions and rapid transitions. We compare our method to state-of-the-art tracking methods and demonstrate improved head identity reconstruction as well as robust expression tracking.
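The core idea of the abstract, fitting identity/expression latent codes to noisy observations while a diffusion prior keeps them on the learned manifold of plausible heads, can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the trained latent diffusion denoiser is replaced by a toy projection onto a unit-sphere "manifold", the NPHM-style decoder by an identity map, and all names (`toy_denoiser`, `diffusion_prior_grad`, `fit_identity_code`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(z_noisy):
    """Stand-in for a trained latent diffusion denoiser: here the
    'plausible' latent manifold is assumed to be the unit sphere,
    so denoising simply projects toward it."""
    return z_noisy / max(np.linalg.norm(z_noisy), 1e-8)

def diffusion_prior_grad(z, sigma=0.1, n_samples=8):
    """Approximate prior gradient: perturb the code, denoise each
    sample, and step toward the average denoised result. This pulls
    z toward the manifold of plausible codes."""
    steps = []
    for _ in range(n_samples):
        z_noisy = z + sigma * rng.standard_normal(z.shape)
        steps.append(toy_denoiser(z_noisy) - z)
    return np.mean(steps, axis=0)

def fit_identity_code(observed, decode, steps=200, lr=0.05, lam=0.5):
    """Fit a latent code to (partial, noisy) observations, balancing a
    data term against the diffusion-prior regularizer."""
    z = rng.standard_normal(observed.shape) * 0.1
    for _ in range(steps):
        # Gradient of the data term 0.5 * ||observed - decode(z)||^2
        # (valid here because decode is the identity map).
        data_grad = -(observed - decode(z))
        z = z - lr * data_grad + lr * lam * diffusion_prior_grad(z)
    return z

decode = lambda z: z                      # identity decoder, for illustration only
target = np.array([0.8, 0.6, 0.0])        # a "plausible" code on the unit sphere
z_fit = fit_identity_code(target, decode)
print(np.round(z_fit, 2))
```

Because the target already lies on the toy manifold, the data term and the prior agree and the fitted code converges near the target; with a genuinely noisy or partial observation, the prior term is what keeps the code from drifting off the manifold.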
Related papers
- Stratified Avatar Generation from Sparse Observations [10.291918304187769]
Estimating 3D full-body avatars from AR/VR devices is essential for creating immersive experiences.
In this paper, we are inspired by the inherent property of the kinematic tree defined in the Skinned Multi-Person Linear (SMPL) model.
We propose a stratified approach to decouple the conventional full-body avatar reconstruction pipeline into two stages.
arXiv Detail & Related papers (2024-05-30T06:25:42Z)
- HeadRecon: High-Fidelity 3D Head Reconstruction from Monocular Video [37.53752896927615]
We study the reconstruction of high-fidelity 3D head models from arbitrary monocular videos.
We propose a prior-guided dynamic implicit neural network to tackle these problems.
arXiv Detail & Related papers (2023-12-14T12:38:56Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [54.079327030892244]
Free-HeadGAN is a person-generic neural talking head synthesis system.
We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance.
arXiv Detail & Related papers (2022-08-03T16:46:08Z)
- MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [72.05649682685197]
State-of-the-art neural implicit methods allow for high-quality reconstructions of simple scenes from many input views, but their performance drops significantly for larger, more complex scenes and for scenes captured from sparse viewpoints.
This is caused primarily by the inherent ambiguity in the RGB reconstruction loss, which does not provide enough constraints.
Motivated by recent advances in the area of monocular geometry prediction, we explore the utility these cues provide for improving neural implicit surface reconstruction.
arXiv Detail & Related papers (2022-06-01T17:58:15Z)
- Diffusion Causal Models for Counterfactual Estimation [18.438307666925425]
We consider the task of counterfactual estimation from observational imaging data given a known causal structure.
We propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models.
We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data.
arXiv Detail & Related papers (2022-02-21T12:23:01Z)
- Unsupervised Alternating Optimization for Blind Hyperspectral Imagery Super-resolution [40.350308926790255]
This paper proposes an unsupervised blind hyperspectral image (HSI) super-resolution (SR) method to handle the blind HSI fusion problem.
We first propose an alternating optimization based deep framework to estimate the degeneration models and reconstruct the latent image.
Then, a meta-learning based mechanism is proposed to pre-train the network, which effectively improves both convergence speed and generalization ability.
arXiv Detail & Related papers (2020-12-03T07:52:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.