Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking
- URL: http://arxiv.org/abs/2401.06614v2
- Date: Sat, 13 Apr 2024 09:23:21 GMT
- Title: Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking
- Authors: Wei Cao, Chang Luo, Biao Zhang, Matthias Nießner, Jiapeng Tang
- Abstract summary: Motion2VecSets is a 4D diffusion model for dynamic surface reconstruction from point cloud sequences.
We parameterize 4D dynamics with latent sets instead of using global latent codes.
For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames.
- Score: 52.393359791978035
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce Motion2VecSets, a 4D diffusion model for dynamic surface reconstruction from point cloud sequences. While existing state-of-the-art methods have demonstrated success in reconstructing non-rigid objects using neural field representations, conventional feed-forward networks encounter challenges with ambiguous observations from noisy, partial, or sparse point clouds. To address these challenges, we introduce a diffusion model that explicitly learns the shape and motion distribution of non-rigid objects through an iterative denoising process of compressed latent representations. The diffusion-based priors enable more plausible and probabilistic reconstructions when handling ambiguous inputs. We parameterize 4D dynamics with latent sets instead of using global latent codes. This novel 4D representation allows us to learn local shape and deformation patterns, leading to more accurate non-linear motion capture and significantly improving generalizability to unseen motions and identities. For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames. To avoid computational overhead, we designed an interleaved space and time attention block to alternately aggregate deformation latents along spatial and temporal domains. Extensive comparisons against state-of-the-art methods demonstrate the superiority of our Motion2VecSets in 4D reconstruction from various imperfect observations. More detailed information can be found at https://vveicao.github.io/projects/Motion2VecSets/.
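The interleaved space and time attention mentioned in the abstract can be pictured with a short sketch. The PyTorch snippet below is a minimal, illustrative approximation under stated assumptions (module names, latent-set size, channel width, and the exact wiring are my own choices, not the authors' released code): latents belonging to the same frame first attend to each other along the set dimension, then each latent attends to its counterparts across frames along the time dimension, avoiding the cost of full spatio-temporal attention over all frames and latents at once.

```python
# Illustrative sketch of an interleaved space/time attention block over
# per-frame deformation latent sets, as described in the abstract.
# All names and sizes are assumptions for demonstration purposes.
import torch
import torch.nn as nn


class InterleavedSpaceTimeBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, T, S, C) = (batch, frames, latent-set size, channels)
        B, T, S, C = z.shape

        # Spatial step: latents of the same frame attend to each other.
        x = z.reshape(B * T, S, C)
        h = self.norm1(x)
        x = x + self.spatial_attn(h, h, h)[0]
        z = x.reshape(B, T, S, C)

        # Temporal step: the i-th latent of every frame attends across time,
        # exchanging information between frames without (T*S)^2 attention.
        x = z.permute(0, 2, 1, 3).reshape(B * S, T, C)
        h = self.norm2(x)
        x = x + self.temporal_attn(h, h, h)[0]
        return x.reshape(B, S, T, C).permute(0, 2, 1, 3)


if __name__ == "__main__":
    block = InterleavedSpaceTimeBlock(dim=256, heads=8)
    latents = torch.randn(2, 8, 512, 256)  # 2 sequences, 8 frames, 512 latents
    print(block(latents).shape)  # torch.Size([2, 8, 512, 256])
```

In this sketch the latent sets keep the same shape after the block, so several such blocks could be stacked inside a denoiser to realize the kind of synchronized, cross-frame denoising the abstract describes; this is only one plausible reading, not the paper's exact architecture.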
Related papers
- Temporal Residual Jacobians For Rig-free Motion Transfer [45.640576754352104]
We introduce Temporal Residual Jacobians as a novel representation to enable data-driven motion transfer.
Our approach does not assume access to any rigging or intermediate shapes, produces geometrically and temporally consistent motions, and can be used to transfer long motion sequences.
arXiv Detail & Related papers (2024-07-20T18:29:22Z)
- 4Diffusion: Multi-view Video Diffusion Model for 4D Generation [55.82208863521353]
Current 4D generation methods have achieved noteworthy efficacy with the aid of advanced diffusion generative models.
We propose a novel 4D generation pipeline, namely 4Diffusion, aimed at generating spatial-temporally consistent 4D content from a monocular video.
arXiv Detail & Related papers (2024-05-31T08:18:39Z)
- Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models [116.31344506738816]
We present a novel framework, Diffusion4D, for efficient and scalable 4D content generation.
We develop a 4D-aware video diffusion model capable of synthesizing orbital views of dynamic 3D assets.
Our method surpasses prior state-of-the-art techniques in terms of generation efficiency and 4D geometry consistency.
arXiv Detail & Related papers (2024-05-26T17:47:34Z)
- RoHM: Robust Human Motion Reconstruction via Diffusion [58.63706638272891]
RoHM is an approach for robust 3D human motion reconstruction from monocular RGB(-D) videos.
Conditioned on noisy and occluded input data, it reconstructs complete, plausible motions in consistent global coordinates.
Our method outperforms state-of-the-art approaches qualitatively and quantitatively, while being faster at test time.
arXiv Detail & Related papers (2024-01-16T18:57:50Z)
- LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling [69.56581851211841]
We propose a novel Local 4D implicit Representation for Dynamic clothed humans, named LoRD.
Our key insight is to encourage the network to learn the latent codes of local part-level representation.
LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods on practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
- Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model [76.64071133839862]
Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications.
Our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can operate on monocular RGB videos directly by using differentiable volume rendering.
Results on our new dataset, which will be made publicly available, demonstrate a clear improvement over the state of the art in terms of surface reconstruction accuracy and robustness to large deformations.
arXiv Detail & Related papers (2022-06-16T17:59:54Z)
- 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [7.637832293935966]
We introduce 4DComplete, a novel data-driven approach that estimates the non-rigid motion for the unobserved geometry.
For network training, we constructed a large-scale synthetic dataset called DeformingThings4D.
arXiv Detail & Related papers (2021-05-05T07:39:12Z)
- Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction [43.60322886598972]
This paper focuses on the task of 4D shape reconstruction from a sequence of point clouds.
We present a novel pipeline that learns the temporal evolution of 3D human shape by capturing continuous transformation functions among cross-frame occupancy fields.
arXiv Detail & Related papers (2021-03-30T13:36:03Z)