LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
- URL: http://arxiv.org/abs/2208.08622v1
- Date: Thu, 18 Aug 2022 03:49:44 GMT
- Title: LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
- Authors: Boyan Jiang, Xinlin Ren, Mingsong Dou, Xiangyang Xue, Yanwei Fu, Yinda Zhang
- Abstract summary: We propose a novel Local 4D implicit Representation for Dynamic clothed human, named LoRD.
Our key insight is to encourage the network to learn the latent codes of local part-level representation.
LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods on practical applications.
- Score: 69.56581851211841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in 4D implicit representation focuses on globally controlling
the shape and motion with low dimensional latent vectors, which is prone to
missing surface details and accumulating tracking error. While many deep local
representations have shown promising results for 3D shape modeling, their 4D
counterpart does not exist yet. In this paper, we fill this gap by proposing
a novel Local 4D implicit Representation for Dynamic clothed human, named LoRD,
which has the merits of both 4D human modeling and local representation, and
enables high-fidelity reconstruction with detailed surface deformations, such
as clothing wrinkles. Particularly, our key insight is to encourage the network
to learn the latent codes of local part-level representation, capable of
explaining the local geometry and temporal deformations. To perform inference at
test time, we first estimate the inner body skeleton motion to track local
parts at each time step, and then optimize the latent codes for each part via
auto-decoding based on different types of observed data. Extensive experiments
demonstrate that the proposed method has a strong capability for representing 4D
humans and outperforms state-of-the-art methods, both qualitatively and
quantitatively, on practical applications including 4D reconstruction from
sparse points and non-rigid depth fusion.
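To make the test-time procedure above concrete, the following is a minimal sketch of the auto-decoding step: a frozen local implicit decoder is conditioned on one latent code per tracked part, and those codes are optimized to fit the observed data expressed in each part's local frame. Everything here (PartDecoder, optimize_part_codes, the SDF-style supervision, and all hyper-parameters) is an illustrative assumption, not the authors' released implementation.

```python
# Hedged sketch of per-part auto-decoding at test time (assumed PyTorch code;
# names and hyper-parameters are illustrative, not the paper's implementation).
import torch
import torch.nn as nn


class PartDecoder(nn.Module):
    """Toy local implicit decoder: (latent code, local xyz, time) -> signed distance."""

    def __init__(self, code_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(code_dim + 3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, code, xyz, t):
        # code: (N, code_dim), xyz: (N, 3), t: (N, 1)
        return self.mlp(torch.cat([code, xyz, t], dim=-1))


def optimize_part_codes(decoder, observations, code_dim=128, steps=200, lr=1e-2):
    """Auto-decoding: freeze the decoder, optimize one latent code per tracked part.

    observations: list of (xyz, t, sdf) tuples, one per local part, with points
    expressed in that part's local frame (obtained from the tracked skeleton motion).
    """
    decoder.eval()
    for p in decoder.parameters():
        p.requires_grad_(False)

    codes = [torch.zeros(code_dim, requires_grad=True) for _ in observations]
    opt = torch.optim.Adam(codes, lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for code, (xyz, t, sdf) in zip(codes, observations):
            pred = decoder(code.expand(xyz.shape[0], -1), xyz, t)
            loss = loss + nn.functional.l1_loss(pred, sdf)
            loss = loss + 1e-4 * code.pow(2).sum()  # prior keeping codes near the origin
        loss.backward()
        opt.step()
    return [c.detach() for c in codes]


if __name__ == "__main__":
    decoder = PartDecoder()
    # One synthetic part observation: 512 local points with timestamps and SDF values.
    obs = [(torch.randn(512, 3), torch.rand(512, 1), torch.zeros(512, 1))]
    part_codes = optimize_part_codes(decoder, obs)
    print(part_codes[0].shape)  # torch.Size([128])
```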
Related papers
- Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting [8.078460597825142]
Reconstructing dynamic 3D scenes from 2D images and generating diverse views over time is challenging due to scene complexity and temporal dynamics.
We propose to approximate the underlying spatio-temporal rendering volume of a dynamic scene by optimizing a collection of 4D primitives, with explicit geometry and appearance modeling.
Our model is conceptually simple, consisting of a 4D Gaussian parameterized by anisotropic ellipses that can rotate arbitrarily in space and time, as well as view-dependent and time-evolved appearance represented by the coefficients of 4D spherindrical harmonics.
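As a rough illustration of the primitive described in that summary, the sketch below stores the parameters one such 4D Gaussian might carry: a 4D mean, per-axis scales, a rotation of the 4D covariance expressed as a pair of unit quaternions (one common factorization of 4D rotations), opacity, and a coefficient table for the time-evolving, view-dependent appearance. Field names and sizes are assumptions for illustration, not the cited paper's actual data layout.

```python
# Illustrative container for one 4D Gaussian primitive (assumed layout; not the
# cited paper's implementation).
from dataclasses import dataclass, field
import numpy as np


def rotation_4d(ql: np.ndarray, qr: np.ndarray) -> np.ndarray:
    """Build a 4x4 rotation from two unit quaternions (left/right isoclinic factors)."""
    a, b, c, d = ql / np.linalg.norm(ql)
    p, q, r, s = qr / np.linalg.norm(qr)
    L = np.array([[a, -b, -c, -d],
                  [b,  a, -d,  c],
                  [c,  d,  a, -b],
                  [d, -c,  b,  a]])
    R = np.array([[p, -q, -r, -s],
                  [q,  p,  s, -r],
                  [r, -s,  p,  q],
                  [s,  r, -q,  p]])
    return L @ R


@dataclass
class Gaussian4D:
    mean: np.ndarray = field(default_factory=lambda: np.zeros(4))       # (x, y, z, t) center
    scale: np.ndarray = field(default_factory=lambda: np.ones(4))       # per-axis extent
    rot_left: np.ndarray = field(default_factory=lambda: np.array([1.0, 0, 0, 0]))
    rot_right: np.ndarray = field(default_factory=lambda: np.array([1.0, 0, 0, 0]))
    opacity: float = 1.0
    # Appearance coefficients, one small table per color channel; the time/view
    # basis they multiply (the paper's 4D spherindrical harmonics) is abstracted away.
    sh_coeffs: np.ndarray = field(default_factory=lambda: np.zeros((3, 16)))

    def covariance(self) -> np.ndarray:
        """Assemble a 4x4 covariance from the scales and the quaternion pair."""
        R = rotation_4d(self.rot_left, self.rot_right)
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T


if __name__ == "__main__":
    g = Gaussian4D()
    print(g.covariance().shape)  # (4, 4)
```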
arXiv Detail & Related papers (2023-10-16T17:57:43Z)
- Neural Poisson: Indicator Functions for Neural Fields [25.41908065938424]
Implicit neural fields generating signed distance field (SDF) representations of 3D shapes have shown remarkable progress.
We introduce a new paradigm for neural field representations of 3D scenes.
We show that our approach demonstrates state-of-the-art reconstruction performance on both synthetic and real scanned 3D scene data.
arXiv Detail & Related papers (2022-11-25T17:28:22Z)
- Deep Generative Models on 3D Representations: A Survey [81.73385191402419]
Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers have started to shift focus from 2D to 3D space.
However, representing 3D data poses significantly greater challenges.
arXiv Detail & Related papers (2022-10-27T17:59:50Z)
- Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model [76.64071133839862]
Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications.
Our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can operate on monocular RGB videos directly by using differentiable volume rendering.
Results on our new dataset, which will be made publicly available, demonstrate a clear improvement over the state of the art in terms of surface reconstruction accuracy and robustness to large deformations.
arXiv Detail & Related papers (2022-06-16T17:59:54Z)
- H4D: Human 4D Modeling by Learning Neural Compositional Representation [75.34798886466311]
This work presents a novel framework that can effectively learn a compact and compositional representation for dynamic humans.
A simple yet effective linear motion model is proposed to provide a rough and regularized motion estimation.
Experiments demonstrate that our method is not only effective in recovering dynamic humans with accurate motion and detailed geometry, but also amenable to various 4D human-related tasks.
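The "linear motion model" mentioned above can be read as a low-dimensional linear basis over per-frame pose or vertex displacements; the tiny sketch below shows that generic idea (PCA-style bases) purely as an assumed illustration, not H4D's actual formulation.

```python
# Generic linear motion model sketch (assumed PCA-style bases; not H4D's code).
import numpy as np


def fit_linear_motion_model(sequences: np.ndarray, n_bases: int = 8):
    """Fit a mean motion and linear bases from training motion sequences.

    sequences: (num_sequences, T * D) flattened motions, e.g. T frames of a
    D-dimensional pose vector per sequence.
    """
    mean = sequences.mean(axis=0)
    centered = sequences - mean
    # Principal directions of motion variation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    bases = vt[:n_bases]                       # (n_bases, T * D)
    return mean, bases


def decode_motion(mean: np.ndarray, bases: np.ndarray, coeffs: np.ndarray) -> np.ndarray:
    """Reconstruct a motion as mean + coeffs @ bases (a rough, regularized estimate)."""
    return mean + coeffs @ bases


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(100, 30 * 72))    # 100 sequences, 30 frames, 72-dim pose
    mean, bases = fit_linear_motion_model(train)
    motion = decode_motion(mean, bases, rng.normal(size=8))
    print(motion.shape)                        # (2160,)
```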
arXiv Detail & Related papers (2022-03-02T17:10:49Z)
- 4D-Net for Learned Multi-Modal Alignment [87.58354992455891]
We present 4D-Net, a 3D object detection approach, which utilizes 3D Point Cloud and RGB sensing information, both in time.
We are able to incorporate the 4D information by performing a novel connection learning across various feature representations and levels of abstraction, as well as by observing geometric constraints.
arXiv Detail & Related papers (2021-09-02T16:35:00Z)
- 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [7.637832293935966]
We introduce 4DComplete, a novel data-driven approach that estimates the non-rigid motion for the unobserved geometry.
For network training, we constructed a large-scale synthetic dataset called DeformingThings4D.
arXiv Detail & Related papers (2021-05-05T07:39:12Z)
- Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction [43.60322886598972]
This paper focuses on the task of 4D shape reconstruction from a sequence of point clouds.
We present a novel pipeline to learn a temporal evolution of the 3D human shape through capturing continuous transformation functions among cross-frame occupancy fields.
arXiv Detail & Related papers (2021-03-30T13:36:03Z)
- Learning 3D Human Shape and Pose from Dense Body Parts [117.46290013548533]
We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from local streams are aggregated to enhance the robust prediction of the rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
arXiv Detail & Related papers (2019-12-31T15:09:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.