MoDA: Modeling Deformable 3D Objects from Casual Videos
- URL: http://arxiv.org/abs/2304.08279v3
- Date: Wed, 19 Jun 2024 15:24:30 GMT
- Title: MoDA: Modeling Deformable 3D Objects from Casual Videos
- Authors: Chaoyue Song, Jiacheng Wei, Tianyi Chen, Yiwen Chen, Chuan Sheng Foo, Fayao Liu, Guosheng Lin,
- Abstract summary: We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
In the endeavor to register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encodes 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
- Score: 84.29654142118018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of neural radiance fields (NeRF), many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation. However, the linearly weighted combination of rigid transformation matrices is not guaranteed to be rigid. As a matter of fact, unexpected scale and shear factors often appear. In practice, using LBS as the deformation model can always lead to skin-collapsing artifacts for bending or twisting motions. To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation, which can perform rigid transformation without skin-collapsing artifacts. In the endeavor to register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encodes 3D points within the canonical space, and 2D image features by solving an optimal transport problem. Besides, we introduce a texture filtering approach for texture rendering that effectively minimizes the impact of noisy colors outside target deformable objects. Extensive experiments on real and synthetic datasets show that our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods. Project page: \url{https://chaoyuesong.github.io/MoDA}.
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z) - DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation [66.7719069053058]
DeformGS is an approach to recover scene flow in highly deformable scenes using simultaneous video captures of a dynamic scene from multiple cameras.
DeformGS improves 3D tracking by an average of 55.8% compared to the state-of-the-art.
With sufficient texture, DeformGS achieves a median tracking error of 3.3 mm on a cloth of 1.5 x 1.5 m in area.
arXiv Detail & Related papers (2023-11-30T18:53:03Z) - Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z) - Disentangled3D: Learning a 3D Generative Model with Disentangled
Geometry and Appearance from Monocular Images [94.49117671450531]
State-of-the-art 3D generative models are GANs which use neural 3D volumetric representations for synthesis.
In this paper, we design a 3D GAN which can learn a disentangled model of objects, just from monocular observations.
arXiv Detail & Related papers (2022-03-29T22:03:18Z) - Animatable Implicit Neural Representations for Creating Realistic
Avatars from Videos [63.16888987770885]
This paper addresses the challenge of reconstructing an animatable human model from a multi-view video.
We introduce a pose-driven deformation field based on the linear blend skinning algorithm.
We show that our approach significantly outperforms recent human modeling methods.
arXiv Detail & Related papers (2022-03-15T17:56:59Z) - BANMo: Building Animatable 3D Neural Models from Many Casual Videos [135.64291166057373]
We present BANMo, a method that requires neither a specialized sensor nor a pre-defined template shape.
Banmo builds high-fidelity, articulated 3D models from many monocular casual videos in a differentiable rendering framework.
On real and synthetic datasets, BANMo shows higher-fidelity 3D reconstructions than prior works for humans and animals.
arXiv Detail & Related papers (2021-12-23T18:30:31Z) - Convolutional Generation of Textured 3D Meshes [34.20939983046376]
We propose a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images.
A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.
We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text.
arXiv Detail & Related papers (2020-06-13T15:23:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.