Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
- URL: http://arxiv.org/abs/2406.09728v2
- Date: Mon, 04 Nov 2024 02:34:59 GMT
- Title: Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
- Authors: Seungwoo Yoo, Juil Koo, Kyeongmin Yeo, Minhyuk Sung,
- Abstract summary: We propose a novel method for learning representations of poses for 3D deformable objects.
It specializes in 1) disentangling pose information from the object's identity, 2) facilitating the learning of pose variations, and 3) transferring pose information to other object identities.
Based on these properties, our method enables the generation of 3D deformable objects with diversity in both identities and poses.
- Score: 11.614034196935899
- License:
- Abstract: We propose a novel method for learning representations of poses for 3D deformable objects, which specializes in 1) disentangling pose information from the object's identity, 2) facilitating the learning of pose variations, and 3) transferring pose information to other object identities. Based on these properties, our method enables the generation of 3D deformable objects with diversity in both identities and poses, using variations of a single object. It does not require explicit shape parameterization such as skeletons or joints, point-level or shape-level correspondence supervision, or variations of the target object for pose transfer. To achieve pose disentanglement, compactness for generative models, and transferability, we first design the pose extractor to represent the pose as a keypoint-based hybrid representation and the pose applier to learn an implicit deformation field. To better distill pose information from the object's geometry, we propose the implicit pose applier to output an intrinsic mesh property, the face Jacobian. Once the extracted pose information is transferred to the target object, the pose applier is fine-tuned in a self-supervised manner to better describe the target object's shapes with pose variations. The extracted poses are also used to train a cascaded diffusion model to enable the generation of novel poses. Our experiments with the DeformThings4D and Human datasets demonstrate state-of-the-art performance in pose transfer and the ability to generate diverse deformed shapes with various objects and poses.
Related papers
- Extreme Two-View Geometry From Object Poses with Diffusion Models [21.16779160086591]
We harness the power of object priors to accurately determine two-view geometry in the face of extreme viewpoint changes.
In experiments, our method has demonstrated extraordinary robustness and resilience to large viewpoint changes.
arXiv Detail & Related papers (2024-02-05T08:18:47Z) - Understanding Pose and Appearance Disentanglement in 3D Human Pose
Estimation [72.50214227616728]
Several methods have proposed to learn image representations in a self-supervised fashion so as to disentangle the appearance information from the pose one.
We study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments.
We design an adversarial strategy focusing on generating natural appearance changes of the subject, and against which we could expect a disentangled network to be robust.
arXiv Detail & Related papers (2023-09-20T22:22:21Z) - ShapeShift: Superquadric-based Object Pose Estimation for Robotic
Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z) - Few-View Object Reconstruction with Unknown Categories and Camera Poses [80.0820650171476]
This work explores reconstructing general real-world objects from a few images without known camera poses or object categories.
The crux of our work is solving two fundamental 3D vision problems -- shape reconstruction and pose estimation.
Our method FORGE predicts 3D features from each view and leverages them in conjunction with the input images to establish cross-view correspondence.
arXiv Detail & Related papers (2022-12-08T18:59:02Z) - Single-view 3D Body and Cloth Reconstruction under Complex Poses [37.86174829271747]
We extend existing implicit function-based models to deal with images of humans with arbitrary poses and self-occluded limbs.
We learn an implicit function that maps the input image to a 3D body shape with a low level of detail.
We then learn a displacement map, conditioned on the smoothed surface, which encodes the high-frequency details of the clothes and body.
arXiv Detail & Related papers (2022-05-09T07:34:06Z) - Neural Human Deformation Transfer [26.60034186410921]
We consider the problem of human deformation transfer, where the goal is to retarget poses between different characters.
We take a different approach and transform the identity of a character into a new identity without modifying the character's pose.
We show experimentally that our method outperforms state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-09-03T15:51:30Z) - imGHUM: Implicit Generative Models of 3D Human Shape and Articulated
Pose [42.4185273307021]
We present imGHUM, the first holistic generative model of 3D human shape and articulated pose.
We model the full human body implicitly as a function zero-level-set and without the use of an explicit template mesh.
arXiv Detail & Related papers (2021-08-24T17:08:28Z) - Sparse Pose Trajectory Completion [87.31270669154452]
We propose a method to learn, even using a dataset where objects appear only in sparsely sampled views.
This is achieved with a cross-modal pose trajectory transfer mechanism.
Our method is evaluated on the Pix3D and ShapeNet datasets.
arXiv Detail & Related papers (2021-05-01T00:07:21Z) - Unsupervised 3D Human Pose Representation with Viewpoint and Pose
Disentanglement [63.853412753242615]
Learning a good 3D human pose representation is important for human pose related tasks.
We propose a novel Siamese denoising autoencoder to learn a 3D pose representation.
Our approach achieves state-of-the-art performance on two inherently different tasks.
arXiv Detail & Related papers (2020-07-14T14:25:22Z) - Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.