CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image
- URL: http://arxiv.org/abs/2301.02232v2
- Date: Wed, 22 Mar 2023 21:53:02 GMT
- Title: CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image
- Authors: Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay
- Abstract summary: We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model.
Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image.
- Score: 41.70960551470232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a neural network approach to transfer the motion from a single
image of an articulated object to a rest-state (i.e., unarticulated) 3D model.
Our network learns to predict the object's pose, part segmentation, and
corresponding motion parameters to reproduce the articulation shown in the
input image. The network is composed of three distinct branches that take as
input a shared joint image-shape embedding, and it is trained end-to-end. Unlike previous
methods, our approach is independent of the topology of the object and can work
with objects from arbitrary categories. Our method, trained with only synthetic
data, can be used to automatically animate a mesh, infer motion from real
images, and transfer articulation to functionally similar but geometrically
distinct 3D models at test time.
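The abstract fixes only the high-level layout: a shared joint image-shape embedding feeding three branches for pose, part segmentation, and motion parameters. Below is a minimal PyTorch sketch of that layout; every module name, feature dimension, and output parameterization (ArticulationTransferNet, embed_dim, the 7-value pose, and so on) is an illustrative assumption, not the paper's reported architecture.

    import torch
    import torch.nn as nn

    class ArticulationTransferNet(nn.Module):
        """Three prediction heads over one shared image-shape embedding (sketch)."""

        def __init__(self, image_dim=2048, shape_dim=1024, embed_dim=512, num_parts=8):
            super().__init__()
            # Hypothetical encoders; the abstract does not specify backbones.
            self.image_encoder = nn.Linear(image_dim, embed_dim)
            self.shape_encoder = nn.Linear(shape_dim, embed_dim)
            self.fuse = nn.Linear(2 * embed_dim, embed_dim)
            # Three distinct branches, trained end-to-end.
            self.pose_head = nn.Linear(embed_dim, 7)                # rotation (quat) + translation
            self.segment_head = nn.Linear(embed_dim, num_parts)     # per-part logits (simplified)
            self.motion_head = nn.Linear(embed_dim, num_parts * 7)  # axis, pivot, amount per part

        def forward(self, image_feat, shape_feat):
            z = torch.relu(self.fuse(torch.cat(
                [self.image_encoder(image_feat), self.shape_encoder(shape_feat)], dim=-1)))
            return self.pose_head(z), self.segment_head(z), self.motion_head(z)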
Related papers
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
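The key ingredient named in the ROAM summary is an SE(3)-equivariant descriptor field around the object. The toy NumPy snippet below is not ROAM's learned implicit network; it only illustrates the property such a field provides, namely that canonicalizing queries into the object's local frame makes the descriptor invariant when object and query move rigidly together.

    import numpy as np

    def descriptor(x_local):
        # Stand-in hand-crafted feature; ROAM learns an implicit field instead.
        return np.concatenate([[np.linalg.norm(x_local)], np.tanh(x_local)])

    def query_field(x_world, R, t):
        # Express the world-space query in the object frame (pose R, t), then featurize.
        return descriptor(R.T @ (x_world - t))

    # Applying one rigid transform (R2, t2) to both the object pose and the
    # query leaves the descriptor unchanged.
    x = np.array([0.3, -0.1, 0.5])
    R, t = np.eye(3), np.zeros(3)
    R2 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])  # 90-degree yaw
    t2 = np.array([1.0, 2.0, 0.0])
    assert np.allclose(query_field(x, R, t),
                       query_field(R2 @ x + t2, R2 @ R, R2 @ t + t2))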
- SAOR: Single-View Articulated Object Reconstruction [17.2716639564414]
We introduce SAOR, a novel approach for estimating the 3D shape, texture, and viewpoint of an articulated object from a single image captured in the wild.
Unlike prior approaches that rely on pre-defined category-specific 3D templates or tailored 3D skeletons, SAOR learns to articulate shapes from single-view image collections with a skeleton-free part-based model without requiring any 3D object shape priors.
arXiv Detail & Related papers (2023-03-23T17:59:35Z)
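The SAOR summary above centers on a skeleton-free part-based model. A minimal sketch of that idea follows, assuming soft per-vertex part weights blending per-part rigid transforms; the shapes and names are illustrative, not SAOR's actual formulation.

    import torch

    def blend_parts(vertices, part_weights, rotations, translations):
        """Skeleton-free articulation: each vertex moves by a soft blend of
        per-part rigid transforms instead of a fixed template skeleton.

        vertices: (V, 3); part_weights: (V, P), rows sum to 1;
        rotations: (P, 3, 3); translations: (P, 3).
        """
        per_part = torch.einsum('pij,vj->vpi', rotations, vertices) + translations  # (V, P, 3)
        return (part_weights.unsqueeze(-1) * per_part).sum(dim=1)                   # (V, 3)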
- Template NeRF: Towards Modeling Dense Shape Correspondences from Category-Specific Object Images [4.662583832063716]
We present neural radiance fields (NeRF) with templates, dubbed template-NeRF, for modeling appearance and geometry.
We generate dense shape correspondences simultaneously among objects of the same category from only multi-view posed images.
The learned dense correspondences can be readily used for various image-based tasks such as keypoint detection, part segmentation, and texture transfer.
arXiv Detail & Related papers (2021-11-08T02:16:48Z)
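One plausible reading of the Template NeRF summary, sketched below: a single MLP maps a 3D point and a per-object latent code to density, color, and a coordinate in a shared template space, and points that map to the same template coordinate correspond across instances. All layer sizes and names are assumptions.

    import torch
    import torch.nn as nn

    class TemplateNeRF(nn.Module):
        def __init__(self, latent_dim=64, hidden=128):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU())
            self.density = nn.Linear(hidden, 1)
            self.color = nn.Linear(hidden, 3)
            # Points mapping to the same template coordinate correspond across objects.
            self.template_xyz = nn.Linear(hidden, 3)

        def forward(self, x, z):
            # x: (N, 3) sample points; z: (N, latent_dim) per-object code.
            h = self.trunk(torch.cat([x, z], dim=-1))
            return self.density(h), self.color(h), self.template_xyz(h)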
- Multi-Category Mesh Reconstruction From Image Collections [90.24365811344987]
We present an alternative approach that infers the textured mesh of objects by combining a series of deformable 3D models with a set of instance-specific deformation, pose, and texture parameters.
Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision.
Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
arXiv Detail & Related papers (2021-10-21T16:32:31Z)
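The Multi-Category Mesh Reconstruction summary above decomposes each object into a category-level deformable model plus an instance-specific deformation. A schematic PyTorch sketch of that decomposition, with all tensor shapes and module names chosen only for illustration:

    import torch
    import torch.nn as nn

    class MultiCategoryMesh(nn.Module):
        def __init__(self, num_categories=4, num_vertices=642, feat_dim=256):
            super().__init__()
            # One learnable mean shape per category (a soft pick keeps it differentiable).
            self.templates = nn.Parameter(0.01 * torch.randn(num_categories, num_vertices, 3))
            self.category_head = nn.Linear(feat_dim, num_categories)
            self.deform_head = nn.Linear(feat_dim, num_vertices * 3)

        def forward(self, image_feat):                                 # (B, feat_dim)
            probs = self.category_head(image_feat).softmax(dim=-1)     # (B, C)
            base = torch.einsum('bc,cvx->bvx', probs, self.templates)  # category template
            delta = self.deform_head(image_feat).view(image_feat.shape[0], -1, 3)
            return base + delta                                        # (B, V, 3) vertices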
- Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract the 3-D shape of the chair, animate its plausible articulations and motions, and render in-situ in its original image space?
We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
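The Canonical 3D Deformer Maps summary above combines canonical embeddings with parametric deformation. The sketch below is one plausible reading, assuming per-pixel features map to shared canonical coordinates that an instance code then deforms into posed 3D; it is not the paper's actual pipeline.

    import torch
    import torch.nn as nn

    class CanonicalDeformer(nn.Module):
        def __init__(self, pixel_feat_dim=32, latent_dim=64, hidden=128):
            super().__init__()
            # Canonical embedding: image features -> shared canonical coordinates.
            self.embed = nn.Sequential(
                nn.Linear(pixel_feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 3))
            # Parametric deformation: canonical point + instance code -> posed 3D point.
            self.deform = nn.Sequential(
                nn.Linear(3 + latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, 3))

        def forward(self, pixel_feats, z):             # (N, F) features, (latent_dim,) code
            canonical = self.embed(pixel_feats)        # (N, 3)
            z_rep = z.expand(canonical.shape[0], -1)   # share one code across all points
            posed = self.deform(torch.cat([canonical, z_rep], dim=-1))
            return canonical, posed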
- Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)
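The final entry above trains by comparing renderings of the predicted shape, texture, and camera back to the input image rather than using 3D supervision. A conceptual sketch of such a reconstruction loss follows, assuming a hypothetical differentiable renderer (render, not the paper's API) and omitting the paper's semantic-consistency term.

    def reconstruction_loss(image, mask, shape, texture, camera, render):
        # image: (3, H, W) tensor; mask: (H, W) foreground silhouette.
        rgb, silhouette = render(shape, texture, camera)  # differentiable renderer
        photometric = ((rgb - image) ** 2 * mask).mean()  # penalize only where the object is
        silhouette_term = ((silhouette - mask) ** 2).mean()
        return photometric + silhouette_term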