DeFormer: Integrating Transformers with Deformable Models for 3D Shape
Abstraction from a Single Image
- URL: http://arxiv.org/abs/2309.12594v2
- Date: Tue, 3 Oct 2023 21:31:01 GMT
- Title: DeFormer: Integrating Transformers with Deformable Models for 3D Shape
Abstraction from a Single Image
- Authors: Di Liu, Xiang Yu, Meng Ye, Qilong Zhangli, Zhuowei Li, Zhixing Zhang,
Dimitris N. Metaxas
- Abstract summary: We propose a novel bi-channel Transformer architecture, integrated with parameterized deformable models, to simultaneously estimate the global and local deformations of primitives.
DeFormer achieves better reconstruction accuracy than the state of the art and produces visualizations with consistent semantic correspondences for improved interpretability.
- Score: 31.154786931081087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate 3D shape abstraction from a single 2D image is a long-standing
problem in computer vision and graphics. By leveraging a set of primitives to
represent the target shape, recent methods have achieved promising results.
However, these methods either use a relatively large number of primitives or
lack geometric flexibility due to the limited expressibility of the primitives.
In this paper, we propose a novel bi-channel Transformer architecture,
integrated with parameterized deformable models, termed DeFormer, to
simultaneously estimate the global and local deformations of primitives. In
this way, DeFormer can abstract complex object shapes while using a small
number of primitives which offer a broader geometry coverage and finer details.
Then, we introduce a force-driven dynamic fitting and a cycle-consistent
re-projection loss to optimize the primitive parameters. Extensive experiments
on ShapeNet across various settings show that DeFormer achieves better
reconstruction accuracy than the state of the art and produces visualizations
with consistent semantic correspondences for improved interpretability.
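As a rough illustration of the bi-channel idea described above, the sketch below assumes superquadrics as the parameterized deformable primitives; every module name, feature dimension, and parameter split (`BiChannelHead`, `superquadric_points`, the 12-way global layout) is a hypothetical stand-in, not the authors' implementation. One decoder channel regresses global primitive parameters, the other regresses local per-vertex displacements.

```python
# Hedged sketch only: superquadric primitives plus a bi-channel Transformer
# decoder head. All names, sizes, and the parameter layout are assumptions.
import torch
import torch.nn as nn


def superquadric_points(scale, eps, eta, omega):
    """Sample surface points of superquadric primitives.

    scale: (B, K, 3) axis lengths, eps: (B, K, 2) shape exponents,
    eta/omega: (S,) latitude/longitude angles. Returns (B, K, S*S, 3).
    """
    def fexp(x, p):                                   # signed power, standard superquadric form
        return torch.sign(x) * torch.abs(x) ** p

    e1 = eps[..., 0:1].unsqueeze(-1)                  # (B, K, 1, 1)
    e2 = eps[..., 1:2].unsqueeze(-1)
    eta = eta.view(1, 1, -1, 1)                       # (1, 1, S, 1)
    omega = omega.view(1, 1, 1, -1)                   # (1, 1, 1, S)
    x = fexp(torch.cos(eta), e1) * fexp(torch.cos(omega), e2)
    y = fexp(torch.cos(eta), e1) * fexp(torch.sin(omega), e2)
    z = fexp(torch.sin(eta), e1) * torch.ones_like(omega)
    pts = torch.stack([x, y, z], dim=-1).flatten(2, 3)   # (B, K, S*S, 3)
    return pts * scale.unsqueeze(2)


class BiChannelHead(nn.Module):
    """Two Transformer decoder channels over shared image tokens: one for
    global primitive parameters, one for local per-vertex displacements."""

    def __init__(self, d_model=256, n_prim=8, n_pts=256):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_prim, d_model))
        mk = lambda: nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), num_layers=2)
        self.global_dec, self.local_dec = mk(), mk()
        # hypothetical 12-way layout: 3 scale + 2 shape exponents + 3 translation + 4 quaternion
        self.to_global = nn.Linear(d_model, 12)
        self.to_local = nn.Linear(d_model, n_pts * 3)
        self.n_pts = n_pts

    def forward(self, img_tokens):                    # img_tokens: (B, N, d) from an image encoder
        q = self.queries.unsqueeze(0).expand(img_tokens.shape[0], -1, -1)
        g = self.global_dec(q, img_tokens)            # global channel
        l = self.local_dec(q, img_tokens)             # local channel
        params = self.to_global(g)                    # (B, K, 12)
        offsets = self.to_local(l).view(*q.shape[:2], self.n_pts, 3)
        return params, offsets


# Hypothetical usage: 8 primitives, 16x16 surface samples each.
head = BiChannelHead(d_model=256, n_prim=8, n_pts=16 * 16)
tokens = torch.randn(2, 196, 256)                     # stand-in for encoder features
params, offsets = head(tokens)
scale = params[..., :3].sigmoid()                     # keep axis lengths positive and bounded
eps = 0.1 + 1.9 * params[..., 3:5].sigmoid()          # shape exponents in a stable range
eta = torch.linspace(-1.5, 1.5, 16)
omega = torch.linspace(-3.1, 3.1, 16)
base = superquadric_points(scale, eps, eta, omega)    # (2, 8, 256, 3) coarse global shape
deformed = base + 0.1 * torch.tanh(offsets)           # local refinement on top of the global fit
# Rigid pose, the force-driven fitting, and the re-projection loss are omitted here.
```

In the paper, the regressed parameters are further refined by the force-driven dynamic fitting and trained with the cycle-consistent re-projection loss; neither is reproduced in this toy sketch.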
Related papers
- DPF-Net: Combining Explicit Shape Priors in Deformable Primitive Field
for Unsupervised Structural Reconstruction of 3D Objects [12.713770164154461]
We present a novel unsupervised structural reconstruction method, named DPF-Net, based on a new Deformable Primitive Field representation.
The strong shape prior encoded in parameterized geometric primitives enables our DPF-Net to extract high-level structures and recover fine-grained shape details consistently.
arXiv Detail & Related papers (2023-08-25T07:50:59Z)
- DTF-Net: Category-Level Pose Estimation and Shape Reconstruction via Deformable Template Field [29.42222066097076]
Estimating 6D poses and reconstructing 3D shapes of objects in open-world scenes from RGB-depth image pairs is challenging.
We propose the DTF-Net, a novel framework for pose estimation and shape reconstruction based on implicit neural fields of object categories.
arXiv Detail & Related papers (2023-08-04T10:35:40Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolutional network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- 3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces [45.18497913809082]
Primitive-based representations approximate a 3D shape mainly by a set of simple implicit primitives.
We propose a constrained implicit algebraic surface as the primitive, with few learnable coefficients and higher geometric complexity.
Our method can semantically learn segments of 3D shapes in an unsupervised manner.
arXiv Detail & Related papers (2021-08-19T12:34:28Z)
- Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks [118.20778308823779]
We present a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN).
Our model learns to parse 3D objects into semantically consistent part arrangements without any part-level supervision.
arXiv Detail & Related papers (2021-03-18T17:59:31Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Dense Non-Rigid Structure from Motion: A Manifold Viewpoint [162.88686222340962]
The Non-Rigid Structure-from-Motion (NRSfM) problem aims to recover the 3D geometry of a deforming object from its 2D feature correspondences across multiple frames.
We show that our approach significantly improves accuracy, scalability, and robustness against noise.
arXiv Detail & Related papers (2020-06-15T09:15:54Z)
- Monocular Human Pose and Shape Reconstruction using Part Differentiable Rendering [53.16864661460889]
Recent works have succeeded with regression-based methods that estimate parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z)
- Convolutional Occupancy Networks [88.48287716452002]
We propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes.
By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space.
We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.
arXiv Detail & Related papers (2020-03-10T10:17:07Z)
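For contrast with the primitive-based line of work above, the Convolutional Occupancy Networks entry pairs a convolutional encoder with an implicit occupancy decoder. Below is a minimal, assumption-laden sketch of that query-point pattern; the grid resolution, channel sizes, and the `OccupancyDecoder` name are all invented here and are not the paper's code.

```python
# Hedged sketch of the "convolutional features + implicit occupancy decoder" pattern.
import torch
import torch.nn as nn
import torch.nn.functional as F


class OccupancyDecoder(nn.Module):
    def __init__(self, feat_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, feat_grid, pts):
        # feat_grid: (B, C, D, H, W) feature volume from a 3D convolutional encoder
        # pts: (B, N, 3) query points in [-1, 1]^3
        grid = pts.view(pts.shape[0], -1, 1, 1, 3)
        feats = F.grid_sample(feat_grid, grid, align_corners=True)    # trilinear lookup: (B, C, N, 1, 1)
        feats = feats.squeeze(-1).squeeze(-1).transpose(1, 2)         # (B, N, C)
        return self.net(torch.cat([feats, pts], dim=-1)).squeeze(-1)  # occupancy logits per point


dec = OccupancyDecoder(feat_dim=32)
vol = torch.randn(1, 32, 16, 16, 16)        # stand-in encoder output
pts = torch.rand(1, 1024, 3) * 2 - 1        # random query locations
occ = torch.sigmoid(dec(vol, pts))          # (1, 1024) occupancy probabilities
```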