POC-SLT: Partial Object Completion with SDF Latent Transformers
- URL: http://arxiv.org/abs/2411.05419v1
- Date: Fri, 08 Nov 2024 09:13:20 GMT
- Title: POC-SLT: Partial Object Completion with SDF Latent Transformers
- Authors: Faezeh Zakeri, Raphael Braun, Lukas Ruppert, Henrik P. A. Lensch
- Abstract summary: 3D geometric shape completion hinges on representation learning and a deep understanding of geometric data.
We propose a transformer operating on the latent space representing Signed Distance Fields (SDFs).
Instead of a monolithic volume, the SDF of an object is partitioned into smaller high-resolution patches leading to a sequence of latent codes.
- Abstract: 3D geometric shape completion hinges on representation learning and a deep understanding of geometric data. Without profound insights into the three-dimensional nature of the data, this task remains unattainable. Our work addresses this challenge of 3D shape completion given partial observations by proposing a transformer operating on the latent space representing Signed Distance Fields (SDFs). Instead of a monolithic volume, the SDF of an object is partitioned into smaller high-resolution patches leading to a sequence of latent codes. The approach relies on a smooth latent space encoding learned via a variational autoencoder (VAE), trained on millions of 3D patches. We employ an efficient masked autoencoder transformer to complete partial sequences into comprehensive shapes in latent space. Our approach is extensively evaluated on partial observations from ShapeNet and the ABC dataset where only fractions of the objects are given. The proposed POC-SLT architecture compares favorably with several baseline state-of-the-art methods, demonstrating a significant improvement in 3D shape completion, both qualitatively and quantitatively.
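The core data-handling step of the abstract, partitioning a monolithic SDF volume into a sequence of small patches that are then encoded and partially masked, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the flattened patch vectors stand in for the VAE latent codes, `partition_sdf` and `mask_sequence` are hypothetical helper names, and the 32-cubed resolution, 8-cubed patch size, and 30% observation fraction are arbitrary toy values.

```python
import numpy as np

def partition_sdf(sdf, patch_size):
    """Split a cubic SDF volume into non-overlapping patches, yielding a
    sequence of flattened patch vectors (a stand-in for the per-patch
    latent codes a VAE encoder would produce)."""
    r = sdf.shape[0]
    assert sdf.shape == (r, r, r) and r % patch_size == 0
    n = r // patch_size
    # reshape to (n, p, n, p, n, p), then gather the three patch axes
    # together so each row of the result is one flattened patch
    return (sdf.reshape(n, patch_size, n, patch_size, n, patch_size)
               .transpose(0, 2, 4, 1, 3, 5)
               .reshape(n ** 3, patch_size ** 3))

def mask_sequence(num_tokens, observed_frac, rng):
    """Mark a random fraction of patch tokens as observed; the remaining
    masked tokens are what the transformer must complete."""
    return rng.random(num_tokens) < observed_frac

# toy example: a 32^3 SDF of a sphere, split into 8^3 patches
res, p = 32, 8
axes = np.meshgrid(*[np.linspace(-1, 1, res)] * 3, indexing="ij")
sdf = np.linalg.norm(np.stack(axes), axis=0) - 0.5  # distance to a sphere
tokens = partition_sdf(sdf, p)          # shape (64, 512): 64 tokens of dim 512
observed = mask_sequence(tokens.shape[0], 0.3, np.random.default_rng(0))
```

In the actual pipeline each patch would be mapped to a compact latent code by the pre-trained VAE before masking, and the masked-autoencoder transformer would predict the latents of the unobserved patches from the observed ones.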
Related papers
- A Recipe for Geometry-Aware 3D Mesh Transformers [2.0992612407358293]
We study an approach for embedding features at the patch level, accommodating patches with variable node counts.
Our research highlights critical insights: 1) the importance of structural and positional embeddings facilitated by heat diffusion in general 3D mesh transformers; 2) the effectiveness of novel components such as geodesic masking and feature interaction via cross-attention in enhancing learning; and 3) the superior performance and efficiency of our proposed methods in challenging segmentation and classification tasks.
arXiv Detail & Related papers (2024-10-31T19:13:31Z) - NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
arXiv Detail & Related papers (2024-03-27T04:09:34Z) - LASA: Instance Reconstruction from Real Scans using A Large-scale
Aligned Shape Annotation Dataset [17.530432165466507]
We present a novel Cross-Modal Shape Reconstruction (DisCo) method and an Occupancy-Guided 3D Object Detection (OccGOD) method.
Our methods achieve state-of-the-art performance in both instance-level scene reconstruction and 3D object detection tasks.
arXiv Detail & Related papers (2023-12-19T18:50:10Z) - 3DSGrasp: 3D Shape-Completion for Robotic Grasp [81.1638620745356]
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available.
In practice, PCDs are often incomplete when objects are viewed from a few sparse viewpoints before the grasping action.
We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses.
arXiv Detail & Related papers (2023-01-02T20:23:05Z) - ShapeFormer: Transformer-based Shape Completion via Sparse Representation [41.33457875133559]
We present ShapeFormer, a network that produces a distribution of object completions conditioned on incomplete, and possibly noisy, point clouds.
The resultant distribution can then be sampled to generate likely completions, each exhibiting plausible shape details while being faithful to the input.
arXiv Detail & Related papers (2022-01-25T13:58:30Z) - High-fidelity 3D Model Compression based on Key Spheres [6.59007277780362]
We propose an SDF prediction network using explicit key spheres as input.
Our method achieves high-fidelity, high-compression 3D object coding and reconstruction.
arXiv Detail & Related papers (2022-01-19T09:21:54Z) - Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer with an efficient, 3D-structured ability to perceive global geometric inconsistencies.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z) - From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z) - Deep Implicit Templates for 3D Shape Representation [70.9789507686618]
We propose a new 3D shape representation that supports explicit correspondence reasoning in deep implicit representations.
Our key idea is to formulate DIFs as conditional deformations of a template implicit function.
We show that our method can not only learn a common implicit template for a collection of shapes, but also establish dense correspondences across all the shapes simultaneously without any supervision.
arXiv Detail & Related papers (2020-11-30T06:01:49Z) - Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work in 3D object reconstruction on ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.