Related papers: PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

URL: http://arxiv.org/abs/2404.10620v1
Date: Tue, 16 Apr 2024 14:43:33 GMT
Title: PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction
Authors: Sinisa Stekovic, Stefan Ainetter, Mattia D'Urso, Friedrich Fraundorfer, Vincent Lepetit,
Abstract summary: We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions.
Score: 27.98261111833689
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In comparison to traditional CAD model retrieval methods, the use of shape programs for 3D reconstruction allows for reasoning about the semantic properties of reconstructed objects, editing, low memory footprint, etc. However, the utilization of shape programs for 3D scene understanding has been largely neglected in past works. As our main contribution, we enable gradient-based optimization by introducing a module that translates shape programs designed in Blender, for example, into efficient PyTorch code. We also provide a method that relies on PyTorchGeoNodes and is inspired by Monte Carlo Tree Search (MCTS) to jointly optimize discrete and continuous parameters of shape programs and reconstruct 3D objects for input scenes. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions. Our experiments indicate that our reconstructions match well the input scenes while enabling semantic reasoning about reconstructed objects.

Related papers

FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views [93.6881532277553]
We present FLARE, a feed-forward model designed to infer high-quality camera poses and 3D geometry from uncalibrated sparse-view images. Our solution features a cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
arXiv Detail & Related papers (2025-02-17T18:54:05Z)
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory. Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images. GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis [49.352765055181436]
We propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis. Our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction.
arXiv Detail & Related papers (2024-04-09T12:47:30Z)
ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds [52.03819676074455]
ParaPoint is an unsupervised neural learning pipeline for achieving global free-boundary surface parameterization. This work makes the first attempt to investigate neural point cloud parameterization that pursues both global mappings and free boundaries.
arXiv Detail & Related papers (2024-03-15T14:35:05Z)
3DMiner: Discovering Shapes from Large-Scale Unannotated Image Datasets [34.610546020800236]
3DMiner is a pipeline for mining 3D shapes from challenging datasets. Our method is capable of producing significantly better results than state-of-the-art unsupervised 3D reconstruction techniques. We show how 3DMiner can be applied to in-the-wild data by reconstructing shapes present in images from the LAION-5B dataset.
arXiv Detail & Related papers (2023-10-29T23:08:19Z)
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction. Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
Deep Learning Assisted Optimization for 3D Reconstruction from Single 2D Line Drawings [13.532686360047574]
We propose to train deep neural networks to detect pairwise relationships among geometric entities in 3D objects. Experiments on a large dataset of CAD models show that, by leveraging deep learning in a geometric constraint solving pipeline, the success rate of optimization-based 3D reconstruction can be significantly improved.
arXiv Detail & Related papers (2022-09-06T17:59:11Z)
3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow [61.62796058294777]
Reconstructing 3D shape from a single 2D image is a challenging task. Most of the previous methods still struggle to extract semantic attributes for 3D reconstruction task. We propose 3DAttriFlow to disentangle and extract semantic attributes through different semantic levels in the input images.
arXiv Detail & Related papers (2022-03-29T02:03:31Z)
Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images [12.693545159861857]
We propose a novel deep reinforcement learning-based approach for 3D object reconstruction from monocular images. The proposed method outperforms the state-of-the-art in terms of visual quality, reconstruction accuracy, and computational time.
arXiv Detail & Related papers (2021-09-24T09:44:22Z)
DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension [71.71234436165255]
We contribute DensePose 3D, a method that can learn such reconstructions in a weakly supervised fashion from 2D image annotations only. Because it does not require 3D scans, DensePose 3D can be used for learning a wide range of articulated categories such as different animal species. We show significant improvements compared to state-of-the-art non-rigid structure-from-motion baselines on both synthetic and real data on categories of humans and animals.
arXiv Detail & Related papers (2021-08-31T18:33:55Z)
From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image. A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes. The presented approach performs lightweight reconstruction in a single-stage, it is real-time capable, fully differentiable and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
RISA-Net: Rotation-Invariant Structure-Aware Network for Fine-Grained 3D Shape Retrieval [46.02391761751015]
Fine-grained 3D shape retrieval aims to retrieve 3D shapes similar to a query shape in a repository with models belonging to the same class. We introduce a novel deep architecture, RISA-Net, which learns rotation invariant 3D shape descriptors. Our method is able to learn the importance of geometric and structural information of all the parts when generating the final compact latent feature of a 3D shape.
arXiv Detail & Related papers (2020-10-02T13:06:12Z)
Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs. Using just a single depth image of the object, we can output a dense multi-view depth map representation of 3D objects.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)
Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects. Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings. It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
Single-View 3D Object Reconstruction from Shape Priors in Memory [15.641803721287628]
Existing methods for single-view 3D object reconstruction do not contain enough information to reconstruct high-quality 3D shapes. We propose a novel method, named Mem3D, that explicitly constructs shape priors to supplement the missing information in the image. We also propose a voxel triplet loss function that helps to retrieve the precise 3D shapes that are highly related to the input image from shape priors.
arXiv Detail & Related papers (2020-03-08T03:51:07Z)
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data. IF-Nets clearly outperform prior work in 3D object reconstruction in ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
3D Shape Segmentation with Geometric Deep Learning [2.512827436728378]
We propose a neural-network based approach that produces 3D augmented views of the 3D shape to solve the whole segmentation as sub-segmentation problems. We validate our approach using 3D shapes of publicly available datasets and of real objects that are reconstructed using photogrammetry techniques.
arXiv Detail & Related papers (2020-02-02T14:11:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.