ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently
Encode 3D Shapes
- URL: http://arxiv.org/abs/2212.06193v1
- Date: Mon, 12 Dec 2022 19:09:47 GMT
- Authors: Sergey Zakharov, Rares Ambrus, Katherine Liu, Adrien Gaidon
- Abstract summary: We present a novel implicit representation to efficiently and accurately encode large datasets of complex 3D shapes.
Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%.
- Score: 32.267066838654834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compact and accurate representations of 3D shapes are central to many
perception and robotics tasks. State-of-the-art learning-based methods can
reconstruct single objects but scale poorly to large datasets. We present a
novel recursive implicit representation to efficiently and accurately encode
large datasets of complex 3D shapes by recursively traversing an implicit
octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD)
learns a hierarchically structured latent space enabling state-of-the-art
reconstruction results at a compression ratio above 99%. We also propose an
efficient curriculum learning scheme that naturally exploits the coarse-to-fine
properties of the underlying octree spatial representation. We explore the
scaling law relating latent space dimension, dataset size, and reconstruction
accuracy, showing that increasing the latent space dimension is enough to scale
to large shape datasets. Finally, we show that our learned latent space encodes
a coarse-to-fine hierarchical structure yielding reusable latents across
different levels of details, and we provide qualitative evidence of
generalization to novel shapes outside the training set.
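The recursive traversal described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a decoder that maps one parent latent to eight child latents plus an occupancy flag per octant, and it replaces the learned network with a fixed random linear map. The names `make_decoder`, `traverse`, `decode_shape`, and `LATENT_DIM` are hypothetical and do not come from the paper.

```python
import math
import random

LATENT_DIM = 4  # hypothetical latent size, chosen only for illustration


def make_decoder(latent_dim, seed=0):
    """Toy stand-in for a learned decoder: fixed random linear map + tanh."""
    rng = random.Random(seed)
    stride = latent_dim + 1  # child latent plus one occupancy logit
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(latent_dim)]
               for _ in range(8 * stride)]

    def decode(latent):
        out = [math.tanh(sum(w * x for w, x in zip(row, latent)))
               for row in weights]
        children = []
        for i in range(8):
            chunk = out[i * stride:(i + 1) * stride]
            # (child latent, is this octant occupied?)
            children.append((chunk[:-1], chunk[-1] > 0.0))
        return children

    return decode


def traverse(latent, center, half_size, depth, max_depth, decode, leaves):
    """Recursively split a cell into octants, following only occupied ones."""
    if depth == max_depth:
        leaves.append((center, half_size))
        return
    for octant, (child_latent, occupied) in enumerate(decode(latent)):
        if not occupied:
            continue  # empty space is pruned, which keeps the tree sparse
        # Bit `axis` of the octant index selects the +/- side on that axis.
        offset = [half_size / 2 * (1 if (octant >> axis) & 1 else -1)
                  for axis in range(3)]
        child_center = tuple(c + o for c, o in zip(center, offset))
        traverse(child_latent, child_center, half_size / 2,
                 depth + 1, max_depth, decode, leaves)


def decode_shape(root_latent, max_depth, seed=0):
    """Expand one root latent into occupied leaf cells of the cube [-1, 1]^3."""
    decode = make_decoder(len(root_latent), seed)
    leaves = []
    traverse(list(root_latent), (0.0, 0.0, 0.0), 1.0, 0, max_depth,
             decode, leaves)
    return leaves


if __name__ == "__main__":
    root = [0.3, -0.7, 0.5, 0.1][:LATENT_DIM]
    for level in (1, 2, 3):  # coarse-to-fine: deeper levels refine the cells
        cells = decode_shape(root, max_depth=level)
        print(level, len(cells))
```

Calling `decode_shape` with increasing `max_depth` reuses the same root latent at every level of detail, which loosely mirrors the coarse-to-fine behaviour the abstract describes. In a trained model the leaf latents would additionally be decoded into surface geometry, which this sketch omits.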
Related papers
- Optimizing 3D Geometry Reconstruction from Implicit Neural Representations [2.3940819037450987]
Implicit neural representations have emerged as a powerful tool in learning 3D geometry.
We present a novel approach that both reduces computational expenses and enhances the capture of fine details.
arXiv Detail & Related papers (2024-10-16T16:36:23Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Neural Progressive Meshes [54.52990060976026]
We propose a method to transmit 3D meshes with a shared learned generative space.
We learn this space using a subdivision-based encoder-decoder architecture trained in advance on a large collection of surfaces.
We evaluate our method on a diverse set of complex 3D shapes and demonstrate that it outperforms baselines in terms of compression ratio and reconstruction quality.
arXiv Detail & Related papers (2023-08-10T17:58:02Z)
- SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations [37.733802382489515]
This paper addresses the problems of achieving large-scale 3D reconstructions with implicit representations using 3D LiDAR measurements.
We learn and store implicit features through an octree-based hierarchical structure, which is sparse and extensible.
Our experiments show that our 3D reconstructions are more accurate, complete, and memory-efficient than current state-of-the-art 3D mapping methods.
arXiv Detail & Related papers (2022-10-05T14:38:49Z)
- Depth Completion using Geometry-Aware Embedding [22.333381291860498]
This paper proposes an efficient method to learn geometry-aware embedding.
It encodes the local and global geometric structure information from 3D points, e.g., scene layout, object's sizes and shapes, to guide dense depth estimation.
arXiv Detail & Related papers (2022-03-21T12:06:27Z)
- High-fidelity 3D Model Compression based on Key Spheres [6.59007277780362]
We propose an SDF prediction network using explicit key spheres as input.
Our method achieves high-fidelity, high-compression 3D object coding and reconstruction.
arXiv Detail & Related papers (2022-01-19T09:21:54Z)
- OctField: Hierarchical Implicit Functions for 3D Modeling [18.488778913029805]
We present a learnable hierarchical implicit representation for 3D surfaces, coded OctField, that allows high-precision encoding of intricate surfaces with low memory and computational budget.
We achieve this goal by introducing a hierarchical octree structure to adaptively subdivide the 3D space according to the surface occupancy and the richness of part geometry.
arXiv Detail & Related papers (2021-11-01T16:29:39Z)
- UCLID-Net: Single View Reconstruction in Object Space [60.046383053211215]
We show that building a geometry preserving 3-dimensional latent space helps the network concurrently learn global shape regularities and local reasoning in the object coordinate space.
We demonstrate both on ShapeNet synthetic images, which are often used for benchmarking purposes, and on real-world images that our approach outperforms state-of-the-art ones.
arXiv Detail & Related papers (2020-06-06T09:15:56Z)
- 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [50.73148041205675]
The goal of the Semantic Scene Completion (SSC) task is to simultaneously predict a completed 3D voxel representation of volumetric occupancy and semantic labels of objects in the scene from a single-view observation.
We propose to devise a new geometry-based strategy to embed depth information with low-resolution voxel representation.
Our proposed geometric embedding works better than the depth feature learning from habitual SSC frameworks.
arXiv Detail & Related papers (2020-03-31T09:33:46Z)
- Convolutional Occupancy Networks [88.48287716452002]
We propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes.
By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space.
We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.
arXiv Detail & Related papers (2020-03-10T10:17:07Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work in 3D object reconstruction in ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)