SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting
1D Occupancy Segments From 2D Coordinates
- URL: http://arxiv.org/abs/2003.05559v2
- Date: Mon, 16 Mar 2020 15:06:39 GMT
- Title: SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting
1D Occupancy Segments From 2D Coordinates
- Authors: Zhizhong Han, Guanhui Qiao, Yu-Shen Liu, and Matthias Zwicker
- Abstract summary: We propose to represent 3D shapes using 2D functions, where the output of the function at each 2D location is a sequence of line segments inside the shape.
We implement this approach using a Seq2Seq model with attention, called SeqXY2SeqZ, which learns the mapping from a sequence of 2D coordinates along two arbitrary axes to a sequence of 1D locations along the third axis.
Our experiments show that SeqXY2SeqZ outperforms the state-of-the-art methods under widely used benchmarks.
- Score: 61.04823927283092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structure learning for 3D shapes is vital for 3D computer vision.
State-of-the-art methods show promising results by representing shapes using
implicit functions in 3D that are learned using discriminative neural networks.
However, learning implicit functions requires dense and irregular sampling in
3D space, and the choice of sampling method also affects the accuracy of shape
reconstruction at test time. To avoid dense and irregular sampling in 3D, we
propose to represent shapes using 2D functions, where the output of the
function at each 2D location is a sequence of line segments inside the shape.
Our approach leverages the power of functional representations, but without the
disadvantage of 3D sampling. Specifically, we use a voxel tubelization to
represent a voxel grid as a set of tubes along any one of the X, Y, or Z axes.
Each tube can be indexed by its 2D coordinates on the plane spanned by the
other two axes. We further simplify each tube into a sequence of occupancy
segments. Each occupancy segment consists of successive voxels occupied by the
shape, which leads to a simple representation of its 1D start and end location.
Given the 2D coordinates of the tube and a shape feature as condition, this
representation enables us to learn 3D shape structures by sequentially
predicting the start and end locations of each occupancy segment in the tube.
We implement this approach using a Seq2Seq model with attention, called
SeqXY2SeqZ, which learns the mapping from a sequence of 2D coordinates along
two arbitrary axes to a sequence of 1D locations along the third axis.
SeqXY2SeqZ not only benefits from the regularity of voxel grids in training and
testing, but also achieves high memory efficiency. Our experiments show that
SeqXY2SeqZ outperforms the state-of-the-art methods under widely used
benchmarks.
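The voxel tubelization described above amounts to a run-length encoding of each tube along the third axis: every maximal run of occupied voxels becomes one occupancy segment, stored as a (start, end) pair. A minimal sketch in Python, assuming a NumPy boolean voxel grid; the function name `tubelize` and the dict-of-segments layout are illustrative, not the paper's actual implementation:

```python
import numpy as np

def tubelize(voxels):
    """Convert a binary voxel grid of shape (X, Y, Z) into a dict mapping
    each occupied (x, y) tube coordinate to its list of occupancy segments.
    Each segment is an inclusive (start, end) pair of Z indices covering
    one maximal run of occupied voxels."""
    tubes = {}
    X, Y, Z = voxels.shape
    for x in range(X):
        for y in range(Y):
            col = voxels[x, y]
            segments = []
            z = 0
            while z < Z:
                if col[z]:
                    start = z
                    # advance to the end of this run of occupied voxels
                    while z < Z and col[z]:
                        z += 1
                    segments.append((start, z - 1))
                else:
                    z += 1
            if segments:  # empty tubes are simply omitted
                tubes[(x, y)] = segments
    return tubes
```

The inverse mapping, filling voxels `start..end` back into each tube, reconstructs the grid exactly, which is what makes the segment representation lossless; the Seq2Seq model then only has to emit these start/end locations per tube rather than query a dense set of 3D sample points.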
Related papers
- Occupancy-Based Dual Contouring [12.944046673902415]
We introduce a dual contouring method that provides state-of-the-art performance for occupancy functions.
Our method is learning-free and carefully designed to maximize the use of GPU parallelization.
arXiv Detail & Related papers (2024-09-20T11:32:21Z)
- Multi-View Representation is What You Need for Point-Cloud Pre-Training [22.55455166875263]
This paper proposes a novel approach to point-cloud pre-training that learns 3D representations by leveraging pre-trained 2D networks.
We train the 3D feature extraction network with the help of the novel 2D knowledge transfer loss.
Experimental results demonstrate that our pre-trained model can be successfully transferred to various downstream tasks.
arXiv Detail & Related papers (2023-06-05T03:14:54Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)
- SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
Existing works try to employ the global feature extracted from sketch to directly predict the 3D coordinates, but they usually suffer from losing fine details that are not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
- Meta-Learning 3D Shape Segmentation Functions [16.119694625781992]
We introduce an auxiliary deep neural network as a meta-learner which takes as input a 3D shape and predicts the prior over the respective 3D segmentation function space.
We show in experiments that our meta-learning approach, denoted as Meta-3DSeg, leads to improvements on unsupervised 3D shape segmentation.
arXiv Detail & Related papers (2021-10-08T01:50:54Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- KAPLAN: A 3D Point Descriptor for Shape Completion [80.15764700137383]
KAPLAN is a 3D point descriptor that aggregates local shape information via a series of 2D convolutions.
In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder.
Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.
arXiv Detail & Related papers (2020-07-31T21:56:08Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
- 3D Shape Segmentation with Geometric Deep Learning [2.512827436728378]
We propose a neural-network based approach that produces 3D augmented views of the 3D shape to solve the whole segmentation as sub-segmentation problems.
We validate our approach using 3D shapes of publicly available datasets and of real objects that are reconstructed using photogrammetry techniques.
arXiv Detail & Related papers (2020-02-02T14:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.