A Fast Hybrid Cascade Network for Voxel-based 3D Object Classification
- URL: http://arxiv.org/abs/2011.04522v3
- Date: Fri, 28 Apr 2023 02:30:13 GMT
- Title: A Fast Hybrid Cascade Network for Voxel-based 3D Object Classification
- Authors: Ji Luo, Hui Cao, Jie Wang, Siyu Zhang and Shen Cai
- Abstract summary: We propose a hybrid cascade architecture for voxel-based 3D object classification.
Both accuracy and speed can be balanced in our proposed method.
- Score: 10.019858113123822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Voxel-based 3D object classification has been thoroughly studied in recent
years. Most previous methods convert the classic 2D convolution into a 3D form
that will be further applied to objects with binary voxel representation for
classification. However, the binary voxel representation is not very effective
for 3D convolution in many cases. In this paper, we propose a hybrid cascade
architecture for voxel-based 3D object classification. It consists of three
stages composed of fully connected and convolutional layers, dealing with easy,
moderate, and hard 3D models respectively. Both accuracy and speed can be
balanced in our proposed method. By assigning each voxel a signed distance
value, a clear gain in accuracy can be observed. Besides, the mean inference
time is greatly reduced compared with state-of-the-art point cloud and voxel
based methods.
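The abstract's two core ideas can be illustrated with a minimal sketch: (1) replacing a binary occupancy grid with a per-voxel signed distance value, and (2) a cascade that lets confident (easy) samples exit at an early, cheap stage while hard samples fall through to later stages. The code below is an illustrative assumption, not the authors' actual network: `signed_distance_grid` approximates the signed distance brute-force on a small grid, and `cascade_predict` uses hypothetical placeholder stage functions with a softmax-confidence threshold.

```python
import numpy as np

def signed_distance_grid(occ):
    """Approximate per-voxel signed distance from a binary occupancy grid.

    Convention: negative inside the object, positive outside; magnitude is the
    Euclidean distance to the nearest voxel of the opposite class. Brute-force,
    so only suitable for small grids; assumes the grid contains both occupied
    and empty voxels.
    """
    inside = np.argwhere(occ).astype(float)
    outside = np.argwhere(~occ).astype(float)
    sdf = np.zeros(occ.shape, dtype=float)
    for idx in np.argwhere(occ):       # occupied voxel: -distance to nearest empty
        d = np.sqrt(((outside - idx) ** 2).sum(axis=1)).min()
        sdf[tuple(idx)] = -d
    for idx in np.argwhere(~occ):      # empty voxel: +distance to nearest occupied
        d = np.sqrt(((inside - idx) ** 2).sum(axis=1)).min()
        sdf[tuple(idx)] = d
    return sdf

def cascade_predict(x, stages, thresholds):
    """Run classifier stages in order of increasing cost.

    Each stage maps the input to class logits. If the max softmax probability
    clears that stage's threshold, we return early (easy/moderate sample);
    otherwise the final stage always decides (hard sample).
    """
    for stage, tau in zip(stages, thresholds):
        logits = stage(x)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        if probs.max() >= tau:
            return int(probs.argmax()), stage.__name__
    return int(probs.argmax()), stage.__name__
```

For example, a 5x5x5 grid with a 3x3x3 occupied block yields `sdf = -2` at the block's center and `sqrt(3)` at a grid corner; a stage emitting confident logits like `[4, 0, 0]` exits the cascade immediately, while near-uniform logits defer to the next stage.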
Related papers
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z) - VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking [78.25819070166351]
We propose VoxelNeXt for fully sparse 3D object detection.
Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.
Our strong sparse convolutional network VoxelNeXt detects and tracks 3D objects through voxel features entirely.
arXiv Detail & Related papers (2023-03-20T17:40:44Z) - CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z) - Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z) - FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection [3.330229314824913]
We present FCAF3D - a first-in-class fully convolutional anchor-free indoor 3D object detection method.
It is a simple yet effective method that uses a voxel representation of a point cloud and processes voxels with sparse convolutions.
It can handle large-scale scenes with minimal runtime through a single fully convolutional feed-forward pass.
arXiv Detail & Related papers (2021-12-01T07:28:52Z) - Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D
Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z) - HyperCube: Implicit Field Representations of Voxelized 3D Models [18.868266675878996]
We introduce a new HyperCube architecture that enables direct processing of 3D voxels.
Instead of processing individual 3D samples from within a voxel, our approach takes as input the entire voxel represented with its convex hull coordinates.
arXiv Detail & Related papers (2021-10-12T06:56:48Z) - Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z) - Anisotropic Convolutional Networks for 3D Semantic Scene Completion [24.9671648682339]
Semantic scene completion (SSC) aims to simultaneously infer the occupancy and semantic labels of a scene from a single depth and/or RGB image.
We propose a novel module called anisotropic convolution, which affords flexibility and power impossible for competing methods.
In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wisely.
arXiv Detail & Related papers (2020-04-05T07:57:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.