Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
- URL: http://arxiv.org/abs/2111.15363v1
- Date: Tue, 30 Nov 2021 13:08:19 GMT
- Title: Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
- Authors: Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
- Abstract summary: We introduce the concept of the multi-view point cloud (Voint cloud) representing each 3D point as a set of features extracted from several view-points.
This novel 3D Voint cloud representation combines the compactness of 3D point cloud representation with the natural view-awareness of multi-view representation.
We deploy a Voint neural network (VointNet) with a theoretically established functional form to learn representations in the Voint space.
- Score: 80.04281842702294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-view projection methods have demonstrated promising performance on 3D
understanding tasks like 3D classification and segmentation. However, it
remains unclear how to combine such multi-view methods with the widely
available 3D point clouds. Previous methods use unlearned heuristics to combine
features at the point level. To address this, we introduce the concept of the
multi-view point cloud (Voint cloud), representing each 3D point as a set of
features extracted from several view-points. This novel 3D Voint cloud
representation combines the compactness of 3D point cloud representation with
the natural view-awareness of multi-view representation. Naturally, we can
equip this new representation with convolutional and pooling operations. We
deploy a Voint neural network (VointNet) with a theoretically established
functional form to learn representations in the Voint space. Our novel
representation achieves state-of-the-art performance on 3D classification and
retrieval on ScanObjectNN, ModelNet40, and ShapeNet Core55. Additionally, we
achieve competitive performance for 3D semantic segmentation on ShapeNet Parts.
Further analysis shows that VointNet improves the robustness to rotation and
occlusion compared to other methods.
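To make the Voint idea concrete, below is a minimal sketch of the kind of per-view transform and view pooling the abstract describes: each 3D point carries features lifted from several view-points, a shared MLP acts on each view feature independently, and a permutation-invariant pool collapses the view axis back to an ordinary per-point feature. This is not the authors' VointNet code; the module name, layer sizes, and the choice of max-pooling over views are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VointPoolSketch(nn.Module):
    """Illustrative sketch of the Voint cloud idea: every 3D point holds a
    set of features extracted from M view-points; a shared MLP mixes each
    view feature (a view-wise '1x1 convolution'), and a max-pool over the
    view axis yields a view-aware per-point feature. All sizes and the
    pooling choice are assumptions, not the paper's released code."""

    def __init__(self, in_dim=64, out_dim=128):
        super().__init__()
        # Shared transform applied independently to every view feature.
        self.view_mlp = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, voints):
        # voints: (B, N, M, C) = batch, points, views, feature channels.
        h = self.view_mlp(voints)      # per-view transform -> (B, N, M, out_dim)
        pooled, _ = h.max(dim=2)       # pool over the view axis -> (B, N, out_dim)
        return pooled                  # ordinary point feature, usable by any point cloud head

# Usage: 1024 points, 8 views, 64-dim per-view features per point.
feats = torch.randn(2, 1024, 8, 64)
point_feats = VointPoolSketch()(feats)  # shape (2, 1024, 128)
```

Because the pool is permutation-invariant over views, the output does not depend on view ordering, and downstream segmentation or classification heads can consume it like a standard point cloud feature.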
Related papers
- Point Cloud Self-supervised Learning via 3D to Multi-view Masked Autoencoder [21.73287941143304]
Multi-modality masked autoencoder (MAE) methods leverage both 2D images and 3D point clouds for pre-training.
We introduce a novel approach employing a 3D to multi-view masked autoencoder to fully harness the multi-modal attributes of 3D point clouds.
Our method outperforms state-of-the-art counterparts by a large margin in a variety of downstream tasks.
arXiv Detail & Related papers (2023-11-17T22:10:03Z)
- Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning? [38.06639044139636]
This work proposes a novel Multi-view Vision-Prompt Fusion Network (MvNet) for few-shot 3D point cloud classification.
MvNet achieves new state-of-the-art performance for few-shot 3D point cloud classification.
arXiv Detail & Related papers (2023-04-20T11:39:41Z)
- TriVol: Point Cloud Rendering via Triple Volumes [57.305748806545026]
We present a dense yet lightweight 3D representation, named TriVol, that can be combined with NeRF to render photo-realistic images from point clouds.
Our framework has excellent generalization ability to render a category of scenes/objects without fine-tuning.
arXiv Detail & Related papers (2023-03-29T06:34:12Z)
- MVTN: Learning Multi-View Transformations for 3D Understanding [60.15214023270087]
We introduce the Multi-View Transformation Network (MVTN), which uses differentiable rendering to determine optimal view-points for 3D shape recognition.
MVTN can be trained end-to-end with any multi-view network for 3D shape recognition.
Our approach demonstrates state-of-the-art performance in 3D classification and shape retrieval on several benchmarks.
arXiv Detail & Related papers (2022-12-27T12:09:16Z)
- PnP-3D: A Plug-and-Play for 3D Point Clouds [38.05362492645094]
We propose a plug-and-play module, PnP-3D, to improve the effectiveness of existing networks in analyzing point cloud data.
To thoroughly evaluate our approach, we conduct experiments on three standard point cloud analysis tasks.
In addition to achieving state-of-the-art results, we present comprehensive studies to demonstrate our approach's advantages.
arXiv Detail & Related papers (2021-08-16T23:59:43Z)
- From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection [101.20784125067559]
We propose a new architecture, namely Hallucinated Hollow-3D R-CNN, to address the problem of 3D object detection.
In our approach, we first extract multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view.
The 3D objects are detected via a box refinement module with a novel Hierarchical Voxel RoI Pooling operation.
arXiv Detail & Related papers (2021-07-30T02:00:06Z)
- ParaNet: Deep Regular Representation for 3D Point Clouds [62.81379889095186]
ParaNet is a novel end-to-end deep learning framework for representing 3D point clouds.
It converts an irregular 3D point cloud into a regular 2D color image, named point geometry image (PGI).
In contrast to conventional regular representation modalities based on multi-view projection and voxelization, the proposed representation is differentiable and reversible.
arXiv Detail & Related papers (2020-12-05T13:19:55Z)
- MVTN: Multi-View Transformation Network for 3D Shape Recognition [80.34385402179852]
We introduce the Multi-View Transformation Network (MVTN) that regresses optimal view-points for 3D shape recognition.
MVTN can be trained end-to-end along with any multi-view network for 3D shape classification.
MVTN exhibits clear performance gains in the tasks of 3D shape classification and 3D shape retrieval without the need for extra training supervision.
arXiv Detail & Related papers (2020-11-26T11:33:53Z)
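The two MVTN entries above describe learning optimal view-points end-to-end rather than fixing cameras by hand. Below is a minimal sketch of such a view regressor under stated assumptions: the encoder, the output bounds, and the degree scaling are all illustrative, and the differentiable renderer and multi-view network that would close the training loop are not shown.

```python
import torch
import torch.nn as nn

class ViewRegressorSketch(nn.Module):
    """Minimal sketch of the MVTN idea: a small network regresses per-view
    azimuth/elevation offsets from a coarse global descriptor of the point
    cloud, so camera poses can receive gradients through a differentiable
    renderer (not shown). Architecture and angle bounds are assumptions."""

    def __init__(self, num_views=8, feat_dim=256):
        super().__init__()
        self.num_views = num_views
        # Crude per-point encoder followed by max-pooling into a global descriptor.
        self.encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim))
        # Head predicts (azimuth, elevation) offsets for each of the M views.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, num_views * 2), nn.Tanh())

    def forward(self, points):
        # points: (B, N, 3) raw point cloud coordinates.
        g = self.encoder(points).max(dim=1).values        # (B, feat_dim) global descriptor
        offsets = self.head(g).view(-1, self.num_views, 2)
        # Scale Tanh outputs to bounded angle offsets in degrees (an assumption).
        azim = 180.0 * offsets[..., 0]                    # (B, M) azimuth offsets
        elev = 90.0 * offsets[..., 1]                     # (B, M) elevation offsets
        return azim, elev  # would parameterize cameras for a differentiable renderer

pts = torch.randn(4, 2048, 3)
azim, elev = ViewRegressorSketch()(pts)  # each of shape (4, 8)
```

Predicting shape-conditioned offsets rather than absolute poses keeps the learned cameras near a sensible default ring of views, which is one plausible reading of why such a regressor can be trained jointly with any multi-view recognition network.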
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.