Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
- URL: http://arxiv.org/abs/2003.06233v4
- Date: Thu, 13 Jan 2022 13:05:34 GMT
- Title: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
- Authors: Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu
- Abstract summary: We propose a novel fusion-aware 3D point convolution which operates directly on the geometric surface being reconstructed.
Globally, we compile the online reconstructed 3D points into an incrementally growing coordinate interval tree.
We maintain the neighborhood information for each point using an octree whose construction benefits from the fast query of the global tree.
- Score: 19.973034777285218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online semantic 3D segmentation in company with real-time RGB-D
reconstruction poses special challenges such as how to perform 3D convolution
directly over the progressively fused 3D geometric data, and how to smartly
fuse information from frame to frame. We propose a novel fusion-aware 3D point
convolution which operates directly on the geometric surface being
reconstructed and exploits effectively the inter-frame correlation for high
quality 3D feature learning. This is enabled by a dedicated dynamic data
structure which organizes the online acquired point cloud with global-local
trees. Globally, we compile the online reconstructed 3D points into an
incrementally growing coordinate interval tree, enabling fast point insertion
and neighborhood query. Locally, we maintain the neighborhood information for
each point using an octree whose construction benefits from the fast query of
the global tree. Both levels of trees update dynamically and help the 3D
convolution exploit temporal coherence for effective information fusion
across RGB-D frames.
Related papers
- FROSS: Faster-than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images [8.271449021226417]
We propose FROSS (Faster-than-Real-Time Online 3D Semantic Scene Graph Generation), an innovative approach for online and faster-than-real-time 3D SSG generation. This framework eliminates the dependency on precise and computationally-intensive point cloud processing. Experiments show that FROSS can achieve superior performance while operating significantly faster than prior 3D SSG generation methods.
arXiv Detail & Related papers (2025-07-26T16:16:52Z)
- Self-Attention Based Multi-Scale Graph Auto-Encoder Network of 3D Meshes [1.573038298640368]
3D Geometric Mesh Network (3DGeoMeshNet) is a novel GCN-based framework that uses anisotropic convolution layers to learn both global and local features directly in the spatial domain. Our architecture features a multi-scale encoder-decoder structure, where separate global and local pathways capture both large-scale geometric structures and fine-grained local details.
arXiv Detail & Related papers (2025-07-07T07:36:03Z)
- Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory [72.75478398447396]
We propose Point3R, an online framework targeting dense streaming 3D reconstruction. Specifically, we maintain an explicit spatial pointer memory directly associated with the 3D structure of the current scene. Our method achieves competitive or state-of-the-art performance on various tasks with low training costs. A toy sketch of such a pointer memory follows this entry.
arXiv Detail & Related papers (2025-07-03T17:59:56Z)
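As a rough illustration of what an explicit spatial pointer memory could look like, the sketch below keys pointers by quantized 3D position and fuses features on revisit with a moving average; the voxel keying and fusion rule are assumptions for illustration, not details from the Point3R paper.

```python
import numpy as np

class SpatialPointerMemory:
    """Toy pointer memory tied to quantized 3D positions.

    Each pointer carries a feature vector that is fused whenever the
    same spatial cell is observed again in the stream.
    """

    def __init__(self, voxel_size=0.1, momentum=0.8):
        self.voxel_size = voxel_size
        self.momentum = momentum
        self.memory = {}  # voxel key -> feature vector

    def _key(self, xyz):
        return tuple(np.floor(np.asarray(xyz) / self.voxel_size).astype(int))

    def update(self, xyz, feat):
        """Insert a new pointer or fuse the feature into an existing one."""
        k = self._key(xyz)
        if k in self.memory:
            # Moving-average fusion (an assumption, for illustration).
            self.memory[k] = self.momentum * self.memory[k] + (1 - self.momentum) * np.asarray(feat)
        else:
            self.memory[k] = np.asarray(feat, dtype=np.float32)
        return self.memory[k]

mem = SpatialPointerMemory()
mem.update((0.12, 0.40, 1.05), np.ones(8))   # creates a new pointer
mem.update((0.13, 0.41, 1.06), np.zeros(8))  # fused into the same voxel
```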
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Anchor-Based Spatial-Temporal Attention Convolutional Networks for Dynamic 3D Point Cloud Sequences [20.697745449159097]
This paper proposes the Anchor-based Spatial-Temporal Attention Convolution operation (ASTAConv) to process dynamic 3D point cloud sequences.
The proposed convolution operation builds a regular receptive field around each point by placing several virtual anchors in its vicinity; a minimal sketch of this virtual-anchor idea follows the entry.
The proposed method makes better use of the structured information within the local region, and learns spatial-temporal embedding features from dynamic 3D point cloud sequences.
arXiv Detail & Related papers (2020-12-20T07:35:37Z)
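The virtual-anchor construction lends itself to a short sketch. The following is a hypothetical rendering of the idea, not ASTAConv itself: neighbors are assigned to their nearest virtual anchor and averaged per anchor, producing a fixed-size receptive field that a regular convolution can consume.

```python
import numpy as np

def anchor_based_receptive_field(center, nbr_xyz, nbr_feat, anchor_offsets):
    """Build a regular (K, C) receptive field around one point.

    Each neighbor is assigned to its nearest virtual anchor and the
    features falling into each anchor cell are mean-pooled. The
    nearest-anchor averaging rule is an assumption for illustration.
    """
    anchors = center + anchor_offsets                  # (K, 3) virtual anchors
    d = np.linalg.norm(nbr_xyz[:, None, :] - anchors[None, :, :], axis=-1)
    assign = d.argmin(axis=1)                          # nearest anchor per neighbor
    K, C = anchor_offsets.shape[0], nbr_feat.shape[1]
    out = np.zeros((K, C), dtype=nbr_feat.dtype)
    for k in range(K):
        mask = assign == k
        if mask.any():
            out[k] = nbr_feat[mask].mean(axis=0)       # aggregate per anchor cell
    return out

# Example: 6 axis-aligned anchors around a point, one possible layout.
offsets = 0.05 * np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                           [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
field = anchor_based_receptive_field(np.zeros(3),
                                     np.random.rand(32, 3) * 0.1 - 0.05,
                                     np.random.rand(32, 16), offsets)
```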
- KAPLAN: A 3D Point Descriptor for Shape Completion [80.15764700137383]
KAPLAN is a 3D point descriptor that aggregates local shape information via a series of 2D convolutions.
In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder; a toy version of this aggregation is sketched after the entry.
Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.
arXiv Detail & Related papers (2020-07-31T21:56:08Z)
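A toy version of the per-plane aggregation, assuming a single XY plane, mean pooling per cell, and illustrative grid parameters:

```python
import numpy as np

def aggregate_to_plane(local_xyz, grid_size=16, extent=0.1):
    """Rasterize point-to-plane distances into a 2D grid.

    A local neighborhood is projected onto the XY plane and the z value
    (distance to that plane) is mean-pooled per grid cell, yielding an
    image-like array a standard 2D convolutional encoder can process.
    """
    # Map x, y into grid cells covering [-extent, extent].
    uv = (local_xyz[:, :2] + extent) / (2 * extent) * grid_size
    uv = np.clip(uv.astype(int), 0, grid_size - 1)
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    count = np.zeros_like(grid)
    for (u, v), z in zip(uv, local_xyz[:, 2]):
        grid[v, u] += z                     # accumulate point-to-plane distance
        count[v, u] += 1
    return grid / np.maximum(count, 1)      # mean distance per occupied cell

pts = np.random.rand(128, 3) * 0.2 - 0.1
img = aggregate_to_plane(pts)               # (16, 16) grid, ready for a 2D CNN
```

In the full descriptor, the same aggregation would be repeated over several planes and several point properties; this sketch shows only one plane and one property.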
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv).
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
- Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient integration of RGB features and 3D spatial information.
S-Conv infers the sampling offsets of the convolution kernel under the guidance of 3D spatial information; a simplified sketch of this offset-guided sampling follows the entry.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z)
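A simplified sketch of offset-guided sampling in the spirit of S-Conv; `offset_net` is a hypothetical stand-in for the learned offset predictor, and nearest-pixel sampling replaces bilinear interpolation for brevity.

```python
import numpy as np

def sconv_sample(feat, xyz, ij, base_offsets, offset_net):
    """Sample per-tap features with geometry-adjusted kernel offsets.

    Instead of reading features on a fixed image grid, each kernel tap
    is shifted by a 2D offset predicted from the 3D point observed at
    the center pixel, so the receptive field follows scene geometry.
    """
    H, W, C = feat.shape
    i, j = ij
    geo = xyz[i, j]                           # 3D point at this pixel
    offsets = base_offsets + offset_net(geo)  # (K, 2) adjusted taps
    samples = []
    for di, dj in np.round(offsets).astype(int):
        ii = np.clip(i + di, 0, H - 1)        # clamp to image bounds
        jj = np.clip(j + dj, 0, W - 1)
        samples.append(feat[ii, jj])
    return np.stack(samples)                  # (K, C) per-tap features

# Example with a dummy predictor that leaves the 3x3 taps unchanged.
H, W, C = 8, 8, 4
feat = np.random.rand(H, W, C)
xyz = np.random.rand(H, W, 3)
taps = np.array([(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)], float)
out = sconv_sample(feat, xyz, (4, 4), taps, lambda g: np.zeros((9, 2)))
```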
This list is automatically generated from the titles and abstracts of the papers on this site.