ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding
- URL: http://arxiv.org/abs/2407.00609v1
- Date: Sun, 30 Jun 2024 06:58:04 GMT
- Title: ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding
- Authors: Quang P. M. Pham, Khoi T. N. Nguyen, Lan C. Ngo, Truong Do, Truong Son Hy,
- Abstract summary: This work is the first to implement an Equivariant Graph Neural Network in semantic scene graph generation from 3D point clouds for scene understanding.
Our proposed method, ESGNN, outperforms existing state-of-the-art approaches, demonstrating a significant improvement in scene estimation with faster convergence.
- Score: 2.5165775267615205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene graphs have been proven to be useful for various scene understanding tasks due to their compact and explicit nature. However, existing approaches often neglect the importance of maintaining the symmetry-preserving property when generating scene graphs from 3D point clouds. This oversight can diminish the accuracy and robustness of the resulting scene graphs, especially when handling noisy, multi-view 3D data. This work, to the best of our knowledge, is the first to implement an Equivariant Graph Neural Network in semantic scene graph generation from 3D point clouds for scene understanding. Our proposed method, ESGNN, outperforms existing state-of-the-art approaches, demonstrating a significant improvement in scene estimation with faster convergence. ESGNN demands low computational resources and is easy to implement from available frameworks, paving the way for real-time applications such as robotics and computer vision.
Related papers
- HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting [53.6394928681237]
holistic understanding of urban scenes based on RGB images is a challenging yet important problem.
Our main idea involves the joint optimization of geometry, appearance, semantics, and motion using a combination of static and dynamic 3D Gaussians.
Our approach offers the ability to render new viewpoints in real-time, yielding 2D and 3D semantic information with high accuracy.
arXiv Detail & Related papers (2024-03-19T13:39:05Z) - SGRec3D: Self-Supervised 3D Scene Graph Learning via Object-Level Scene
Reconstruction [16.643252717745348]
We present SGRec3D, a novel self-supervised pre-training method for 3D scene graph prediction.
Pre-training SGRec3D does not require object relationship labels, making it possible to exploit large-scale 3D scene understanding datasets.
Our experiments demonstrate that in contrast to recent point cloud-based pre-training approaches, our proposed pre-training improves the 3D scene graph prediction considerably.
arXiv Detail & Related papers (2023-09-27T14:45:29Z) - Incremental 3D Semantic Scene Graph Prediction from RGB Sequences [86.77318031029404]
We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
arXiv Detail & Related papers (2023-05-04T11:32:16Z) - SGAligner : 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z) - MLGCN: An Ultra Efficient Graph Convolution Neural Model For 3D Point
Cloud Analysis [4.947552172739438]
We introduce a novel Multi-level Graph Convolution Neural (MLGCN) model, which uses Graph Neural Networks (GNN) blocks to extract features from 3D point clouds at specific locality levels.
Our approach produces comparable results to those of state-of-the-art models while requiring up to a thousand times fewer floating-point operations (FLOPs) and having significantly reduced storage requirements.
arXiv Detail & Related papers (2023-03-31T00:15:22Z) - Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using
Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfy underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges)
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z) - SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D
Sequences [76.28527350263012]
We propose a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames.
We aggregate PointNet features from primitive scene components by means of a graph neural network.
Our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
arXiv Detail & Related papers (2021-03-27T13:00:36Z) - Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions [94.17683799712397]
We focus on scene graphs, a data structure that organizes the entities of a scene in a graph.
We propose a learned method that regresses a scene graph from the point cloud of a scene.
We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
arXiv Detail & Related papers (2020-04-08T12:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.