Incremental 3D Semantic Scene Graph Prediction from RGB Sequences
- URL: http://arxiv.org/abs/2305.02743v2
- Date: Sat, 6 May 2023 20:15:08 GMT
- Title: Incremental 3D Semantic Scene Graph Prediction from RGB Sequences
- Authors: Shun-Cheng Wu, Keisuke Tateno, Nassir Navab, Federico Tombari
- Abstract summary: We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
- Score: 86.77318031029404
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: 3D semantic scene graphs are a powerful holistic representation as they
describe the individual objects and depict the relation between them. They are
compact high-level graphs that enable many tasks requiring scene reasoning. In
real-world settings, existing 3D estimation methods produce robust predictions
that mostly rely on dense inputs. In this work, we propose a real-time
framework that incrementally builds a consistent 3D semantic scene graph of a
scene given an RGB image sequence. Our method consists of a novel incremental
entity estimation pipeline and a scene graph prediction network. The proposed
pipeline simultaneously reconstructs a sparse point map and fuses entity
estimation from the input images. The proposed network estimates 3D semantic
scene graphs with iterative message passing using multi-view and geometric
features extracted from the scene entities. Extensive experiments on the 3RScan
dataset show the effectiveness of the proposed method in this challenging task,
outperforming state-of-the-art approaches.
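The abstract describes estimating the scene graph with iterative message passing over features attached to scene entities. A minimal sketch of that general idea follows. It is not the authors' network: the names `SceneGraph`, `message_passing_step`, and `run_message_passing` are hypothetical, and the learned update functions of a real graph neural network are replaced here by plain feature averaging so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vector = List[float]


@dataclass
class SceneGraph:
    """Hypothetical container: entities are nodes, relations are directed edges."""
    nodes: Dict[int, Vector]              # entity id -> feature vector
    edges: Dict[Tuple[int, int], Vector]  # (source id, target id) -> feature vector


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def message_passing_step(graph: SceneGraph) -> SceneGraph:
    """One round of message passing with averaging as a stand-in update rule.

    A trained model would use learned functions (for example MLPs) here;
    averaging is an assumption made purely for illustration.
    """
    # Edge update: combine each edge feature with its two endpoint features.
    new_edges = {
        (s, t): mean([feat, graph.nodes[s], graph.nodes[t]])
        for (s, t), feat in graph.edges.items()
    }
    # Node update: combine each node feature with the mean of its incident edges.
    new_nodes = {}
    for node_id, feat in graph.nodes.items():
        incident = [e for (s, t), e in new_edges.items() if node_id in (s, t)]
        message = mean(incident) if incident else feat
        new_nodes[node_id] = mean([feat, message])
    return SceneGraph(new_nodes, new_edges)


def run_message_passing(graph: SceneGraph, num_iterations: int) -> SceneGraph:
    """Apply the update step repeatedly; each round spreads context one hop further."""
    for _ in range(num_iterations):
        graph = message_passing_step(graph)
    return graph


if __name__ == "__main__":
    # Two connected entities and one isolated entity, with 2-D toy features.
    toy = SceneGraph(
        nodes={0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.5, 0.5]},
        edges={(0, 1): [0.0, 0.0]},
    )
    refined = run_message_passing(toy, num_iterations=2)
    print(refined.nodes)
    print(refined.edges)
```

Each step returns a new graph instead of mutating the input, so every update in a round reads the features from the previous round. In an incremental setting, the same step could be rerun whenever new entities or edges are added to the graph.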
Related papers
- ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding [2.5165775267615205]
This work is the first to implement an Equivariant Graph Neural Network in semantic scene graph generation from 3D point clouds for scene understanding.
Our proposed method, ESGNN, outperforms existing state-of-the-art approaches, demonstrating a significant improvement in scene estimation with faster convergence.
(2024-06-30T06:58:04Z)
- ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning [125.90002884194838]
ConceptGraphs is an open-vocabulary graph-structured representation for 3D scenes.
It is built by leveraging 2D foundation models and fusing their output to 3D by multi-view association.
We demonstrate the utility of this representation through a number of downstream planning tasks.
(2023-09-28T17:53:38Z)
- SGRec3D: Self-Supervised 3D Scene Graph Learning via Object-Level Scene Reconstruction [16.643252717745348]
We present SGRec3D, a novel self-supervised pre-training method for 3D scene graph prediction.
Pre-training SGRec3D does not require object relationship labels, making it possible to exploit large-scale 3D scene understanding datasets.
Our experiments demonstrate that in contrast to recent point cloud-based pre-training approaches, our proposed pre-training improves the 3D scene graph prediction considerably.
(2023-09-27T14:45:29Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
(2022-05-05T17:59:55Z)
- SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences [76.28527350263012]
We propose a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames.
We aggregate PointNet features from primitive scene components by means of a graph neural network.
Our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
(2021-03-27T13:00:36Z)
- SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
(2020-10-26T15:31:52Z)
- Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions [94.17683799712397]
We focus on scene graphs, a data structure that organizes the entities of a scene in a graph.
We propose a learned method that regresses a scene graph from the point cloud of a scene.
We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
(2020-04-08T12:25:25Z)