Instance-incremental Scene Graph Generation from Real-world Point Clouds
via Normalizing Flows
- URL: http://arxiv.org/abs/2302.10425v2
- Date: Mon, 28 Aug 2023 07:06:35 GMT
- Authors: Chao Qi, Jianqin Yin, Jinghang Xu, and Pengxiang Ding
- Abstract summary: This work introduces a new task of instance-incremental scene graph generation: given a point-cloud scene, represent it as a graph and automatically add novel instances.
A graph denoting the object layout of the scene is finally generated.
It helps to guide the insertion of novel 3D objects into a real-world scene in vision-based applications like augmented reality.
- Score: 9.4858987199432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces a new task of instance-incremental scene graph
generation: given a point-cloud scene, represent it as a graph and
automatically add novel instances, finally producing a graph that denotes the
object layout of the scene. The task is important because it helps guide the
insertion of novel 3D objects into a real-world scene in vision-based
applications such as augmented reality. It is also challenging because the
complexity of real-world point clouds makes it difficult to learn object-layout
experience from the observation data (non-empty rooms with labeled
semantics). We model this task as a conditional generation problem and propose
a 3D autoregressive framework based on normalizing flows (3D-ANF) to address
it. First, we represent the point cloud as a graph by extracting the label
semantics and contextual relationships. Next, a model based on normalizing
flows is introduced to map the conditional generation of graph elements onto a
Gaussian process. Since this mapping is invertible, the real-world
experiences represented in the observation data can be modeled in the training
phase, and novel instances can be autoregressively generated based on the
Gaussian process in the testing phase. To evaluate our method thoroughly, we
implement this new task on the indoor benchmark dataset 3DSSG-O27R16 and on
GPL3D, our newly proposed graphical dataset of outdoor scenes.
Experiments show that our method generates reliable novel graphs from the
real-world point cloud and achieves state-of-the-art performance on the
datasets.
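The invertible mapping described in the abstract is the defining property of normalizing flows: data are transformed to a simple Gaussian base distribution for training, and sampling inverts the transform exactly. The following NumPy sketch shows a single affine coupling layer, a common flow building block; it is an illustration of the general technique, not the authors' 3D-ANF, and all weights and names here are hypothetical.

```python
import numpy as np

# Illustrative sketch of one affine coupling layer (a standard normalizing-flow
# component). It maps data x to a latent z with an exactly invertible
# transform, so a Gaussian base distribution can be used for likelihood
# training and for generating novel samples via the inverse.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 8))   # toy conditioner weights (hypothetical)
W2 = rng.normal(size=(8, 4))

def conditioner(x_a):
    """Small network producing scale and shift for the second half of x."""
    h = np.tanh(x_a @ W1)
    out = h @ W2
    log_s, t = out[:, :2], out[:, 2:]
    return log_s, t

def forward(x):
    """x -> z; the Jacobian log-determinant is simply sum(log_s)."""
    x_a, x_b = x[:, :2], x[:, 2:]
    log_s, t = conditioner(x_a)
    z_b = x_b * np.exp(log_s) + t          # affine transform of one half
    return np.concatenate([x_a, z_b], axis=1), log_s.sum(axis=1)

def inverse(z):
    """z -> x, exactly undoing forward() -- the invertibility property."""
    z_a, z_b = z[:, :2], z[:, 2:]
    log_s, t = conditioner(z_a)            # same conditioner, same inputs
    x_b = (z_b - t) * np.exp(-log_s)
    return np.concatenate([z_a, x_b], axis=1)

x = rng.normal(size=(5, 4))
z, logdet = forward(x)
x_rec = inverse(z)
print(np.allclose(x, x_rec))  # True: the round trip recovers x exactly
```

Because the coupling layer leaves one half of the input untouched and applies an invertible affine map to the other, both the forward pass (for training on observed layouts) and the inverse pass (for generating novel instances from Gaussian noise) are cheap and exact, which is what makes the autoregressive generation scheme in the paper tractable.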
Related papers
- Open-Vocabulary Octree-Graph for 3D Scene Understanding [54.11828083068082]
Octree-Graph is a novel scene representation for open-vocabulary 3D scene understanding.
An adaptive-octree structure is developed that stores semantics and depicts the occupancy of an object adjustably according to its shape.
arXiv Detail & Related papers (2024-11-25T10:14:10Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Joint Generative Modeling of Scene Graphs and Images via Diffusion Models [37.788957749123725]
We present a novel generative task: joint scene graph and image generation.
We introduce a novel diffusion model, DiffuseSG, that jointly models the adjacency matrix along with heterogeneous node and edge attributes.
With a graph transformer being the denoiser, DiffuseSG successively denoises the scene graph representation in a continuous space and discretizes the final representation to generate the clean scene graph.
arXiv Detail & Related papers (2024-01-02T10:10:29Z)
- CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)
- Incremental 3D Semantic Scene Graph Prediction from RGB Sequences [86.77318031029404]
We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
arXiv Detail & Related papers (2023-05-04T11:32:16Z)
- SGFormer: Semantic Graph Transformer for Point Cloud-based 3D Scene Graph Generation [46.14140601855313]
We propose a novel model called SGFormer, Semantic Graph TransFormer for point cloud-based 3D scene graph generation.
The task aims to parse a point cloud-based scene into a semantic structural graph, with the core challenge of modeling the complex global structure.
arXiv Detail & Related papers (2023-03-20T11:59:23Z)
- SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow [25.577386156273256]
Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations.
We introduce SCOOP, a new method for scene flow estimation that can be learned on a small amount of data without employing ground-truth flow supervision.
arXiv Detail & Related papers (2022-11-25T10:52:02Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training exhibits failure when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Scalable Scene Flow from Point Clouds in the Real World [30.437100097997245]
We introduce a new large scale benchmark for scene flow based on the Open dataset.
We show how previous works were limited by the amount of real LiDAR data available.
We introduce the model architecture FastFlow3D that provides real time inference on the full point cloud.
arXiv Detail & Related papers (2021-03-01T20:56:05Z)
- Semantic Graph Based Place Recognition for 3D Point Clouds [22.608115489674653]
This paper presents a novel semantic graph based approach for place recognition.
First, we propose a novel semantic graph representation for the point cloud scenes.
We then design a fast and effective graph similarity network to compute the similarity.
arXiv Detail & Related papers (2020-08-26T09:27:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.