Towards Part-Based Understanding of RGB-D Scans
- URL: http://arxiv.org/abs/2012.02094v1
- Date: Thu, 3 Dec 2020 17:30:02 GMT
- Title: Towards Part-Based Understanding of RGB-D Scans
- Authors: Alexey Bokhovkin, Vladislav Ishimtsev, Emil Bogomolov, Denis Zorin,
Alexey Artemov, Evgeny Burnaev, Angela Dai
- Abstract summary: We propose the task of part-based scene understanding of real-world 3D environments.
From an RGB-D scan of a scene, we detect objects, and for each object predict its decomposition into geometric part masks.
We leverage an intermediary part graph representation to enable robust completion as well as building of part priors.
- Score: 43.4094489272776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in 3D semantic scene understanding have shown impressive
progress in 3D instance segmentation, enabling object-level reasoning about 3D
scenes; however, a finer-grained understanding is required to enable
interactions with objects and their functional understanding. Thus, we propose
the task of part-based scene understanding of real-world 3D environments: from
an RGB-D scan of a scene, we detect objects, and for each object predict its
decomposition into geometric part masks, which, composed together, form the
complete geometry of the observed object. We leverage an intermediary part
graph representation to enable robust completion as well as building of part
priors, which we use to construct the final part mask predictions. Our
experiments demonstrate that guiding part understanding through a part graph to
part prior-based predictions significantly outperforms alternative approaches
to the task of semantic part completion.
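The abstract does not spell out the part graph structure; as a rough illustration only, the following Python sketch shows one way such an intermediary representation could be organized, assuming boolean voxel part masks, simple adjacency edges, and composition by mask union (all names here are hypothetical, not the authors' code).

```python
# Minimal sketch (not the authors' code) of an intermediary part graph:
# nodes hold per-part voxel masks, edges encode part adjacency, and the
# object geometry is recovered by composing the node masks.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class PartNode:
    label: str        # semantic part label, e.g. "chair_seat"
    mask: np.ndarray  # boolean voxel occupancy mask, shape (D, H, W)

@dataclass
class PartGraph:
    nodes: list = field(default_factory=list)
    edges: set = field(default_factory=set)  # pairs of node indices (adjacency)

    def add_part(self, node: PartNode) -> int:
        self.nodes.append(node)
        return len(self.nodes) - 1

    def connect(self, i: int, j: int) -> None:
        self.edges.add((min(i, j), max(i, j)))

    def compose(self) -> np.ndarray:
        """Union of all part masks -> complete object geometry."""
        out = np.zeros_like(self.nodes[0].mask, dtype=bool)
        for node in self.nodes:
            out |= node.mask
        return out

# Usage: two toy parts of a 32^3 chair.
g = PartGraph()
seat = np.zeros((32, 32, 32), dtype=bool); seat[10:14, 8:24, 8:24] = True
back = np.zeros((32, 32, 32), dtype=bool); back[14:30, 8:24, 8:11] = True
i = g.add_part(PartNode("chair_seat", seat))
j = g.add_part(PartNode("chair_back", back))
g.connect(i, j)
print(g.compose().sum(), "occupied voxels in the composed object")
```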
Related papers
- PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model [19.333506797686695]
We introduce a novel segmentation task known as reasoning part segmentation for 3D objects.
We output a segmentation mask based on complex and implicit textual queries about specific parts of a 3D object.
We propose a model that is capable of segmenting parts of 3D objects based on implicit textual queries and generating natural language explanations.
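As a hedged illustration of the interface this describes (implicit text query in, segmentation mask plus natural-language explanation out), here is a toy Python stand-in; the `PARTS` table and keyword matching are placeholders for the paper's large multimodal model.

```python
# Toy sketch of a reasoning-part-segmentation interface. The keyword
# matching below merely stands in for multimodal reasoning.
import numpy as np

PARTS = {"wheel": 0, "handlebar": 1, "seat": 2}  # toy label set (assumption)

def reasoning_segment(points: np.ndarray, part_ids: np.ndarray, query: str):
    """Return (mask, explanation) for an implicit query over a labeled cloud."""
    target = "wheel" if "roll" in query or "ground" in query else "seat"
    mask = part_ids == PARTS[target]
    explanation = f"The '{target}' parts satisfy the query: {query!r}."
    return mask, explanation

pts = np.random.rand(1000, 3)
ids = np.random.randint(0, 3, size=1000)
mask, why = reasoning_segment(pts, ids, "the parts that touch the ground and roll")
print(mask.sum(), "points selected;", why)
```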
arXiv Detail & Related papers (2024-04-04T23:38:45Z)
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
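A minimal sketch of what a multi-property pre-training head could look like, assuming a shared point encoder with separate semantic, geometric, and affordance outputs; the layer sizes and toy per-point encoder are illustrative assumptions, not SUGAR's architecture.

```python
# Sketch of a multi-task point-cloud pre-training head: one shared encoder
# feeding semantic, geometric, and affordance objectives.
import torch
import torch.nn as nn

class MultiTaskPointHead(nn.Module):
    def __init__(self, feat_dim=256, num_classes=20, num_affordances=7):
        super().__init__()
        self.encoder = nn.Sequential(  # stand-in for a real point backbone
            nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        self.semantic = nn.Linear(feat_dim, num_classes)  # per-point labels
        self.geometry = nn.Linear(feat_dim, 3)            # e.g. point offsets
        self.affordance = nn.Linear(feat_dim, num_affordances)

    def forward(self, xyz):  # xyz: (B, N, 3)
        f = self.encoder(xyz)
        return self.semantic(f), self.geometry(f), self.affordance(f)

model = MultiTaskPointHead()
sem, geo, aff = model(torch.rand(2, 1024, 3))
print(sem.shape, geo.shape, aff.shape)
```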
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Incremental 3D Semantic Scene Graph Prediction from RGB Sequences [86.77318031029404]
We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
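The following is a minimal message-passing step in the spirit of this description, assuming summed neighbor messages and a GRU node update; the fusion scheme and dimensions are assumptions, not the authors' exact design.

```python
# One message-passing iteration over scene entities: node features are
# updated from neighbors, then node and edge classes are read out.
import torch
import torch.nn as nn

class MessagePassingStep(nn.Module):
    def __init__(self, dim=128, node_classes=20, edge_classes=9):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)  # message from (sender, receiver)
        self.upd = nn.GRUCell(dim, dim)     # node update from aggregated msgs
        self.node_head = nn.Linear(dim, node_classes)
        self.edge_head = nn.Linear(2 * dim, edge_classes)

    def forward(self, x, edges):  # x: (N, dim); edges: (E, 2) index pairs
        src, dst = edges[:, 0], edges[:, 1]
        m = torch.relu(self.msg(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros_like(x).index_add_(0, dst, m)  # sum messages per node
        x = self.upd(agg, x)
        edge_logits = self.edge_head(torch.cat([x[src], x[dst]], dim=-1))
        return x, self.node_head(x), edge_logits

step = MessagePassingStep()
x = torch.rand(5, 128)
edges = torch.tensor([[0, 1], [1, 2], [3, 4]])
for _ in range(3):  # iterative refinement
    x, node_logits, edge_logits = step(x, edges)
print(node_logits.shape, edge_logits.shape)
```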
arXiv Detail & Related papers (2023-05-04T11:32:16Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the mapping between hierarchical part-level segmentation and mobile-part motion parameters.
The network predictions yield a large set of 3D objects with pseudo-labeled mobility information.
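As an illustrative sketch of per-part motion-parameter prediction, the head below outputs a motion type, axis, and pivot for each part; the parameterization and layer sizes are assumptions, not the paper's design.

```python
# Per-part kinematic motion head: given a feature per part, predict the
# motion type (revolute/prismatic/fixed) plus an axis and a pivot point.
import torch
import torch.nn as nn

class MotionHead(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.type_head = nn.Linear(dim, 3)   # revolute / prismatic / fixed
        self.axis_head = nn.Linear(dim, 3)   # motion axis direction
        self.pivot_head = nn.Linear(dim, 3)  # a point the axis passes through

    def forward(self, part_feats):  # (P, dim), one row per part
        axis = nn.functional.normalize(self.axis_head(part_feats), dim=-1)
        return self.type_head(part_feats), axis, self.pivot_head(part_feats)

head = MotionHead()
types, axes, pivots = head(torch.rand(4, 128))
print(types.argmax(-1), axes.shape, pivots.shape)
```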
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans [27.377128012679076]
We propose to leverage large-scale synthetic datasets of 3D shapes annotated with part information to learn Neural Part Priors.
We can optimize over the learned part priors in order to fit to real-world scanned 3D scenes at test time.
Experiments on the ScanNet dataset demonstrate that NPPs significantly outperform the state of the art in part decomposition and object completion.
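A minimal sketch of the test-time-optimization idea: freeze a (toy) learned decoder that stands in for a part prior, and optimize a latent code so the decoded geometry fits the scanned observation; the decoder, loss, and dimensions are all assumptions.

```python
# Test-time optimization over a learned prior: the decoder stays fixed,
# only the latent code is fit to the observed part geometry.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32 ** 3))
decoder.requires_grad_(False)            # the prior stays fixed at test time

observed = torch.rand(32 ** 3)           # stand-in for a scanned part's TSDF
z = torch.zeros(64, requires_grad=True)  # latent code to optimize
opt = torch.optim.Adam([z], lr=1e-2)

for step in range(200):
    opt.zero_grad()
    pred = decoder(z)
    loss = torch.nn.functional.mse_loss(pred, observed)  # fit to observation
    loss.backward()
    opt.step()
print(f"final fitting loss: {loss.item():.4f}")
```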
arXiv Detail & Related papers (2022-03-17T15:05:44Z)
- Semantic Dense Reconstruction with Consistent Scene Segments [33.0310121044956]
A method for dense semantic 3D scene reconstruction from an RGB-D sequence is proposed to solve high-level scene understanding tasks.
First, each RGB-D pair is consistently segmented into 2D semantic maps using a camera-tracking backbone.
A dense 3D mesh model of an unknown environment is incrementally generated from the input RGB-D sequence.
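One simple way to realize such incremental semantic fusion is per-voxel majority voting over frame labels, sketched below with assumed grid and class counts; the actual system builds a mesh rather than this toy voxel grid.

```python
# Fuse per-frame 2D semantic labels into an incrementally built 3D map:
# each observation votes for a class in the voxel it lands in, and the
# voxel keeps the majority label.
import numpy as np

NUM_CLASSES, GRID = 20, 32
votes = np.zeros((GRID, GRID, GRID, NUM_CLASSES), dtype=np.int32)

def integrate_frame(points_world: np.ndarray, labels: np.ndarray) -> None:
    """points_world: (N, 3) in [0, 1); labels: (N,) class ids from a 2D map."""
    idx = np.clip((points_world * GRID).astype(int), 0, GRID - 1)
    np.add.at(votes, (idx[:, 0], idx[:, 1], idx[:, 2], labels), 1)

for _ in range(5):  # simulate an RGB-D sequence
    pts = np.random.rand(10_000, 3)
    lbl = np.random.randint(0, NUM_CLASSES, size=10_000)
    integrate_frame(pts, lbl)

semantic_grid = votes.argmax(-1)  # majority label per voxel
print(semantic_grid.shape)
```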
arXiv Detail & Related papers (2021-09-30T03:01:17Z)
- Generative 3D Part Assembly via Dynamic Graph Learning [34.108515032411695]
Part assembly is a challenging yet crucial task in 3D computer vision and robotics.
We propose an assembly-oriented dynamic graph learning framework that leverages an iterative graph neural network as a backbone.
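As a hedged sketch of iterative, assembly-oriented refinement, the module below repeatedly applies a residual pose update per part; mean-pooled context stands in for the paper's dynamic relation graph, and all sizes are assumptions.

```python
# Iterative part-assembly refinement: a shared network repeatedly updates
# each part's 6-DoF pose (translation + quaternion) from part features
# mixed with global context.
import torch
import torch.nn as nn

class PoseRefiner(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.mix = nn.Linear(2 * dim, dim)  # part feature + global context
        self.pose = nn.Linear(dim, 7)       # (tx, ty, tz, qw, qx, qy, qz)

    def forward(self, feats, poses):  # feats: (P, dim); poses: (P, 7)
        ctx = feats.mean(0, keepdim=True).expand_as(feats)  # relation stand-in
        h = torch.relu(self.mix(torch.cat([feats, ctx], -1)))
        return poses + self.pose(h)  # residual pose update

refiner = PoseRefiner()
feats, poses = torch.rand(6, 128), torch.zeros(6, 7)
for _ in range(3):  # iterative assembly refinement
    poses = refiner(feats, poses)
print(poses.shape)
```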
arXiv Detail & Related papers (2020-06-14T04:26:42Z)
- 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [50.73148041205675]
The goal of the Semantic Scene Completion (SSC) task is to simultaneously predict a completed 3D voxel representation of volumetric occupancy and semantic labels of objects in the scene from a single-view observation.
We propose to devise a new geometry-based strategy to embed depth information with low-resolution voxel representation.
Our proposed geometric embedding works better than the depth-feature learning used in conventional SSC frameworks.
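A minimal sketch of a geometry-based depth embedding, assuming pinhole intrinsics: back-project the depth map and mark occupancy in a low-resolution voxel grid of the kind an SSC network could consume. Intrinsics, resolutions, and the normalization are illustrative assumptions.

```python
# Back-project a depth map into a low-resolution voxel occupancy grid,
# a simple geometric cue for semantic scene completion.
import numpy as np

H, W, GRID = 120, 160, 32
fx = fy = 100.0; cx, cy = W / 2, H / 2       # assumed pinhole intrinsics
depth = np.random.uniform(0.5, 4.0, (H, W))  # stand-in for a sensor frame

u, v = np.meshgrid(np.arange(W), np.arange(H))
x = (u - cx) / fx * depth                    # back-project to camera space
y = (v - cy) / fy * depth
pts = np.stack([x, y, depth], -1).reshape(-1, 3)

lo, hi = pts.min(0), pts.max(0)              # normalize into the grid volume
idx = ((pts - lo) / (hi - lo + 1e-8) * (GRID - 1)).astype(int)
occupancy = np.zeros((GRID, GRID, GRID), dtype=bool)
occupancy[idx[:, 0], idx[:, 1], idx[:, 2]] = True
print(occupancy.sum(), "occupied voxels from", pts.shape[0], "depth pixels")
```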
arXiv Detail & Related papers (2020-03-31T09:33:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.