3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans
- URL: http://arxiv.org/abs/2002.06289v2
- Date: Tue, 16 Jun 2020 22:39:39 GMT
- Title: 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans
- Authors: Antoni Rosinol, Arjun Gupta, Marcus Abate, Jingnan Shi, Luca Carlone
- Abstract summary: We present a unified representation for actionable spatial perception: 3D Dynamic Scene Graphs.
3D Dynamic Scene Graphs can have a profound impact on planning and decision-making, human-robot interaction, long-term autonomy, and scene prediction.
- Score: 27.747241700017728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a unified representation for actionable spatial perception: 3D
Dynamic Scene Graphs. Scene graphs are directed graphs where nodes represent
entities in the scene (e.g. objects, walls, rooms), and edges represent
relations (e.g. inclusion, adjacency) among nodes. Dynamic scene graphs (DSGs)
extend this notion to represent dynamic scenes with moving agents (e.g. humans,
robots), and to include actionable information that supports planning and
decision-making (e.g. spatio-temporal relations, topology at different levels
of abstraction). Our second contribution is to provide the first fully
automatic Spatial PerceptIon eNgine (SPIN) to build a DSG from visual-inertial
data. We integrate state-of-the-art techniques for object and human detection
and pose estimation, and we describe how to robustly infer object, robot, and
human nodes in crowded scenes. To the best of our knowledge, this is the first
paper that reconciles visual-inertial SLAM and dense human mesh tracking.
Moreover, we provide algorithms to obtain hierarchical representations of
indoor environments (e.g. places, structures, rooms) and their relations. Our
third contribution is to demonstrate the proposed spatial perception engine in
a photo-realistic Unity-based simulator, where we assess its robustness and
expressiveness. Finally, we discuss the implications of our proposal on modern
robotics applications. 3D Dynamic Scene Graphs can have a profound impact on
planning and decision-making, human-robot interaction, long-term autonomy, and
scene prediction. A video abstract is available at https://youtu.be/SWbofjhyPzI
Related papers
- 3D scene generation from scene graphs and self-attention [51.49886604454926]
We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans.
We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene (a minimal sketch of this mechanism follows the related-papers list below).
arXiv Detail & Related papers (2024-04-02T12:26:17Z)
- CIRCLE: Capture In Rich Contextual Environments [69.97976304918149]
We propose a novel motion acquisition system in which the actor perceives and operates in a highly contextual virtual world.
We present CIRCLE, a dataset containing 10 hours of full-body reaching motion from five subjects across nine scenes.
We use this dataset to train a model that generates human motion conditioned on scene information.
arXiv Detail & Related papers (2023-03-31T09:18:12Z)
- Human-Aware Object Placement for Visual Environment Reconstruction [63.14733166375534]
We show that human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.
Our key idea is that, as a person moves through a scene and interacts with it, we accumulate HSIs across multiple input images.
We show that our scene reconstruction can be used to refine the initial 3D human pose and shape estimation.
arXiv Detail & Related papers (2022-03-07T18:59:02Z)
- Situational Graphs for Robot Navigation in Structured Indoor Environments [9.13466172688693]
We present Situational Graphs (S-Graphs), built online in real time as a single graph representing the environment.
Our method uses odometry readings and planar surfaces extracted from 3D LiDAR scans to construct and optimize a three-layered S-Graph in real time.
Our proposal not only demonstrates state-of-the-art results for robot pose estimation, but also contributes a metric-semantic-topological model of the environment.
arXiv Detail & Related papers (2022-02-24T16:59:06Z)
- 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs [20.960087818959206]
Humans are able to form a complex mental model of the environment they move in.
Current robots' internal representations still provide a partial and fragmented understanding of the environment.
This paper introduces a novel representation, a 3D Dynamic Scene Graph.
arXiv Detail & Related papers (2021-01-18T06:17:52Z)
- Learning 3D Dynamic Scene Representations for Robot Manipulation [21.6131570689398]
A 3D scene representation for robot manipulation should capture three key object properties: permanency, completeness, and continuity.
We introduce 3D Dynamic Scene Representation (DSR), a 3D scene representation that simultaneously discovers, tracks, and reconstructs objects and predicts their dynamics.
We propose DSR-Net, which learns to aggregate visual observations over multiple interactions to gradually build and refine the DSR.
arXiv Detail & Related papers (2020-11-03T19:23:06Z)
- Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions [94.17683799712397]
We focus on scene graphs, a data structure that organizes the entities of a scene in a graph.
We propose a learned method that regresses a scene graph from the point cloud of a scene.
We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
arXiv Detail & Related papers (2020-04-08T12:25:25Z)
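The scene-generation entry above notes that self-attention layers capture high-level relationships between objects. As a rough illustration of that mechanism (not the cited paper's architecture), the following numpy sketch applies scaled dot-product self-attention to a set of per-object embeddings; all shapes, weight matrices, and the function name are assumptions made for exposition.

```python
# Minimal sketch of scaled dot-product self-attention over per-object
# embeddings, as one way such layers can capture pairwise relationships
# between objects in a scene. Shapes and weights are illustrative; this
# is not the architecture of the cited cVAE paper.
import numpy as np

def self_attention(X: np.ndarray, Wq, Wk, Wv):
    """X: (num_objects, d) matrix of object embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (n, n) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over objects
    return weights @ V  # each row mixes in embeddings of related objects

rng = np.random.default_rng(0)
d = 16
X = rng.normal(size=(5, d))                          # 5 objects in a scene
Wq, Wk, Wv = (rng.normal(size=(d, d)) * d**-0.5 for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (5, 16)
```

Each output row is a relation-aware mixture of the other objects' embeddings, which is the kind of pairwise context a conditional generator can exploit when decoding a 3D scene from a scene graph.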
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.