3D VSG: Long-term Semantic Scene Change Prediction through 3D Variable
Scene Graphs
- URL: http://arxiv.org/abs/2209.07896v1
- Date: Fri, 16 Sep 2022 12:41:43 GMT
- Title: 3D VSG: Long-term Semantic Scene Change Prediction through 3D Variable
Scene Graphs
- Authors: Samuel Looper, Javier Rodriguez-Puigvert, Roland Siegwart, Cesar
Cadena, and Lukas Schmid
- Abstract summary: We formalize the task of semantic scene variability estimation.
We identify three main varieties of semantic scene change: changes in the position of an object, its semantic state, or the composition of a scene as a whole.
We present a novel method, DeltaVSG, to estimate the variability of VSGs in a supervised fashion.
- Score: 29.898086255614484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous applications require robots to operate in environments shared with
other agents such as humans or other robots. However, such shared scenes are
typically subject to different kinds of long-term semantic scene changes. The
ability to model and predict such changes is thus crucial for robot autonomy.
In this work, we formalize the task of semantic scene variability estimation
and identify three main varieties of semantic scene change: changes in the
position of an object, its semantic state, or the composition of a scene as a
whole. To represent this variability, we propose the Variable Scene Graph
(VSG), which augments existing 3D Scene Graph (SG) representations with the
variability attribute, representing the likelihood of discrete long-term change
events. We present a novel method, DeltaVSG, to estimate the variability of
VSGs in a supervised fashion. We evaluate our method on the 3RScan long-term
dataset, showing notable improvements in this novel task over existing
approaches. Our method DeltaVSG achieves a precision of 72.2% and recall of
66.8%, often mimicking human intuition about how indoor scenes change over
time. We further show the utility of VSG predictions in the task of active
robotic change detection, speeding up task completion by 62.4% compared to a
scene-change-unaware planner. We make our code available as open-source.
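To make the representation concrete, below is a minimal sketch of how a Variable Scene Graph could be organized in code: per-object variability scores for the three change varieties named in the abstract (position, semantic state, and scene composition) attached to an otherwise standard scene graph of nodes and edges. All class, field, and helper names here are illustrative assumptions, not the authors' DeltaVSG implementation.

```python
# Illustrative sketch of a Variable Scene Graph (VSG); names and structure
# are assumptions for clarity, not the authors' DeltaVSG code.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Variability:
    """Estimated likelihood of each discrete long-term change event for one object."""
    position_change: float = 0.0     # object is moved within the scene
    state_change: float = 0.0        # semantic state changes (e.g. door open -> closed)
    composition_change: float = 0.0  # object is added to or removed from the scene


@dataclass
class ObjectNode:
    node_id: int
    semantic_class: str                   # e.g. "chair"
    centroid: Tuple[float, float, float]  # object position in the scene frame
    variability: Variability = field(default_factory=Variability)


@dataclass
class VariableSceneGraph:
    """3D scene graph augmented with per-node variability attributes."""
    nodes: Dict[int, ObjectNode] = field(default_factory=dict)
    # Directed semantic relationships between objects, e.g. (1, "standing on", 2).
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def likely_changes(self, threshold: float = 0.5) -> List[Tuple[int, str]]:
        """Return (node_id, change_type) pairs whose likelihood exceeds the threshold."""
        out = []
        for node in self.nodes.values():
            for change_type in ("position_change", "state_change", "composition_change"):
                if getattr(node.variability, change_type) >= threshold:
                    out.append((node.node_id, change_type))
        return out
```

In the paper, the variability attribute is predicted in a supervised fashion by DeltaVSG; a downstream planner could then, for instance, query the hypothetical likely_changes() helper above to decide which objects to re-observe first, in the spirit of the active change-detection experiment reported in the abstract.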
Related papers
- GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots [33.19663755125912]
3D Gaussian Splatting (3DGS) has garnered significant attention in robotics for its explicit, high-fidelity dense scene representation.
We propose GS-LTS (Gaussian Splatting for Long-Term Service), a 3DGS-based system enabling indoor robots to manage diverse tasks in dynamic environments over time.
arXiv Detail & Related papers (2025-03-22T11:26:47Z)
- Multi-View Pose-Agnostic Change Localization with Zero Labels [4.997375878454274]
We propose a label-free, pose-agnostic change detection method that integrates information from multiple viewpoints.
With as few as 5 images of the post-change scene, our approach can learn an additional change channel in a 3DGS.
Our change-aware 3D scene representation additionally enables the generation of accurate change masks for unseen viewpoints.
arXiv Detail & Related papers (2024-12-05T06:28:54Z)
- SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images [125.66499135980344]
We propose SparseGrasp, a novel open-vocabulary robotic grasping system.
SparseGrasp operates efficiently with sparse-view RGB images and handles scene updates quickly.
We show that SparseGrasp significantly outperforms state-of-the-art methods in terms of both speed and adaptability.
arXiv Detail & Related papers (2024-12-03T03:56:01Z)
- Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting [27.45827655042124]
We propose a framework for active next best view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS).
We first elevate the performance of few-shot 3DGS with a novel semantic depth alignment method.
We then extend FisherRF, a next-best-view selection method for 3DGS, to select views and touch poses based on depth uncertainty.
arXiv Detail & Related papers (2024-10-07T01:24:39Z)
- 3D scene generation from scene graphs and self-attention [51.49886604454926]
We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans.
We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene.
arXiv Detail & Related papers (2024-04-02T12:26:17Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial for enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments [20.890476387720483]
MoRE is a novel approach for multi-object relocalization and reconstruction in evolving environments.
We view these environments as "living scenes" and consider the problem of transforming scans taken at different points in time into a 3D reconstruction of the object instances.
arXiv Detail & Related papers (2023-12-14T17:09:57Z)
- Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection [116.21529970404653]
We introduce SG2HOI+, a unified one-step model based on the Transformer architecture.
Our approach employs two interactive hierarchical Transformers to seamlessly unify the tasks of SGG and HOI detection.
Our approach achieves competitive performance when compared to state-of-the-art HOI methods.
arXiv Detail & Related papers (2023-11-03T07:25:57Z)
- SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs [81.15889805560333]
We present SG-Bot, a novel rearrangement framework.
SG-Bot exemplifies lightweight, real-time, and user-controllable characteristics.
Experimental results demonstrate that SG-Bot outperforms competitors by a large margin.
arXiv Detail & Related papers (2023-09-21T15:54:33Z)
- Modeling Dynamic Environments with Scene Graph Memory [46.587536843634055]
We present a new type of link prediction problem: link prediction on partially observable dynamic graphs.
Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges.
We propose a novel state representation -- Scene Graph Memory (SGM) -- which captures the agent's accumulated set of observations.
We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen in homes.
arXiv Detail & Related papers (2023-05-27T17:39:38Z)
- SGAligner: 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
- 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans [27.747241700017728]
We present a unified representation for actionable spatial perception: 3D Dynamic Scene Graphs.
3D Dynamic Scene Graphs can have a profound impact on planning and decision-making, human-robot interaction, long-term autonomy, and scene prediction.
arXiv Detail & Related papers (2020-02-15T00:46:32Z)