Metric-Semantic Factor Graph Generation based on Graph Neural Networks
- URL: http://arxiv.org/abs/2409.11972v1
- Date: Wed, 18 Sep 2024 13:24:44 GMT
- Title: Metric-Semantic Factor Graph Generation based on Graph Neural Networks
- Authors: Jose Andres Millan-Romera, Hriday Bavle, Muhammad Shaheer, Holger Voos, Jose Luis Sanchez-Lopez,
- Abstract summary: In indoors, certain spatial constraints, such as the relative positioning of planes, remain consistent despite variations in layout.
This paper explores how these invariant relationships can be captured in a graph SLAM framework by representing high-level concepts like rooms and walls.
Several efforts have tackled this issue with add-hoc solutions for each concept generation and with manually-defined factors.
This paper proposes a novel method for metric-semantic factor graph generation which includes defining a semantic scene graph, integrating geometric information, and learning the interconnecting factors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the relationships between geometric structures and semantic concepts is crucial for building accurate models of complex environments. In indoors, certain spatial constraints, such as the relative positioning of planes, remain consistent despite variations in layout. This paper explores how these invariant relationships can be captured in a graph SLAM framework by representing high-level concepts like rooms and walls, linking them to geometric elements like planes through an optimizable factor graph. Several efforts have tackled this issue with add-hoc solutions for each concept generation and with manually-defined factors. This paper proposes a novel method for metric-semantic factor graph generation which includes defining a semantic scene graph, integrating geometric information, and learning the interconnecting factors, all based on Graph Neural Networks (GNNs). An edge classification network (G-GNN) sorts the edges between planes into same room, same wall or none types. The resulting relations are clustered, generating a room or wall for each cluster. A second family of networks (F-GNN) infers the geometrical origin of the new nodes. The definition of the factors employs the same F-GNN used for the metric attribute of the generated nodes. Furthermore, share the new factor graph with the S-Graphs+ algorithm, extending its graph expressiveness and scene representation with the ultimate goal of improving the SLAM performance. The complexity of the environments is increased to N-plane rooms by training the networks on L-shaped rooms. The framework is evaluated in synthetic and simulated scenarios as no real datasets of the required complex layouts are available.
Related papers
- SegSplat: Feed-forward Gaussian Splatting and Open-Set Semantic Segmentation [114.57192386025373]
SegSplat is a novel framework designed to bridge the gap between rapid, feed-forward 3D reconstruction and rich, open-vocabulary semantic understanding.<n>This work represents a significant step towards practical, on-the-fly generation of semantically aware 3D environments.
arXiv Detail & Related papers (2025-11-23T10:26:38Z) - GrOCE:Graph-Guided Online Concept Erasure for Text-to-Image Diffusion Models [24.278300091974085]
Concept erasure aims to remove harmful, inappropriate, or copyrighted content from text-to-image diffusion models.<n>We propose Graph-Guided Online Concept Erasure (GrOCE), a training-free framework that performs precise and adaptive concept removal.
arXiv Detail & Related papers (2025-11-17T04:47:16Z) - Object-Centric Representation Learning for Enhanced 3D Scene Graph Prediction [3.7471945679132594]
3D Semantic Scene Graph Prediction aims to detect objects and their semantic relationships in 3D scenes.<n>Previous research has addressed dataset limitations and explored various approaches including Open-Vocabulary settings.<n>We demonstrate through extensive analysis that the quality of object features plays a critical role in determining overall scene graph accuracy.
arXiv Detail & Related papers (2025-10-06T11:33:09Z) - Hierarchical Neural Semantic Representation for 3D Semantic Correspondence [72.8101601086805]
We design the hierarchical neural semantic representation (HNSR), which consists of a global semantic feature to capture high-level structure and multi-resolution local geometric features.<n>Second, we design a progressive global-to-local matching strategy, which establishes coarse semantic correspondence using the global semantic feature.<n>Third, our framework is training-free and broadly compatible with various pre-trained 3D generative backbones, demonstrating strong generalization across diverse shape categories.
arXiv Detail & Related papers (2025-09-22T07:23:07Z) - Graph Concept Bottleneck Models [26.57626285653119]
Concept Bottleneck Models (CBMs) provide explicit interpretations for deep neural networks through concepts.<n>We propose GraphCBMs: a new variant of CBM that facilitates concept relationships by constructing latent concept graphs.
arXiv Detail & Related papers (2025-08-19T20:23:18Z) - Cross-Modal Geometric Hierarchy Fusion: An Implicit-Submap Driven Framework for Resilient 3D Place Recognition [9.411542547451193]
We propose a novel framework that redefines 3D place recognition through density-agnostic geometric reasoning.<n>Specifically, we introduce an implicit 3D representation based on elastic points, which is immune to the interference of original scene point cloud density.<n>With the aid of these two types of information, we obtain descriptors that fuse geometric information from both bird's-eye view and 3D segment perspectives.
arXiv Detail & Related papers (2025-06-17T07:04:07Z) - Interpretable 3D Neural Object Volumes for Robust Conceptual Reasoning [68.3379650993108]
CAVE - Concept Aware Volumes for Explanations - is a new direction that unifies interpretability and robustness in image classification.<n>We propose 3D Consistency (3D-C), a metric to measure spatial consistency of concepts.<n>CAVE achieves competitive classification performance while discovering consistent and meaningful concepts across images in various OOD settings.
arXiv Detail & Related papers (2025-03-17T17:55:15Z) - ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions [91.55655961014027]
3D semantic occupancy and flow prediction are fundamental to understanding scene scene.<n>This paper proposes a vision-based framework with three targeted improvements.<n>Our purely convolutional architecture establishes new SOTA performance on multiple benchmarks for both semantic occupancy and joint semantic-flow prediction.
arXiv Detail & Related papers (2024-11-12T11:32:56Z) - Structural Causality-based Generalizable Concept Discovery Models [29.932706137805713]
We propose a disentanglement mechanism utilizing a variational autoencoder (VAE) for learning mutually independent generative factors for a given dataset.
Our method assumes generative factors and concepts to form a bipartite graph, with directed causal edges from generative factors to concepts.
On specific downstream tasks, our proposed method successfully learns task-specific concepts which are explained well by the causal edges from the generative factors.
arXiv Detail & Related papers (2024-10-20T20:09:47Z) - Graph Pooling via Ricci Flow [1.1126342180866644]
We introduce a graph pooling operator (ORC-Pool) which utilizes a characterization of the graph's geometry via Ollivier's discrete Ricci curvature and an associated geometric flow.
ORC-Pool extends such clustering approaches to attributed graphs, allowing for the integration of geometric coarsening into Graph Neural Networks as a pooling layer.
arXiv Detail & Related papers (2024-07-05T03:26:37Z) - LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering [59.89626219328127]
Graph clustering is a fundamental problem in machine learning.
Deep learning methods achieve the state-of-the-art results in recent years, but they still cannot work without predefined cluster numbers.
We propose to address this problem from a fresh perspective of graph information theory.
arXiv Detail & Related papers (2024-05-20T05:46:41Z) - GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR)
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Graph Transformer GANs with Graph Masked Modeling for Architectural
Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z) - Graph Transformer GANs for Graph-Constrained House Generation [223.739067413952]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The GTGAN learns effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.
arXiv Detail & Related papers (2023-03-14T20:35:45Z) - Graph Neural Networks with Feature and Structure Aware Random Walk [7.143879014059894]
We show that in typical heterphilous graphs, the edges may be directed, and whether to treat the edges as is or simply make them undirected greatly affects the performance of the GNN models.
We develop a model that adaptively learns the directionality of the graph, and exploits the underlying long-distance correlations between nodes.
arXiv Detail & Related papers (2021-11-19T08:54:21Z) - Factorizable Graph Convolutional Networks [90.59836684458905]
We introduce a novel graph convolutional network (GCN) that explicitly disentangles intertwined relations encoded in a graph.
FactorGCN takes a simple graph as input, and disentangles it into several factorized graphs.
We evaluate the proposed FactorGCN both qualitatively and quantitatively on the synthetic and real-world datasets.
arXiv Detail & Related papers (2020-10-12T03:01:40Z) - Tensor Graph Convolutional Networks for Multi-relational and Robust
Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z) - Geom-GCN: Geometric Graph Convolutional Networks [15.783571061254847]
We propose a novel geometric aggregation scheme for graph neural networks to overcome the two weaknesses.
The proposed aggregation scheme is permutation-invariant and consists of three modules, node embedding, structural neighborhood, and bi-level aggregation.
We also present an implementation of the scheme in graph convolutional networks, termed Geom-GCN, to perform transductive learning on graphs.
arXiv Detail & Related papers (2020-02-13T00:03:09Z) - EdgeNets:Edge Varying Graph Neural Networks [179.99395949679547]
This paper puts forth a general framework that unifies state-of-the-art graph neural networks (GNNs) through the concept of EdgeNet.
An EdgeNet is a GNN architecture that allows different nodes to use different parameters to weigh the information of different neighbors.
This is a general linear and local operation that a node can perform and encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs)
arXiv Detail & Related papers (2020-01-21T15:51:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.