MA3DSG: Multi-Agent 3D Scene Graph Generation for Large-Scale Indoor Environments
- URL: http://arxiv.org/abs/2602.04152v1
- Date: Wed, 04 Feb 2026 02:39:57 GMT
- Title: MA3DSG: Multi-Agent 3D Scene Graph Generation for Large-Scale Indoor Environments
- Authors: Yirum Kim, Jaewoo Kim, Ue-Hwan Kim
- Abstract summary: We introduce the Multi-Agent 3D Scene Graph Generation (MA3DSG) model, the first framework designed to tackle the scalability challenge of 3D scene graph generation using multiple agents. We develop a training-free graph alignment algorithm that efficiently merges partial graphs from individual agents into a unified global scene graph.
- Score: 6.071490877668865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current 3D scene graph generation (3DSGG) approaches rely heavily on a single-agent assumption and small-scale environments, exhibiting limited scalability to real-world scenarios. In this work, we introduce the Multi-Agent 3D Scene Graph Generation (MA3DSG) model, the first framework designed to tackle this scalability challenge using multiple agents. We develop a training-free graph alignment algorithm that efficiently merges partial query graphs from individual agents into a unified global scene graph. Leveraging extensive analysis and empirical insights, our approach enables conventional single-agent systems to operate collaboratively without requiring any learnable parameters. To rigorously evaluate 3DSGG performance, we propose MA3DSG-Bench, a benchmark that supports diverse agent configurations, domain sizes, and environmental conditions, providing a more general and extensible evaluation framework. This work lays a solid foundation for scalable, multi-agent 3DSGG research.
Related papers
- SGR3 Model: Scene Graph Retrieval-Reasoning Model in 3D [51.32219731589742]
3D scene graphs provide a structured representation of object entities and their relationships. Existing approaches for 3D scene graph generation typically combine scene reconstruction with graph neural networks (GNNs). In this work, we introduce a Scene Graph Retrieval-Reasoning Model in 3D (SGR3 Model).
arXiv Detail & Related papers (2026-03-04T21:19:54Z)
- OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems. Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm. We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
- KeySG: Hierarchical Keyframe-Based 3D Scene Graphs [1.5134439544218246]
KeySG represents 3D scenes as a hierarchical graph consisting of floors, rooms, objects, and functional elements. We leverage VLM to extract scene information, alleviating the need to explicitly model relationship edges between objects. Our approach can process complex and ambiguous queries while mitigating the scalability issues associated with large scene graphs.
arXiv Detail & Related papers (2025-10-01T15:53:27Z)
- Integrating Prior Observations for Incremental 3D Scene Graph Prediction [0.866627581195388]
3D semantic scene graphs (3DSSG) provide compact structured representations of environments by explicitly modeling objects, attributes, and relationships. This paper introduces a novel graph model for incremental 3DSSG prediction that integrates additional, multi-modal information, such as prior observations, directly into the message-passing process. We evaluate our approach on the 3DSSG dataset, showing that GNNs enriched with multi-modal information such as semantic embeddings (e.g., CLIP) and prior observations offer a scalable and generalizable solution for complex, real-world environments.
arXiv Detail & Related papers (2025-09-15T13:10:34Z)
- A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation [66.62489208150681]
3D Gaussian Splatting (3DGS) has emerged as a powerful alternative to Neural Radiance Fields (NeRF) for 3D scene representation. This survey provides a comprehensive overview of recent progress in 3DGS applications.
arXiv Detail & Related papers (2025-08-13T17:44:39Z)
- SeqAffordSplat: Scene-level Sequential Affordance Reasoning on 3D Gaussian Splatting [85.87902260102652]
We introduce the novel task of Sequential 3D Gaussian Affordance Reasoning. We then propose SeqSplatNet, an end-to-end framework that directly maps an instruction to a sequence of 3D affordance masks. Our method sets a new state-of-the-art on our challenging benchmark, effectively advancing affordance reasoning from single-step interactions to complex, sequential tasks at the scene level.
arXiv Detail & Related papers (2025-07-31T17:56:55Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models (LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results, but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
- SGAligner : 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Explore Contextual Information for 3D Scene Graph Generation [43.66442227874461]
3D scene graph generation (SGG) has been of high interest in computer vision.
We propose a framework fully exploring contextual information for the 3D SGG task.
Our approach achieves superior or competitive performance over previous methods on the 3DSSG dataset.
arXiv Detail & Related papers (2022-10-12T14:26:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.