Related papers: Semantic Layering in Room Segmentation via LLMs

Semantic Layering in Room Segmentation via LLMs

URL: http://arxiv.org/abs/2403.12920v1
Date: Tue, 19 Mar 2024 17:23:44 GMT
Title: Semantic Layering in Room Segmentation via LLMs
Authors: Taehyeon Kim, Byung-Cheol Min,
Abstract summary: SeLRoS is an advanced method for semantic room segmentation by integrating Large Language Models (LLMs) with traditional 2D map-based segmentation. We provide a novel framework that interprets and organizes complex information about each segmented area. The effectiveness of SeLRoS is verified through its application across 30 different 3D environments.
Score: 4.3154860982892425
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we introduce Semantic Layering in Room Segmentation via LLMs (SeLRoS), an advanced method for semantic room segmentation by integrating Large Language Models (LLMs) with traditional 2D map-based segmentation. Unlike previous approaches that solely focus on the geometric segmentation of indoor environments, our work enriches segmented maps with semantic data, including object identification and spatial relationships, to enhance robotic navigation. By leveraging LLMs, we provide a novel framework that interprets and organizes complex information about each segmented area, thereby improving the accuracy and contextual relevance of room segmentation. Furthermore, SeLRoS overcomes the limitations of existing algorithms by using a semantic evaluation method to accurately distinguish true room divisions from those erroneously generated by furniture and segmentation inaccuracies. The effectiveness of SeLRoS is verified through its application across 30 different 3D environments. Source code and experiment videos for this work are available at: https://sites.google.com/view/selros.

Related papers

X-SAM: From Segment Anything to Any Segmentation [63.79182974315084]
Large Language Models (LLMs) demonstrate strong capabilities in broad knowledge representation, yet they are inherently deficient in pixel-level perceptual understanding.<n>We present X-SAM, a streamlined Multimodal Large Language Model framework that extends the segmentation paradigm from textitsegment anything to textitany segmentation.<n>We propose a new segmentation task, termed Visual GrounDed (VGD) segmentation, which segments all instance objects with interactive visual prompts and empowers MLLMs with visual grounded, pixel-wise interpretative capabilities.
arXiv Detail & Related papers (2025-08-06T17:19:10Z)
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation [87.30919771444117]
Reasoning segmentation aims to segment target objects in complex scenes based on human intent and spatial reasoning. Recent multimodal large language models (MLLMs) have demonstrated impressive 2D image reasoning segmentation. We introduce MLLM-For3D, a framework that transfers knowledge from 2D MLLMs to 3D scene understanding.
arXiv Detail & Related papers (2025-03-23T16:40:20Z)
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields [49.66011190843893]
We propose a method that leverages CLIP feature distillation, achieving efficient 3D segmentation through language guidance. To achieve this, we introduce an adapter module and mitigate the noise issue in the dense CLIP feature distillation process. Our method surpasses current state-of-the-art technologies in both training speed and performance.
arXiv Detail & Related papers (2025-01-31T12:19:14Z)
Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation [0.0]
LangSeg is a novel semantic segmentation method that leverages context-sensitive, fine-grained subclass descriptors. We evaluate LangSeg on two challenging datasets, ADE20K and COCO-Stuff, where it outperforms state-of-the-art models.
arXiv Detail & Related papers (2025-01-27T20:02:12Z)
SegPoint: Segment Any Point Cloud via Large Language Model [62.69797122055389]
We propose a model, called SegPoint, to produce point-wise segmentation masks across a diverse range of tasks. SegPoint is the first model to address varied segmentation tasks within a single framework.
arXiv Detail & Related papers (2024-07-18T17:58:03Z)
Frequency-based Matcher for Long-tailed Semantic Segmentation [22.199174076366003]
We focus on a relatively under-explored task setting, long-tailed semantic segmentation (LTSS) We propose a dual-metric evaluation system and construct the LTSS benchmark to demonstrate the performance of semantic segmentation methods and long-tailed solutions. We also propose a transformer-based algorithm to improve LTSS, frequency-based matcher, which solves the oversuppression problem by one-to-many matching.
arXiv Detail & Related papers (2024-06-06T09:57:56Z)
Training-Free Semantic Segmentation via LLM-Supervision [37.9007813884699]
This paper introduces a new approach to text-supervised semantic segmentation using supervision by a large language model (LLM) Our method starts from an LLM to generate a detailed set of subclasses for more accurate class representation. We then employ an advanced text-supervised semantic segmentation model to apply the generated subclasses as target labels.
arXiv Detail & Related papers (2024-03-31T14:37:25Z)
SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach. Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations. Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects. We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding. We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any efforts on dense annotations. Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion [9.275156524109438]
3D semantic segmentation is a fundamental building block for several scene understanding applications such as autonomous driving, robotics and AR/VR. Previous methods have utilized hierarchical, iterative methods to fuse semantic and instance information, but they lack learnability in context fusion. This paper presents Segment-Fusion, a novel attention-based method for hierarchical fusion of semantic and instance information.
arXiv Detail & Related papers (2021-11-16T13:14:47Z)
CTNet: Context-based Tandem Network for Semantic Segmentation [77.4337867789772]
This work proposes a novel Context-based Tandem Network (CTNet) by interactively exploring the spatial contextual information and the channel contextual information. To further improve the performance of the learned representations for semantic segmentation, the results of the two context modules are adaptively integrated.
arXiv Detail & Related papers (2021-04-20T07:33:11Z)
Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning [5.699350798684963]
We propose a simple, yet efficient algorithm for 3D instance segmentation using deep metric learning. For high-level intelligent tasks from a large scale scene, 3D instance segmentation recognizes individual instances of objects. We demonstrate the state-of-the-art performance of our algorithm in the ScanNet 3D instance segmentation benchmark on AP score.
arXiv Detail & Related papers (2020-07-07T02:17:44Z)
SASO: Joint 3D Semantic-Instance Segmentation via Multi-scale Semantic Association and Salient Point Clustering Optimization [8.519716460338518]
We propose a novel 3D point cloud segmentation framework named SASO, which jointly performs semantic and instance segmentation tasks. For semantic segmentation task, inspired by the inherent correlation among objects in spatial context, we propose a Multi-scale Semantic Association (MSA) module. For instance segmentation task, different from previous works that utilize clustering only in inference procedure, we propose a Salient Point Clustering Optimization (SPCO) module.
arXiv Detail & Related papers (2020-06-25T08:55:25Z)
Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution into the semantic segmentation task and propose an improved Laplacian. The graph reasoning is directly performed in the original feature space organized as a spatial pyramid. We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.