SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A
Learnable Scene Descriptor
- URL: http://arxiv.org/abs/2001.09087v1
- Date: Fri, 24 Jan 2020 16:53:30 GMT
- Title: SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A
Learnable Scene Descriptor
- Authors: Jiachen Xu, Jingyu Gong, Jie Zhou, Xin Tan, Yuan Xie, Lizhuang Ma
- Abstract summary: We propose a SceneEncoder module that imposes scene-aware guidance to enhance the effect of global information.
The module predicts a scene descriptor, which learns to represent the categories of objects existing in the scene.
We also design a region similarity loss to propagate distinguishing features to neighboring points with the same label.
- Score: 51.298760338410624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Besides local features, global information plays an essential role in
semantic segmentation, yet recent works often fail to explicitly extract
meaningful global information and make full use of it. In this paper, we
propose a SceneEncoder module that imposes scene-aware guidance to enhance the
effect of global information. The module predicts a scene descriptor, which
learns to represent the categories of objects existing in the scene and
directly guides point-level semantic segmentation by filtering out
categories that do not belong to the scene. Additionally, to alleviate
segmentation noise in local regions, we design a region similarity loss that
propagates distinguishing features to neighboring points with the same label,
enhancing the distinguishing ability of point-wise features. We integrate our
methods into several prevailing networks and conduct extensive experiments on
the benchmark datasets ScanNet and ShapeNet. Results show that our methods
greatly improve the performance of the baselines and achieve state-of-the-art
performance.
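The two mechanisms in the abstract can be illustrated with a minimal sketch (not the authors' code): the predicted scene descriptor gates point-wise class logits by suppressing categories judged absent from the scene, and the region similarity loss penalizes feature differences between neighboring points that share a label. The function names, the presence threshold, and the squared-distance form of the loss are illustrative assumptions.

```python
# Hedged sketch of scene-descriptor gating and a region similarity loss.
# Pure Python for clarity; a real implementation would use tensors.

def scene_gate(point_logits, scene_descriptor, threshold=0.5):
    """Suppress logits of categories the scene descriptor deems absent.

    point_logits: per-point lists of class scores.
    scene_descriptor: per-category presence probabilities for the scene.
    Categories below the threshold are masked to -inf so they cannot
    win the point-level prediction.
    """
    gated = []
    for logits in point_logits:
        gated.append([
            score if presence >= threshold else float("-inf")
            for score, presence in zip(logits, scene_descriptor)
        ])
    return gated

def region_similarity_loss(features, neighbors, labels):
    """Mean squared feature distance between neighboring points that
    share a label, encouraging consistent features within a region."""
    total, pairs = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            if labels[i] == labels[j]:
                total += sum((a - b) ** 2
                             for a, b in zip(features[i], features[j]))
                pairs += 1
    return total / pairs if pairs else 0.0
```

For example, with a descriptor of `[0.9, 0.2, 0.8]`, the middle category is filtered out of every point's logits, so points can only be assigned to the two categories believed to exist in the scene.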
Related papers
- ReferEverything: Towards Segmenting Everything We Can Speak of in Videos [42.88584315033116]
We present REM, a framework for segmenting concepts in video that can be described through natural language.
Our method capitalizes on visual representations learned by video diffusion models on Internet-scale datasets.
arXiv Detail & Related papers (2024-10-30T17:59:26Z)
- Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification [0.5572976467442564]
The work described in this paper uses both semantic information, obtained from object detection, and semantic segmentation techniques.
A novel approach is proposed that uses a semantic segmentation mask to provide a Hu-moments-based shape characterization of segmentation categories, designated Segmentation-based Hu-Moments Features (SHMFs).
A three-main-branch network, designated by GOS$2$F$2$App, that exploits deep-learning-based global features, object-based features, and semantic segmentation-based features is also proposed.
arXiv Detail & Related papers (2024-04-11T13:37:51Z)
- Mapping High-level Semantic Regions in Indoor Environments without Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
arXiv Detail & Related papers (2024-03-11T18:09:50Z)
- Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning [50.40482222266927]
Referring Expression Segmentation (RES) aims at localizing and segmenting the target according to a given language expression.
We propose a parallel position-kernel-segmentation pipeline to better isolate and then interact with the localization and segmentation steps.
Our method is simple but surprisingly effective, outperforming all previous state-of-the-art RES methods in fully- and weakly-supervised settings.
arXiv Detail & Related papers (2022-12-17T08:29:33Z)
- Framework-agnostic Semantically-aware Global Reasoning for Segmentation [29.69187816377079]
We propose a component that learns to project image features into latent representations and reason between them.
Our design encourages the latent regions to represent semantic concepts by ensuring that the activated regions are spatially disjoint.
Our latent tokens are semantically interpretable and diverse and provide a rich set of features that can be transferred to downstream tasks.
arXiv Detail & Related papers (2022-12-06T21:42:05Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any efforts on dense annotations.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
- Weakly-Supervised Semantic Segmentation via Sub-category Exploration [73.03956876752868]
We propose a simple yet effective approach to enforce the network to pay attention to other parts of an object.
Specifically, we perform clustering on image features to generate pseudo sub-categories labels within each annotated parent class.
We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2020-08-03T20:48:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.