SpaceMeshLab: Spatial Context Memoization and Meshgrid Atrous
Convolution Consensus for Semantic Segmentation
- URL: http://arxiv.org/abs/2106.04025v1
- Date: Tue, 8 Jun 2021 00:38:02 GMT
- Title: SpaceMeshLab: Spatial Context Memoization and Meshgrid Atrous
Convolution Consensus for Semantic Segmentation
- Authors: Taehun Kim, Jinseong Kim, Daijin Kim
- Abstract summary: We propose a bypassing branch for spatial context by retaining the input dimension and constantly communicating its spatial context.
We also propose Meshgrid Atrous Convolution Consensus (MetroCon^2), which brings a multi-scale scheme into fine-grained multi-scale object context.
- Score: 11.571698215510152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation networks adopt transfer learning from image
classification networks, which causes a shortage of spatial context information.
For this reason, we propose Spatial Context Memoization (SpaM), a bypassing
branch for spatial context by retaining the input dimension and constantly
communicating its spatial context and rich semantic information mutually with
the backbone network. Multi-scale context information for semantic segmentation
is crucial for dealing with diverse sizes and shapes of target objects in the
given scene. Conventional multi-scale context schemes adopt multiple effective
receptive fields through multiple dilation rates or pooling operations, but often
suffer from a misalignment problem with respect to the target pixel. To this end,
we propose Meshgrid Atrous Convolution Consensus (MetroCon^2), which brings
a multi-scale scheme into fine-grained multi-scale object context using
convolutions with meshgrid-like scattered dilation rates. SpaceMeshLab
(ResNet-101 + SpaM + MetroCon^2) achieves 82.0% mIoU on the Cityscapes test set and
53.5% mIoU on the Pascal-Context validation set.
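The phrase "meshgrid-like scattered dilation rates" can be made concrete with a small sketch. It assumes, as one reading of the abstract and not the authors' confirmed design, that each atrous convolution uses a separate dilation rate per axis and that the pairs are the Cartesian product of one rate list. The rates `[1, 6, 12]`, the 3x3 kernel size, and all function names are hypothetical. The only established fact used is the standard extent of a dilated kernel along one axis, k + (k - 1)(r - 1).

```python
from itertools import product


def effective_extent(kernel_size: int, dilation: int) -> int:
    """Span in pixels covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def meshgrid_dilations(rates):
    """All (height, width) dilation pairs built from one per-axis rate list.

    Assumption: "meshgrid-like" means the Cartesian product of the rates.
    """
    return list(product(rates, repeat=2))


def receptive_fields(rates, kernel_size=3):
    """Map each (r_h, r_w) pair to its (height, width) extent in pixels."""
    return {
        (r_h, r_w): (
            effective_extent(kernel_size, r_h),
            effective_extent(kernel_size, r_w),
        )
        for r_h, r_w in meshgrid_dilations(rates)
    }


if __name__ == "__main__":
    # Hypothetical rates; the abstract does not state the values used.
    for pair, extent in receptive_fields([1, 6, 12]).items():
        print(pair, "->", extent)
```

Under this reading, three per-axis rates yield nine branches, including asymmetric ones such as (6, 12), instead of the three square receptive fields a conventional multi-rate scheme would give. How the branch outputs are combined into a "consensus" is not described in the abstract and is left out of the sketch.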
Related papers
- A Deep Semantic Segmentation Network with Semantic and Contextual Refinements [11.755865577258767]
This paper introduces a Semantic Refinement Module (SRM) to address this issue within the segmentation network.
A Contextual Refinement Module (CRM) is presented to capture global context information across both spatial and channel dimensions.
The efficacy of these proposed modules is validated on three widely used datasets-Cityscapes, Bdd100K, and ADE20K-demonstrating superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-12-11T03:40:46Z)
- MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Semantic Segmentation [33.67313662538398]
We propose a multi-resolution training framework for open-vocabulary semantic segmentation with a single pretrained CLIP backbone.
MROVSeg uses sliding windows to slice the high-resolution input into uniform patches, each matching the input size of the well-trained image encoder.
We demonstrate the superiority of MROVSeg on well-established open-vocabulary semantic segmentation benchmarks.
arXiv Detail & Related papers (2024-08-27T04:45:53Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S^2RM to achieve high-quality cross-modality fusion.
It follows a three-step working strategy: distributing language features, spatial semantic recurrent co-parsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Framework-agnostic Semantically-aware Global Reasoning for Segmentation [29.69187816377079]
We propose a component that learns to project image features into latent representations and reason between them.
Our design encourages the latent regions to represent semantic concepts by ensuring that the activated regions are spatially disjoint.
Our latent tokens are semantically interpretable and diverse and provide a rich set of features that can be transferred to downstream tasks.
arXiv Detail & Related papers (2022-12-06T21:42:05Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while running as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
- Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z)
- Multi-layer Feature Aggregation for Deep Scene Parsing Models [19.198074549944568]
In this paper, we explore the effective use of multi-layer feature outputs of the deep parsing networks for spatial-semantic consistency.
The proposed module can auto-select the intermediate visual features to correlate the spatial and semantic information.
Experiments on four public scene parsing datasets prove that the deep parsing network equipped with the proposed feature aggregation module can achieve very promising results.
arXiv Detail & Related papers (2020-11-04T23:07:07Z)
- Affinity Space Adaptation for Semantic Segmentation Across Domains [57.31113934195595]
In this paper, we address the problem of unsupervised domain adaptation (UDA) in semantic segmentation.
Motivated by the fact that source and target domain have invariant semantic structures, we propose to exploit such invariance across domains.
We develop two affinity space adaptation strategies: affinity space cleaning and adversarial affinity space alignment.
arXiv Detail & Related papers (2020-09-26T10:28:11Z)
- A Multi-Level Approach to Waste Object Segmentation [10.20384144853726]
We address the problem of localizing waste objects from a color image and an optional depth image.
Our method integrates the intensity and depth information at multiple levels of spatial granularity.
We create a new RGBD waste object segmentation dataset, MJU-Waste, that is made public to facilitate future research in this area.
arXiv Detail & Related papers (2020-07-08T16:49:25Z)
- Split-Merge Pooling [36.2980225204665]
Split-Merge pooling is introduced to preserve spatial information without subsampling.
We evaluate our approach for dense semantic segmentation of large image sizes taken from the Cityscapes and GTA-5 datasets.
arXiv Detail & Related papers (2020-06-13T23:20:30Z)
- Gated Path Selection Network for Semantic Segmentation [72.44994579325822]
We develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields.
In GPSNet, we first design a two-dimensional multi-scale network - SuperNet, which densely incorporates features from growing receptive fields.
To dynamically select desirable semantic context, a gate prediction module is further introduced.
arXiv Detail & Related papers (2020-01-19T12:32:17Z)