Multi-Scale Semantic Segmentation with Modified MBConv Blocks
- URL: http://arxiv.org/abs/2402.04618v1
- Date: Wed, 7 Feb 2024 07:01:08 GMT
- Title: Multi-Scale Semantic Segmentation with Modified MBConv Blocks
- Authors: Xi Chen, Yang Cai, Yuan Wu, Bo Xiong, Taesung Park
- Abstract summary: We introduce a novel adaptation of MBConv blocks specifically tailored for semantic segmentation.
By implementing these changes, our approach achieves impressive mean Intersection over Union (IoU) scores of 84.5% and 84.0% on the Cityscapes test and validation datasets.
- Score: 29.026787888644474
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: MBConv blocks, initially designed for efficiency in
resource-limited settings and later adapted for state-of-the-art image
classification, have recently shown significant potential in image
classification tasks. Despite this success, their application in semantic
segmentation has remained relatively unexplored. This paper introduces a novel
adaptation of MBConv blocks specifically tailored for semantic segmentation.
Our modification stems from the insight that semantic segmentation requires the
extraction of more detailed spatial information than image classification. We
argue that to effectively perform multi-scale semantic segmentation, each
branch of a U-Net architecture, regardless of its resolution, should possess
equivalent segmentation capabilities. By implementing these changes, our
approach achieves impressive mean Intersection over Union (IoU) scores of 84.5%
and 84.0% on the Cityscapes test and validation datasets, respectively,
demonstrating the efficacy of our proposed modifications in enhancing semantic
segmentation performance.
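The mean IoU metric quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code or the official Cityscapes script (which accumulates counts over the whole dataset); the function name `mean_iou`, the flat label lists, and `ignore_index=255` are assumptions chosen for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean Intersection over Union across classes present in pred or target.

    pred and target are flat sequences of integer class ids of equal length.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # True positive: counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # False positive for class p, false negative for class t.
            union[p] += 1
            union[t] += 1
    # Average only over classes that actually occur, to avoid dividing by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

In this toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.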
Related papers
- SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation [11.176993272867396]
In this paper, we propose a novel Semantic and Spatial Adaptive classifier (SSA-Seg) to address the challenges of semantic segmentation.
Specifically, we employ the coarse masks obtained from the fixed prototypes as a guide to adjust the fixed prototype towards the center of the semantic and spatial domains in the test image.
Results show that the proposed SSA-Seg significantly improves the segmentation performance of the baseline models with only a minimal increase in computational cost.
arXiv Detail & Related papers (2024-05-10T15:14:23Z)
- Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and Local Consensus Guided Cross Attention [7.939095881813804]
Few-shot segmentation aims to train a segmentation model that can quickly adapt to a novel task for which only a few annotated images are provided.
We introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects.
The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images.
arXiv Detail & Related papers (2024-01-18T10:29:10Z)
- Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation [34.257289290796315]
We propose the Relevant Intrinsic Feature Enhancement Network (RiFeNet) to improve semantic consistency of foreground instances.
RiFeNet surpasses the state-of-the-art methods on PASCAL-5i and COCO benchmarks.
arXiv Detail & Related papers (2023-12-11T16:02:57Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical of which is foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ significantly improves accuracy on three widely used aerial benchmarks while running as fast as the mainstream method.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Attention-Guided Supervised Contrastive Learning for Semantic Segmentation [16.729068267453897]
In a per-pixel prediction task, more than one label can exist in a single image for segmentation.
We propose an attention-guided supervised contrastive learning approach to highlight a single semantic object every time as the target.
arXiv Detail & Related papers (2021-06-03T05:01:11Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Semantic Distribution-aware Contrastive Adaptation for Semantic Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z)
- Affinity Space Adaptation for Semantic Segmentation Across Domains [57.31113934195595]
In this paper, we address the problem of unsupervised domain adaptation (UDA) in semantic segmentation.
Motivated by the fact that the source and target domains have invariant semantic structures, we propose to exploit this invariance across domains.
We develop two affinity space adaptation strategies: affinity space cleaning and adversarial affinity space alignment.
arXiv Detail & Related papers (2020-09-26T10:28:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.