Superpixel Transformers for Efficient Semantic Segmentation
- URL: http://arxiv.org/abs/2309.16889v2
- Date: Mon, 2 Oct 2023 21:28:54 GMT
- Title: Superpixel Transformers for Efficient Semantic Segmentation
- Authors: Alex Zihao Zhu, Jieru Mei, Siyuan Qiao, Hang Yan, Yukun Zhu,
Liang-Chieh Chen, Henrik Kretzschmar
- Abstract summary: We propose a solution by leveraging the idea of superpixels, an over-segmentation of the image, and applying them with a modern transformer framework.
Our method achieves state-of-the-art performance in semantic segmentation due to the rich superpixel features generated by the global self-attention mechanism.
- Score: 32.537400525407186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation, which aims to classify every pixel in an image, is a
key task in machine perception, with many applications across robotics and
autonomous driving. Due to the high dimensionality of this task, most existing
approaches use local operations, such as convolutions, to generate per-pixel
features. However, these methods are typically unable to effectively leverage
global context information due to the high computational costs of operating on
a dense image. In this work, we propose a solution to this issue by leveraging
the idea of superpixels, an over-segmentation of the image, and applying them
with a modern transformer framework. In particular, our model learns to
decompose the pixel space into a spatially low dimensional superpixel space via
a series of local cross-attentions. We then apply multi-head self-attention to
the superpixels to enrich the superpixel features with global context and then
directly produce a class prediction for each superpixel. Finally, we directly
project the superpixel class predictions back into the pixel space using the
associations between the superpixels and the image pixel features. Reasoning in
the superpixel space allows our method to be substantially more computationally
efficient compared to convolution-based decoder methods. Yet, our method
achieves state-of-the-art performance in semantic segmentation due to the rich
superpixel features generated by the global self-attention mechanism. Our
experiments on Cityscapes and ADE20K demonstrate that our method matches the
state of the art in terms of accuracy, while outperforming it in terms of model
parameters and latency.
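The pipeline above (pixels to superpixel tokens via cross-attention, global self-attention over the tokens, per-token classification, and projection back to pixels) can be sketched with standard attention modules. The following PyTorch snippet is a minimal illustration, not the authors' released code: it replaces the paper's series of local cross-attentions with a single global cross-attention for brevity, and the module names, feature dimension, superpixel count, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SuperpixelTransformerHead(nn.Module):
    """Illustrative sketch of a superpixel-transformer decoder head."""

    def __init__(self, dim=256, num_superpixels=256, num_classes=19, heads=8):
        super().__init__()
        # One learned query per superpixel token.
        self.superpixel_queries = nn.Parameter(torch.randn(num_superpixels, dim))
        # Cross-attention: superpixel tokens attend to dense pixel features.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Self-attention over the small superpixel set adds global context cheaply.
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, pixel_feats):                                  # (B, C, H, W) from any backbone
        B, C, H, W = pixel_feats.shape
        pixels = pixel_feats.flatten(2).transpose(1, 2)              # (B, H*W, C)
        queries = self.superpixel_queries.expand(B, -1, -1)          # (B, S, C)

        # 1) Decompose the pixel space into superpixel tokens (cross-attention);
        #    the attention weights act as soft pixel-superpixel associations.
        sp_tokens, assoc = self.cross_attn(queries, pixels, pixels)  # assoc: (B, S, H*W)

        # 2) Enrich superpixel tokens with global context (multi-head self-attention).
        sp_tokens = sp_tokens + self.self_attn(sp_tokens, sp_tokens, sp_tokens)[0]

        # 3) Predict a class for each superpixel token.
        sp_logits = self.classifier(sp_tokens)                       # (B, S, num_classes)

        # 4) Project superpixel predictions back to pixels via the associations,
        #    normalizing so each pixel's weights over superpixels sum to one.
        assoc = assoc / assoc.sum(dim=1, keepdim=True).clamp_min(1e-6)
        pixel_logits = torch.einsum("bsn,bsk->bnk", assoc, sp_logits)
        return pixel_logits.transpose(1, 2).reshape(B, -1, H, W)     # (B, num_classes, H, W)
```

With a few hundred superpixel tokens, the self-attention step operates over hundreds of elements rather than the tens of thousands of pixels in a dense feature map, which is where the claimed efficiency gain over convolution-based decoders comes from.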
Related papers
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Adaptive Superpixel for Active Learning in Semantic Segmentation [34.0733215363568]
We propose a superpixel-based active learning framework, which collects a dominant label per superpixel instead of pixel-wise annotations (a minimal sketch of this labeling scheme is given after the related-papers list below).
Obtaining a dominant label per superpixel drastically reduces annotators' burden as it requires fewer clicks.
We also devise a sieving mechanism that identifies and excludes potentially noisy annotations from learning.
arXiv Detail & Related papers (2023-03-29T16:07:06Z) - Efficient Multiscale Object-based Superpixel Framework [62.48475585798724]
We propose a novel superpixel framework, named Superpixels through Iterative CLEarcutting (SICLE).
SICLE exploits object information and is able to generate a multiscale segmentation on the fly.
It generalizes recent superpixel methods, surpassing them and other state-of-the-art approaches in efficiency and effectiveness according to multiple delineation metrics.
arXiv Detail & Related papers (2022-04-07T15:59:38Z) - Saliency Enhancement using Superpixel Similarity [77.34726150561087]
Saliency Object Detection (SOD) has several applications in image analysis.
Deep-learning-based SOD methods are among the most effective, but they may miss foreground parts with similar colors.
We introduce a post-processing method, named Saliency Enhancement over Superpixel Similarity (SESS).
We demonstrate that SESS can consistently and considerably improve the results of three deep-learning-based SOD methods on five image datasets.
arXiv Detail & Related papers (2021-12-01T17:22:54Z) - SIN: Superpixel Interpolation Network [9.046310874823002]
Traditional algorithms and deep learning-based algorithms are two main streams in superpixel segmentation.
In this paper, we propose a deep learning-based superpixel segmentation algorithm SIN which can be integrated with downstream tasks in an end-to-end way.
arXiv Detail & Related papers (2021-10-17T02:21:11Z) - Generating Superpixels for High-resolution Images with Decoupled Patch
Calibration [82.21559299694555]
The Patch Calibration Network (PCNet) is designed to efficiently and accurately perform high-resolution superpixel segmentation.
In particular, its Decoupled Patch Calibration (DPC) branch takes a local patch from the high-resolution image and dynamically generates a binary mask that forces the network to focus on region boundaries.
arXiv Detail & Related papers (2021-08-19T10:33:05Z) - HERS Superpixels: Deep Affinity Learning for Hierarchical Entropy Rate
Segmentation [0.0]
We propose a two-stage graph-based framework for superpixel segmentation.
In the first stage, we introduce an efficient Deep Affinity Learning network that learns pairwise pixel affinities.
In the second stage, we propose a highly efficient superpixel method called Hierarchical Entropy Rate Segmentation (HERS).
arXiv Detail & Related papers (2021-06-07T16:20:04Z) - Implicit Integration of Superpixel Segmentation into Fully Convolutional
Networks [11.696069523681178]
We propose a way to implicitly integrate a superpixel scheme into CNNs.
Our proposed method hierarchically groups pixels at downsampling layers and generates superpixels.
We evaluate our method on several tasks such as semantic segmentation, superpixel segmentation, and monocular depth estimation.
arXiv Detail & Related papers (2021-03-05T02:20:26Z) - AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between a pixel and its surrounding grid cells.
Our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z) - Superpixel Segmentation Based on Spatially Constrained Subspace
Clustering [57.76302397774641]
We consider each representative region with independent semantic information as a subspace, and formulate superpixel segmentation as a subspace clustering problem.
We show that a simple integration of superpixel segmentation with the conventional subspace clustering does not effectively work due to the spatial correlation of the pixels.
We propose a novel convex locality-constrained subspace clustering model that is able to constrain the spatial adjacent pixels with similar attributes to be clustered into a superpixel.
arXiv Detail & Related papers (2020-12-11T06:18:36Z) - Superpixel Segmentation with Fully Convolutional Networks [32.878045921919714]
We present a novel method that employs a simple fully convolutional network to predict superpixels on a regular image grid.
Experimental results on benchmark datasets show that our method achieves state-of-the-art superpixel segmentation performance.
We modify a popular network architecture for stereo matching to simultaneously predict superpixels and disparities.
arXiv Detail & Related papers (2020-03-29T02:42:07Z)
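As noted in the Adaptive Superpixel for Active Learning entry above, the core labeling idea (one dominant class label per superpixel instead of dense per-pixel annotation) is simple to illustrate. The sketch below is an assumption-laden example, not the paper's code; the function name, arguments, and ignore index are hypothetical.

```python
import numpy as np


def dominant_label_targets(superpixel_map, superpixel_labels, ignore_index=255):
    """Broadcast one dominant class label per superpixel into a dense training target.

    superpixel_map: (H, W) integer array assigning each pixel a superpixel id.
    superpixel_labels: dict {superpixel_id: class_id}, one annotator click each.
    Superpixels without a label are marked ignore_index and excluded from the loss.
    """
    target = np.full(superpixel_map.shape, ignore_index, dtype=np.int64)
    for sp_id, cls in superpixel_labels.items():
        target[superpixel_map == sp_id] = cls  # every pixel inherits the superpixel's label
    return target
```

Each labeled superpixel costs a single click, which is where the reduction in annotation burden comes from; in this sketch, superpixels flagged as noisy by a sieving step could simply be reset to ignore_index.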