Superpixel Transformers for Efficient Semantic Segmentation
- URL: http://arxiv.org/abs/2309.16889v2
- Date: Mon, 2 Oct 2023 21:28:54 GMT
- Title: Superpixel Transformers for Efficient Semantic Segmentation
- Authors: Alex Zihao Zhu, Jieru Mei, Siyuan Qiao, Hang Yan, Yukun Zhu,
Liang-Chieh Chen, Henrik Kretzschmar
- Abstract summary: We propose a solution by leveraging the idea of superpixels, an over-segmentation of the image, and applying them with a modern transformer framework.
Our method achieves state-of-the-art performance in semantic segmentation due to the rich superpixel features generated by the global self-attention mechanism.
- Score: 32.537400525407186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation, which aims to classify every pixel in an image, is a
key task in machine perception, with many applications across robotics and
autonomous driving. Due to the high dimensionality of this task, most existing
approaches use local operations, such as convolutions, to generate per-pixel
features. However, these methods are typically unable to effectively leverage
global context information due to the high computational costs of operating on
a dense image. In this work, we propose a solution to this issue by leveraging
the idea of superpixels, an over-segmentation of the image, and applying them
with a modern transformer framework. In particular, our model learns to
decompose the pixel space into a spatially low dimensional superpixel space via
a series of local cross-attentions. We then apply multi-head self-attention to
the superpixels to enrich the superpixel features with global context and then
directly produce a class prediction for each superpixel. Finally, we directly
project the superpixel class predictions back into the pixel space using the
associations between the superpixels and the image pixel features. Reasoning in
the superpixel space allows our method to be substantially more computationally
efficient compared to convolution-based decoder methods. Yet, our method
achieves state-of-the-art performance in semantic segmentation due to the rich
superpixel features generated by the global self-attention mechanism. Our
experiments on Cityscapes and ADE20K demonstrate that our method matches the
state of the art in terms of accuracy, while outperforming it in terms of model
parameters and latency.
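The pipeline above (pixels to superpixel tokens via cross-attention, global self-attention over the tokens, per-token classification, and projection back to pixels) can be sketched with standard attention modules. The following PyTorch snippet is a minimal illustration, not the authors' released code: it replaces the paper's series of local cross-attentions with a single global cross-attention for brevity, and the module names, feature dimension, superpixel count, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SuperpixelTransformerHead(nn.Module):
    """Illustrative sketch of a superpixel-transformer decoder head."""

    def __init__(self, dim=256, num_superpixels=256, num_classes=19, heads=8):
        super().__init__()
        # One learned query per superpixel token.
        self.superpixel_queries = nn.Parameter(torch.randn(num_superpixels, dim))
        # Cross-attention: superpixel tokens attend to dense pixel features.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Self-attention over the small superpixel set adds global context cheaply.
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, pixel_feats):                                  # (B, C, H, W) from any backbone
        B, C, H, W = pixel_feats.shape
        pixels = pixel_feats.flatten(2).transpose(1, 2)              # (B, H*W, C)
        queries = self.superpixel_queries.expand(B, -1, -1)          # (B, S, C)

        # 1) Decompose the pixel space into superpixel tokens (cross-attention);
        #    the attention weights act as soft pixel-superpixel associations.
        sp_tokens, assoc = self.cross_attn(queries, pixels, pixels)  # assoc: (B, S, H*W)

        # 2) Enrich superpixel tokens with global context (multi-head self-attention).
        sp_tokens = sp_tokens + self.self_attn(sp_tokens, sp_tokens, sp_tokens)[0]

        # 3) Predict a class for each superpixel token.
        sp_logits = self.classifier(sp_tokens)                       # (B, S, num_classes)

        # 4) Project superpixel predictions back to pixels via the associations,
        #    normalizing so each pixel's weights over superpixels sum to one.
        assoc = assoc / assoc.sum(dim=1, keepdim=True).clamp_min(1e-6)
        pixel_logits = torch.einsum("bsn,bsk->bnk", assoc, sp_logits)
        return pixel_logits.transpose(1, 2).reshape(B, -1, H, W)     # (B, num_classes, H, W)
```

With a few hundred superpixel tokens, the self-attention step operates over hundreds of elements rather than the tens of thousands of pixels in a dense feature map, which is where the claimed efficiency gain over convolution-based decoders comes from.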
Related papers
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Adaptive Superpixel for Active Learning in Semantic Segmentation [34.0733215363568]
We propose a superpixel-based active learning framework, which collects a dominant label per superpixel instead of pixel-wise annotations (a minimal sketch of this labeling scheme is given after the related-papers list below).
Obtaining a dominant label per superpixel drastically reduces annotators' burden as it requires fewer clicks.
We also devise a sieving mechanism that identifies and excludes potentially noisy annotations from learning.
arXiv Detail & Related papers (2023-03-29T16:07:06Z) - Efficient Multiscale Object-based Superpixel Framework [62.48475585798724]
We propose a novel superpixel framework, named Superpixels through Iterative CLEarcutting (SICLE).
SICLE exploits object information and is able to generate a multiscale segmentation on the fly.
It generalizes recent superpixel methods, surpassing them and other state-of-the-art approaches in efficiency and effectiveness according to multiple delineation metrics.
arXiv Detail & Related papers (2022-04-07T15:59:38Z) - Saliency Enhancement using Superpixel Similarity [77.34726150561087]
Saliency Object Detection (SOD) has several applications in image analysis.
Deep-learning-based SOD methods are among the most effective, but they may miss foreground parts with similar colors.
We introduce a post-processing method, named Saliency Enhancement over Superpixel Similarity (SESS).
We demonstrate that SESS can consistently and considerably improve the results of three deep-learning-based SOD methods on five image datasets.
arXiv Detail & Related papers (2021-12-01T17:22:54Z) - SIN: Superpixel Interpolation Network [9.046310874823002]
Traditional algorithms and deep learning-based algorithms are two main streams in superpixel segmentation.
In this paper, we propose a deep learning-based superpixel segmentation algorithm SIN which can be integrated with downstream tasks in an end-to-end way.
arXiv Detail & Related papers (2021-10-17T02:21:11Z) - Generating Superpixels for High-resolution Images with Decoupled Patch
Calibration [82.21559299694555]
The Patch Calibration Network (PCNet) is designed to efficiently and accurately perform high-resolution superpixel segmentation.
In particular, its Decoupled Patch Calibration (DPC) branch takes a local patch from the high-resolution image and dynamically generates a binary mask that forces the network to focus on region boundaries.
arXiv Detail & Related papers (2021-08-19T10:33:05Z) - HERS Superpixels: Deep Affinity Learning for Hierarchical Entropy Rate
Segmentation [0.0]
We propose a two-stage graph-based framework for superpixel segmentation.
In the first stage, we introduce an efficient Deep Affinity Learning network that learns pairwise pixel affinities.
In the second stage, we propose a highly efficient superpixel method called Hierarchical Entropy Rate Segmentation (HERS).
arXiv Detail & Related papers (2021-06-07T16:20:04Z) - Implicit Integration of Superpixel Segmentation into Fully Convolutional
Networks [11.696069523681178]
We propose a way to implicitly integrate a superpixel scheme into CNNs.
Our proposed method hierarchically groups pixels at downsampling layers and generates superpixels.
We evaluate our method on several tasks such as semantic segmentation, superpixel segmentation, and monocular depth estimation.
arXiv Detail & Related papers (2021-03-05T02:20:26Z) - AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between a pixel and its surrounding grid cells.
Our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z) - Superpixel Segmentation Based on Spatially Constrained Subspace
Clustering [57.76302397774641]
We consider each representative region with independent semantic information as a subspace, and formulate superpixel segmentation as a subspace clustering problem.
We show that a simple integration of superpixel segmentation with the conventional subspace clustering does not effectively work due to the spatial correlation of the pixels.
We propose a novel convex locality-constrained subspace clustering model that is able to constrain the spatial adjacent pixels with similar attributes to be clustered into a superpixel.
arXiv Detail & Related papers (2020-12-11T06:18:36Z) - Superpixel Segmentation with Fully Convolutional Networks [32.878045921919714]
We present a novel method that employs a simple fully convolutional network to predict superpixels on a regular image grid.
Experimental results on benchmark datasets show that our method achieves state-of-the-art superpixel segmentation performance.
We modify a popular network architecture for stereo matching to simultaneously predict superpixels and disparities.
arXiv Detail & Related papers (2020-03-29T02:42:07Z)
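As noted in the Adaptive Superpixel for Active Learning entry above, the core labeling idea (one dominant class label per superpixel instead of dense per-pixel annotation) is simple to illustrate. The sketch below is an assumption-laden example, not the paper's code; the function name, arguments, and ignore index are hypothetical.

```python
import numpy as np


def dominant_label_targets(superpixel_map, superpixel_labels, ignore_index=255):
    """Broadcast one dominant class label per superpixel into a dense training target.

    superpixel_map: (H, W) integer array assigning each pixel a superpixel id.
    superpixel_labels: dict {superpixel_id: class_id}, one annotator click each.
    Superpixels without a label are marked ignore_index and excluded from the loss.
    """
    target = np.full(superpixel_map.shape, ignore_index, dtype=np.int64)
    for sp_id, cls in superpixel_labels.items():
        target[superpixel_map == sp_id] = cls  # every pixel inherits the superpixel's label
    return target
```

Each labeled superpixel costs a single click, which is where the reduction in annotation burden comes from; in this sketch, superpixels flagged as noisy by a sieving step could simply be reset to ignore_index.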