Rethinking Superpixel Segmentation from Biologically Inspired Mechanisms
- URL: http://arxiv.org/abs/2309.13438v3
- Date: Wed, 11 Oct 2023 06:43:08 GMT
- Title: Rethinking Superpixel Segmentation from Biologically Inspired Mechanisms
- Authors: Tingyu Zhao, Bo Peng, Yuan Sun, Daipeng Yang, Zhenguang Zhang, and Xi Wu
- Abstract summary: We propose a network architecture comprising an Enhanced Screening Module (ESM) and a novel Boundary-Aware Label (BAL) for superpixel segmentation.
The ESM enhances semantic information by simulating the interactive projection mechanisms of the visual cortex.
The BAL emulates the spatial frequency characteristics of visual cortical cells to facilitate the generation of superpixels with strong boundary adherence.
- Score: 8.24963839394421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, advancements in deep learning-based superpixel segmentation methods
have brought about improvements in both the efficiency and the performance of
segmentation. However, a significant challenge remains in generating
superpixels that strictly adhere to object boundaries while conveying rich
visual significance, especially when cross-surface color correlations may
interfere with objects. Drawing inspiration from neural structure and visual
mechanisms, we propose a biological network architecture comprising an Enhanced
Screening Module (ESM) and a novel Boundary-Aware Label (BAL) for superpixel
segmentation. The ESM enhances semantic information by simulating the
interactive projection mechanisms of the visual cortex. Additionally, the BAL
emulates the spatial frequency characteristics of visual cortical cells to
facilitate the generation of superpixels with strong boundary adherence. We
demonstrate the effectiveness of our approach through evaluations on both the
BSDS500 dataset and the NYUv2 dataset.
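The abstract's notion of boundary adherence can be made concrete with a small sketch. The code below is not the paper's method or its evaluation code; it is a minimal, pure-Python illustration of boundary recall, a measure commonly used for superpixels, under the assumption that label maps are given as nested lists of integers. The function names `boundary_pixels` and `boundary_recall` and the tolerance parameter `tol` are hypothetical names chosen for this example.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]


def boundary_pixels(labels: Grid) -> Set[Tuple[int, int]]:
    """Return pixels whose right or lower neighbour carries a different label."""
    h, w = len(labels), len(labels[0])
    edges: Set[Tuple[int, int]] = set()
    for r in range(h):
        for c in range(w):
            if c + 1 < w and labels[r][c] != labels[r][c + 1]:
                edges.add((r, c))
            if r + 1 < h and labels[r][c] != labels[r + 1][c]:
                edges.add((r, c))
    return edges


def boundary_recall(superpixels: Grid, ground_truth: Grid, tol: int = 1) -> float:
    """Fraction of ground-truth boundary pixels that have a superpixel
    boundary pixel within `tol` pixels (square neighbourhood)."""
    sp_edges = boundary_pixels(superpixels)
    gt_edges = boundary_pixels(ground_truth)
    if not gt_edges:
        return 1.0  # nothing to recall, treated as perfect by convention here
    hits = 0
    for r, c in gt_edges:
        if any(
            (r + dr, c + dc) in sp_edges
            for dr in range(-tol, tol + 1)
            for dc in range(-tol, tol + 1)
        ):
            hits += 1
    return hits / len(gt_edges)


# Toy 4x4 example: the object boundary runs between columns 1 and 2.
ground_truth = [[0, 0, 1, 1] for _ in range(4)]
# Superpixels that follow the object boundary (plus one extra horizontal cut).
aligned = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
# Superpixels whose only cut is one column off the object boundary.
misaligned = [[0, 1, 1, 1] for _ in range(4)]

recall_aligned = boundary_recall(aligned, ground_truth, tol=0)
recall_misaligned = boundary_recall(misaligned, ground_truth, tol=0)
```

With zero tolerance, the aligned over-segmentation recovers every ground-truth boundary pixel while the shifted one recovers none, which is the kind of difference that boundary adherence is meant to capture.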
Related papers
- FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing [10.81951503398909]
Factorized Self-Attention Module (FSAM) computes multidimensional attention from voxel embeddings using nonnegative matrix factorization.
Our approach adeptly factorizes voxel embeddings to achieve comprehensive spatial, temporal, and channel attention, enhancing performance of generic signal extraction.
FactorizePhys is an end-to-end 3D-CNN architecture for estimating blood volume pulse signals from raw video frames.
arXiv Detail & Related papers (2024-11-03T12:22:58Z)
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- Generalizable Entity Grounding via Assistance of Large Language Model [77.07759442298666]
We propose a novel approach to densely ground visual entities from a long caption.
We leverage a large multimodal model to extract semantic nouns, a class-agnostic segmentation model to generate entity-level segmentation, and a multi-modal feature fusion module to associate each semantic noun with its corresponding segmentation mask.
arXiv Detail & Related papers (2024-02-04T16:06:05Z)
- Spatial Structure Constraints for Weakly Supervised Semantic Segmentation [100.0316479167605]
A class activation map (CAM) can only locate the most discriminative part of objects.
We propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.
Our approach achieves 72.7% and 47.0% mIoU on the PASCAL VOC 2012 and COCO datasets, respectively.
arXiv Detail & Related papers (2024-01-20T05:25:25Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- MDFL: Multi-domain Diffusion-driven Feature Learning [19.298491870280213]
We present a multi-domain diffusion-driven feature learning network (MDFL).
MDFL redefines the effective information domain that the model really focuses on.
We demonstrate that MDFL significantly improves the feature extraction performance of high-dimensional data.
arXiv Detail & Related papers (2023-11-16T02:55:21Z)
- Laplacian-Former: Overcoming the Limitations of Vision Transformers in Local Texture Detection [3.784298636620067]
Vision Transformer (ViT) models have demonstrated a breakthrough in a wide range of computer vision tasks.
These models struggle to capture high-frequency components of images, which can limit their ability to detect local textures and edge information.
We propose a new technique, Laplacian-Former, that enhances the self-attention map by adaptively re-calibrating the frequency information in a Laplacian pyramid.
arXiv Detail & Related papers (2023-08-31T19:56:14Z)
- SAWU-Net: Spatial Attention Weighted Unmixing Network for Hyperspectral Images [91.20864037082863]
We propose a spatial attention weighted unmixing network, dubbed SAWU-Net, which learns a spatial attention network and a weighted unmixing network in an end-to-end manner.
In particular, we design a spatial attention module, which consists of a pixel attention block and a window attention block to efficiently model pixel-based spectral information and patch-based spatial information.
Experimental results on real and synthetic datasets demonstrate the superior accuracy of SAWU-Net.
arXiv Detail & Related papers (2023-04-22T05:22:50Z)
- Semantic-aware Texture-Structure Feature Collaboration for Underwater Image Enhancement [58.075720488942125]
Underwater image enhancement has become an attractive topic as a significant technology in marine engineering and aquatic robotics.
We develop an efficient and compact enhancement network in collaboration with a high-level semantic-aware pretrained model.
We also apply the proposed algorithm to the underwater salient object detection task to reveal the favorable semantic-aware ability for high-level vision tasks.
arXiv Detail & Related papers (2022-11-19T07:50:34Z)
- Rethinking Unsupervised Neural Superpixel Segmentation [6.123324869194195]
Unsupervised learning for superpixel segmentation via CNNs has been studied.
We propose three key elements to improve the efficacy of such networks.
By experimenting with the BSDS500 dataset, we find evidence for the significance of our proposal.
arXiv Detail & Related papers (2022-06-21T09:30:26Z)
- Superpixel-based Refinement for Object Proposal Generation [3.1981440103815717]
We introduce a new superpixel-based refinement approach on top of the state-of-the-art object proposal system AttentionMask.
Our experiments show an improvement of up to 26.4% in terms of average recall compared to the original AttentionMask.
arXiv Detail & Related papers (2021-01-12T16:06:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.