SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures
- URL: http://arxiv.org/abs/2503.01113v3
- Date: Sun, 23 Mar 2025 13:59:45 GMT
- Title: SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures
- Authors: Hui Liu, Chen Jia, Fan Shi, Xu Cheng, Shengyong Chen
- Abstract summary: We propose a lightweight Structure-Aware Vision Mamba Network (SCSegamba) to generate high-quality pixel-level segmentation maps. Specifically, we developed a Structure-Aware Visual State Space module (SAVSS), which incorporates a lightweight Gated Bottleneck Convolution (GBC) and a Structure-Aware Scanning Strategy (SASS). Experiments on crack benchmark datasets demonstrate that our method outperforms other state-of-the-art (SOTA) methods, achieving the highest performance with only 2.8M parameters.
- Score: 29.224360412743454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pixel-level segmentation of structural cracks across various scenarios remains a considerable challenge. Current methods struggle to model crack morphology and texture effectively and to balance segmentation quality against low computational resource usage. To overcome these limitations, we propose a lightweight Structure-Aware Vision Mamba Network (SCSegamba), capable of generating high-quality pixel-level segmentation maps by leveraging both the morphological information and texture cues of crack pixels with minimal computational cost. Specifically, we developed a Structure-Aware Visual State Space module (SAVSS), which incorporates a lightweight Gated Bottleneck Convolution (GBC) and a Structure-Aware Scanning Strategy (SASS). The key insight of GBC lies in its effectiveness in modeling the morphological information of cracks, while the SASS enhances the perception of crack topology and texture by strengthening the continuity of semantic information between crack pixels. Experiments on crack benchmark datasets demonstrate that our method outperforms other state-of-the-art (SOTA) methods, achieving the highest performance with only 2.8M parameters. On the multi-scenario dataset, our method reached 0.8390 in F1 score and 0.8479 in mIoU. The code is available at https://github.com/Karl1109/SCSegamba.
Related papers
- Structural and Statistical Texture Knowledge Distillation and Learning for Segmentation [70.15341084443236]
We re-emphasize the low-level texture information in deep networks for semantic segmentation and related knowledge distillation tasks.
We propose a novel Structural and Statistical Texture Knowledge Distillation (SSTKD) framework for semantic segmentation.
Specifically, Contourlet Decomposition Module (CDM) is introduced to decompose the low-level features.
Texture Intensity Equalization Module (TIEM) is designed to extract and enhance the statistical texture knowledge.
arXiv Detail & Related papers (2025-03-11T04:49:25Z)
- MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation [6.578088710294546]
Traditional segmentation methods struggle to address challenges such as high anatomical variability, blurred tissue boundaries, low organ contrast, and noise.
We propose MLLA-UNet (Mamba-Like Linear Attention UNet), a novel architecture that achieves linear computational complexity while maintaining high segmentation accuracy.
Experiments demonstrate that MLLA-UNet achieves state-of-the-art performance on six challenging datasets with 24 different segmentation tasks, including but not limited to FLARE22, AMOS CT, and ACDC, with an average DSC of 88.32%.
arXiv Detail & Related papers (2024-10-31T08:54:23Z)
- Topology-aware Mamba for Crack Segmentation in Structures [5.9184143707401775]
CrackMamba, a Mamba-based model, is designed for efficient and accurate crack segmentation for monitoring the structural health of infrastructure.
CrackMamba addresses these challenges by utilizing the VMambaV2 with pre-trained ImageNet-1k weights as the encoder and a newly designed decoder for better performance.
Experiments show that CrackMamba achieves state-of-the-art (SOTA) performance on the CrackSeg9k and SewerCrack datasets, and demonstrates competitive performance on the retinal vessel segmentation dataset CHASE_DB1.
arXiv Detail & Related papers (2024-10-25T15:17:52Z)
- EfficientCrackNet: A Lightweight Model for Crack Segmentation [1.3689715712707347]
Crack detection is crucial for maintaining the structural integrity of buildings, pavements, and bridges.
Existing lightweight methods often face challenges including computational inefficiency, complex crack patterns, and difficult backgrounds.
We propose EfficientCrackNet, a lightweight hybrid model combining Convolutional Neural Networks (CNNs) and transformers for precise crack segmentation.
arXiv Detail & Related papers (2024-09-26T17:44:20Z)
- Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure [52.2025114590481]
We introduce Hybrid-Segmentor, an encoder-decoder based approach that is capable of extracting both fine-grained local and global crack features.
This allows the model to improve its generalization capabilities in distinguishing various types of shapes, surfaces, and sizes of cracks.
The proposed model outperforms existing benchmark models across 5 quantitative metrics (accuracy 0.971, precision 0.804, recall 0.744, F1-score 0.770, and IoU score 0.630), achieving state-of-the-art status.
arXiv Detail & Related papers (2024-09-04T16:47:16Z)
- Staircase Cascaded Fusion of Lightweight Local Pattern Recognition and Long-Range Dependencies for Structural Crack Segmentation [28.157401919910914]
Existing methods struggle to integrate local textures and pixel dependencies of cracks.
We propose a lightweight convolutional block that substitutes all convolution operations, reducing the model's computational demands.
We develop a staircase cascaded fusion module, which seamlessly integrates local patterns and long-range dependencies.
arXiv Detail & Related papers (2024-08-23T03:21:51Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Improving Pixel-based MIM by Reducing Wasted Modeling Capability [77.99468514275185]
We propose a new method that explicitly utilizes low-level features from shallow layers to aid pixel reconstruction.
To the best of our knowledge, we are the first to systematically investigate multi-level feature fusion for isotropic architectures.
Our method yields significant performance gains, such as 1.2% on fine-tuning, 2.8% on linear probing, and 2.6% on semantic segmentation.
arXiv Detail & Related papers (2023-08-01T03:44:56Z)
- Infrastructure Crack Segmentation: Boundary Guidance Method and Benchmark Dataset [11.282003429161163]
This paper examines the inherent characteristics of cracks so as to introduce boundary features into crack identification.
It builds a boundary guidance crack segmentation model (BGCrack) with targeted structures and modules, including a high frequency module.
This paper provides a steel crack dataset that establishes a unified and fair benchmark for the identification of steel cracks.
arXiv Detail & Related papers (2023-06-15T15:25:53Z)
- Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development, including pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)
- Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z)
- Retinal Image Segmentation with a Structure-Texture Demixing Network [62.69128827622726]
The complex structure and texture information are mixed in a retinal image, and distinguishing the information is difficult.
Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance.
We propose a segmentation strategy that seeks to separate structure and texture components and significantly improve the performance.
arXiv Detail & Related papers (2020-07-15T12:19:03Z)