Prototype Learning-Based Few-Shot Segmentation for Low-Light Crack on Concrete Structures
- URL: http://arxiv.org/abs/2601.13059v1
- Date: Mon, 19 Jan 2026 13:48:26 GMT
- Title: Prototype Learning-Based Few-Shot Segmentation for Low-Light Crack on Concrete Structures
- Authors: Yulun Guo,
- Abstract summary: Crack detection is critical for concrete infrastructure safety, but real-world cracks often appear in low-light environments like tunnels and bridge undersides. We propose a dual-branch prototype learning network integrating Retinex theory with few-shot learning for low-light crack segmentation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crack detection is critical for concrete infrastructure safety, but real-world cracks often appear in low-light environments like tunnels and bridge undersides, degrading computer vision segmentation accuracy. Pixel-level annotation of low-light crack images is extremely time-consuming, yet most deep learning methods require large, well-illuminated datasets. We propose a dual-branch prototype learning network integrating Retinex theory with few-shot learning for low-light crack segmentation. Retinex-based reflectance components guide illumination-invariant global representation learning, while metric learning reduces dependence on large annotated datasets. We introduce a cross-similarity prior mask generation module that computes high-dimensional similarities between query and support features to capture crack location and structure, and a multi-scale feature enhancement module that fuses multi-scale features with the prior mask to alleviate spatial inconsistency. Extensive experiments on multiple benchmarks demonstrate consistent state-of-the-art performance under low-light conditions. Code: https://github.com/YulunGuo/CrackFSS.
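The cross-similarity prior mask is the prototype-learning step at the core of the method: support features are pooled under the support mask into a crack prototype, and each query pixel is scored by its similarity to that prototype. A minimal PyTorch sketch of this standard recipe follows; the function and variable names are illustrative, and the released code linked above should be treated as authoritative.

```python
# Minimal sketch of a prototype-based prior mask using masked average pooling
# and cosine similarity; names and shapes are illustrative, not the authors'
# released implementation (see the linked repository for that).
import torch
import torch.nn.functional as F

def prior_mask(query_feat, support_feat, support_mask):
    """query_feat/support_feat: (B, C, H, W); support_mask: float (B, 1, H, W) in {0, 1}."""
    # 1. Crack prototype: masked average pooling over support features.
    mask = F.interpolate(support_mask, size=support_feat.shape[-2:], mode="nearest")
    proto = (support_feat * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)

    # 2. Cosine similarity between every query pixel and the prototype.
    q = F.normalize(query_feat, dim=1)                # (B, C, H, W)
    p = F.normalize(proto, dim=1)[..., None, None]    # (B, C, 1, 1)
    sim = (q * p).sum(dim=1, keepdim=True)            # (B, 1, H, W) in [-1, 1]

    # 3. Min-max normalise per image to obtain a [0, 1] prior mask.
    flat = sim.flatten(2)
    lo, hi = flat.min(-1, keepdim=True)[0], flat.max(-1, keepdim=True)[0]
    prior = (flat - lo) / (hi - lo + 1e-6)
    return prior.view_as(sim)
```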
Related papers
- Unleashing Degradation-Carrying Features in Symmetric U-Net: Simpler and Stronger Baselines for All-in-One Image Restoration [52.82397287366076]
All-in-one image restoration aims to handle diverse degradations (e.g., noise, blur, adverse weather) within a unified framework. In this work, we reveal a critical insight: well-crafted feature extraction inherently encodes degradation-carrying information. Our symmetric design preserves intrinsic degradation signals robustly, making simple additive fusion in the skip connections sufficient.
arXiv Detail & Related papers (2025-12-11T12:20:31Z) - Manifold-aware Representation Learning for Degradation-agnostic Image Restoration [135.90908995927194]
Image Restoration (IR) aims to recover high-quality images from degraded inputs affected by various corruptions such as noise, blur, haze, rain, and low-light conditions. We present MIRAGE, a unified framework for all-in-one IR that explicitly decomposes the input feature space into three semantically aligned parallel branches. This modular decomposition significantly improves generalization and efficiency across diverse degradations.
arXiv Detail & Related papers (2025-05-24T12:52:10Z) - SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures [29.224360412743454]
We propose a lightweight Structure-Aware Vision Mamba Network (SCSegamba) to generate high-quality pixel-level segmentation maps. Specifically, we developed a Structure-Aware Visual State Space module (SAVSS), which incorporates a lightweight Gated Bottleneck Convolution (GBC) and a Structure-Aware Scanning Strategy (SASS). Experiments on crack benchmark datasets demonstrate that our method outperforms other state-of-the-art (SOTA) methods, achieving the highest performance with only 2.8M parameters.
arXiv Detail & Related papers (2025-03-03T02:40:57Z) - CrackSCF: Lightweight Cascaded Fusion Network for Robust and Efficient Structural Crack Segmentation [36.93774494071781]
CrackSCF is a lightweight Cascaded Fusion Crack Network designed to achieve robust crack segmentation. This approach efficiently captures local patterns while operating with a minimal computational footprint. The experimental results show that CrackSCF consistently outperforms existing methods.
arXiv Detail & Related papers (2024-08-23T03:21:51Z) - CrackNex: a Few-shot Low-light Crack Segmentation Model Based on Retinex Theory for UAV Inspections [9.27355428681897]
CrackNex is a framework that utilizes reflectance information based on Retinex Theory to help the model learn a unified illumination-invariant representation.
We present the first benchmark dataset, LCSD, for low-light crack segmentation. LCSD consists of 102 well-illuminated crack images and 41 low-light crack images.
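For readers unfamiliar with the Retinex component used by CrackNex and by the paper above, the sketch below shows the classical single-scale Retinex approximation of a reflectance map (log of the image minus log of a smoothed illumination estimate). Both papers learn their decompositions, so this is only the underlying idea, with illustrative parameter choices.

```python
# Single-scale Retinex sketch for an (approximately) illumination-invariant
# reflectance map; kernel size and sigma are illustrative defaults.
import torch
from torchvision.transforms.functional import gaussian_blur

def reflectance(img, kernel_size=31, sigma=15.0, eps=1e-6):
    """img: (B, 3, H, W) in [0, 1]. Returns a log-domain reflectance map."""
    # Smooth illumination estimate via Gaussian blurring.
    illumination = gaussian_blur(img, kernel_size=[kernel_size, kernel_size],
                                 sigma=[sigma, sigma])
    # log(I) - log(L): removes the slowly varying illumination component.
    return torch.log(img + eps) - torch.log(illumination + eps)
```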
arXiv Detail & Related papers (2024-03-05T15:52:54Z) - Single Image Reflection Separation via Component Synergy [14.57590565534889]
Reflection superposition is a complex and widespread phenomenon in the real world.
We propose a more general form of the superposition model by introducing a learnable residue term.
To fully capitalize on its advantages, we further design the network architecture carefully.
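A rough reading of the "learnable residue term" is sketched below: the classical superposition model I = T + R is relaxed to a weighted sum plus a small learned residue network. The coefficients and the residue network are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch of a superposition model with a learnable residue term:
# I ~ a*T + b*R + Phi(T, R). The coefficients and the residue network Phi
# are illustrative only; the classical model assumes I = T + R.
import torch
import torch.nn as nn

class ResidueSuperposition(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.ones(1))
        self.phi = nn.Sequential(
            nn.Conv2d(2 * channels, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, transmission, reflection):
        # Learned residue captures what the linear superposition misses.
        residue = self.phi(torch.cat([transmission, reflection], dim=1))
        return self.a * transmission + self.b * reflection + residue
```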
arXiv Detail & Related papers (2023-08-19T14:25:27Z) - Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images captured in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce a bilevel paradigm to model this latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
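The bilevel idea can be illustrated with a generic first-order adaptation step: an inner update fits a scene-specific copy of the decoder, and an outer update trains the shared, scene-irrelevant encoder on held-out frames from the same scene. This is a standard meta-learning pattern, not the paper's exact algorithm; all names below are illustrative.

```python
# First-order bilevel sketch: the inner step adapts a throwaway decoder copy
# to one scene, the outer step updates only the shared encoder.
import copy
import torch

def bilevel_step(encoder, decoder, scene_train, scene_val, loss_fn,
                 inner_lr, outer_opt):
    x, y = scene_train
    xv, yv = scene_val

    # Inner level: scene-specific adaptation of a decoder copy.
    fast = copy.deepcopy(decoder)
    inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    inner_loss = loss_fn(fast(encoder(x).detach()), y)
    inner_opt.zero_grad()
    inner_loss.backward()
    inner_opt.step()

    # Outer level: evaluate the adapted decoder on held-out frames and
    # update the shared encoder (outer_opt holds only encoder parameters).
    outer_loss = loss_fn(fast(encoder(xv)), yv)
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()
    return outer_loss.item()
```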
arXiv Detail & Related papers (2023-06-02T08:16:21Z) - Multi-spectral Class Center Network for Face Manipulation Detection and Localization [52.569170436393165]
We propose a novel Multi-Spectral Class Center Network (MSCCNet) for face manipulation detection and localization.
Based on the features of different frequency bands, the MSCC module collects multi-spectral class centers and computes pixel-to-class relations.
Applying multi-spectral class-level representations suppresses the semantic information of visual concepts that is insensitive to the manipulated regions of forged images.
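The class-center mechanism can be sketched as masked pooling of features per class followed by pixel-to-class similarity maps. The multi-spectral (frequency-band) splitting that MSCCNet applies before this step is omitted here, so treat the snippet only as the generic class-center computation.

```python
# Generic class-center pooling and pixel-to-class relations; the
# frequency-band decomposition used by MSCCNet is omitted for brevity.
import torch
import torch.nn.functional as F

def pixel_to_class_relations(feat, mask, num_classes):
    """feat: (B, C, H, W); mask: (B, H, W) with integer class ids."""
    B, C, H, W = feat.shape
    onehot = F.one_hot(mask.long(), num_classes).permute(0, 3, 1, 2).float()  # (B, K, H, W)

    # Class centers: average feature of the pixels belonging to each class.
    centers = torch.einsum("bchw,bkhw->bkc", feat, onehot)
    centers = centers / onehot.sum(dim=(2, 3)).unsqueeze(-1).clamp(min=1.0)

    # Pixel-to-class relation: cosine similarity of each pixel to each center.
    flat = F.normalize(feat.flatten(2), dim=1)       # (B, C, H*W)
    centers = F.normalize(centers, dim=-1)           # (B, K, C)
    return torch.einsum("bkc,bcn->bkn", centers, flat).view(B, num_classes, H, W)
```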
arXiv Detail & Related papers (2023-05-18T08:09:20Z) - Toward Fast, Flexible, and Robust Low-Light Image Enhancement [87.27326390675155]
We develop a new Self-Calibrated Illumination (SCI) learning framework for fast, flexible, and robust brightening of images in real-world low-light scenarios.
To reduce the computational burden of the cascaded pattern, we construct a self-calibrated module that encourages convergence between the results of each stage.
We comprehensively explore SCI's inherent properties, including operation-insensitive adaptability and model-irrelevant generality.
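As background, the sketch below shows generic Retinex-style enhancement in the spirit of SCI: a small weight-shared network estimates illumination stage by stage, and the enhanced image is the input divided by that estimate. The actual self-calibrated module is not reproduced here; the network and hyperparameters are illustrative only.

```python
# Generic illumination-estimation enhancement sketch; this is background
# for SCI-style methods, not SCI's actual self-calibrated module.
import torch
import torch.nn as nn

class IlluminationEstimator(nn.Module):
    def __init__(self, channels=3, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x, stages=3, eps=1e-4):
        out = x
        for _ in range(stages):                    # weight-shared cascade
            illum = self.net(out).clamp(min=eps)   # per-pixel illumination in (0, 1]
            out = torch.clamp(x / illum, 0.0, 1.0) # Retinex-style brightening
        return out
```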
arXiv Detail & Related papers (2022-04-21T14:40:32Z) - PatchMVSNet: Patch-wise Unsupervised Multi-View Stereo for Weakly-Textured Surface Reconstruction [2.9896482273918434]
This paper proposes robust loss functions that leverage the constraints underlying multi-view images to alleviate matching ambiguity.
Our strategy can be implemented with arbitrary depth estimation frameworks and can be trained with arbitrary large-scale MVS datasets.
Our method matches the performance of state-of-the-art methods on popular benchmarks such as DTU, Tanks and Temples, and ETH3D.
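The multi-view constraint behind such unsupervised losses is, at its simplest, masked photometric consistency between the reference view and source views warped into it. In the sketch below, `warp_to_ref` is a hypothetical helper standing in for depth-based warping; it is not part of PatchMVSNet's released code.

```python
# Masked photometric consistency loss across views; `warp_to_ref` is a
# hypothetical helper that warps a source view into the reference frame
# using the predicted depth and returns a validity mask.
import torch

def photometric_loss(ref_img, src_imgs, warp_to_ref):
    """ref_img: (B, 3, H, W); src_imgs: list of (B, 3, H, W) source views."""
    total = 0.0
    for src in src_imgs:
        warped, valid = warp_to_ref(src)               # warped view + validity mask
        diff = (ref_img - warped).abs() * valid        # ignore occluded/out-of-view pixels
        total = total + diff.sum() / valid.sum().clamp(min=1.0)
    return total / max(len(src_imgs), 1)
```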
arXiv Detail & Related papers (2022-03-04T07:05:23Z) - Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement [58.72667941107544]
A typical framework is to simultaneously estimate the illumination and reflectance, but such approaches disregard the scene-level contextual information encapsulated in feature spaces.
We develop a new context-sensitive decomposition network architecture to exploit the scene-level contextual dependencies on spatial scales.
We develop a lightweight CSDNet (named LiteCSDNet) by reducing the number of channels.
arXiv Detail & Related papers (2021-12-09T06:25:30Z) - Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
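One common way to realize co-attention between low-level and high-level features is to let a channel descriptor from each stream gate the other before fusion, as sketched below; this is a generic formulation and not necessarily the exact module used in the paper.

```python
# Generic co-attention fusion: each stream's global channel descriptor
# gates the other stream; the gated features are then concatenated.
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    def __init__(self, c_low, c_high):
        super().__init__()
        self.gate_low = nn.Sequential(nn.Linear(c_high, c_low), nn.Sigmoid())
        self.gate_high = nn.Sequential(nn.Linear(c_low, c_high), nn.Sigmoid())

    def forward(self, low, high):
        """low: (B, C_low, H, W); high: (B, C_high, H, W), same spatial size."""
        low_desc = low.mean(dim=(2, 3))     # (B, C_low) global descriptor
        high_desc = high.mean(dim=(2, 3))   # (B, C_high) global descriptor
        low_att = low * self.gate_low(high_desc)[:, :, None, None]
        high_att = high * self.gate_high(low_desc)[:, :, None, None]
        return torch.cat([low_att, high_att], dim=1)
```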
arXiv Detail & Related papers (2021-11-03T17:40:32Z)