No-Service Rail Surface Defect Segmentation via Normalized Attention and
Dual-scale Interaction
- URL: http://arxiv.org/abs/2306.15442v1
- Date: Tue, 27 Jun 2023 12:58:16 GMT
- Title: No-Service Rail Surface Defect Segmentation via Normalized Attention and
Dual-scale Interaction
- Authors: Gongyang Li and Chengjun Han and Zhi Liu
- Abstract summary: No-service rail surface defect (NRSD) segmentation is an essential way to perceive the quality of no-service rails.
Existing natural image segmentation methods cannot achieve promising performance in NRSD images.
We propose a novel segmentation network for NRSDs based on Normalized Attention and Dual-scale Interaction, named NaDiNet.
- Score: 13.150295919228013
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: No-service rail surface defect (NRSD) segmentation is an essential way to
perceive the quality of no-service rails. However, due to the complex and
diverse outlines and low-contrast textures of no-service rails, existing
natural image segmentation methods cannot achieve promising performance in NRSD
images, especially in some unique and challenging NRSD scenes. To this end, in
this paper, we propose a novel segmentation network for NRSDs based on
Normalized Attention and Dual-scale Interaction, named NaDiNet. Specifically,
NaDiNet follows the enhancement-interaction paradigm. The Normalized
Channel-wise Self-Attention Module (NAM) and the Dual-scale Interaction Block
(DIB) are two key components of NaDiNet. NAM is a specific extension of the
channel-wise self-attention mechanism (CAM) to enhance features extracted from
low-contrast NRSD images. The softmax layer in CAM produces very small
correlation coefficients, which are not conducive to low-contrast feature
enhancement. Instead, in NAM, we directly calculate the normalized correlation
coefficient between channels to enlarge the feature differentiation. DIB is
specifically designed for the feature interaction of the enhanced features. It
has two interaction branches with dual scales, one for fine-grained clues and
the other for coarse-grained clues. With both branches working together, DIB
can perceive defect regions of different granularities. Together, these modules
enable NaDiNet to generate accurate segmentation maps. Extensive
experiments on the public NRSD-MN dataset with man-made and natural NRSDs
demonstrate that our proposed NaDiNet with various backbones (i.e., VGG,
ResNet, and DenseNet) consistently outperforms 10 state-of-the-art methods. The
code and results of our method are available at
https://github.com/monxxcn/NaDiNet.
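
The contrast the abstract draws between CAM and NAM can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the exact normalization, so the sketch assumes a Pearson-style correlation (each channel is centered and scaled to unit norm), and the residual connection in `enhance` is likewise an assumption. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cam_weights(feat):
    """Softmax-based channel attention weights for feat of shape (C, H, W)."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    energy = flat @ flat.T            # (C, C) channel-to-channel similarity
    return softmax(energy, axis=-1)   # each row sums to 1, so entries average 1/C


def normalized_channel_weights(feat, eps=1e-8):
    """Assumed NAM-style weights: normalized correlation between channels."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    flat = flat - flat.mean(axis=1, keepdims=True)                     # center each channel
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + eps)  # unit norm
    return flat @ flat.T              # (C, C), values lie in [-1, 1]


def enhance(feat, weights):
    """Re-weight channels and add the input back (assumed residual connection)."""
    c, h, w = feat.shape
    mixed = weights @ feat.reshape(c, -1)
    return feat + mixed.reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic low-contrast feature map: values cluster tightly around 0.5.
    feat = 0.5 + 0.01 * rng.standard_normal((64, 8, 8))
    cam = cam_weights(feat)
    nam = normalized_channel_weights(feat)
    print("largest softmax weight:      ", cam.max())
    print("largest |normalized| weight: ", np.abs(nam).max())
```

Because every softmax row sums to 1, the weights shrink as the channel count grows, whereas the normalized coefficients keep a fixed range regardless of the number of channels.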
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- NICE: Improving Panoptic Narrative Detection and Segmentation with
Cascading Collaborative Learning [77.95710025273218]
We propose a unified framework called NICE that can jointly learn two panoptic narrative recognition tasks.
By linking PNS and PND in series with the barycenter of segmentation as the anchor, our approach naturally aligns the two tasks.
NICE surpasses all existing methods by a large margin, achieving 4.1% for PND and 2.9% for PNS over the state-of-the-art.
arXiv Detail & Related papers (2023-10-17T03:42:12Z)
- Multi-Dimensional Refinement Graph Convolutional Network with Robust
Decouple Loss for Fine-Grained Skeleton-Based Action Recognition [19.031036881780107]
We propose a flexible attention block called Channel-Variable Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of spatial-temporal joints.
Based on CVSTA, we construct a Multi-Dimensional Refinement Graph Convolutional Network (MDR-GCN), which can improve the discrimination among channel-, joint- and frame-level features.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly boosts the effect of the CVSTA and reduces the impact of noise.
arXiv Detail & Related papers (2023-06-27T09:23:36Z)
- Towards Stable Co-saliency Detection and Object Co-segmentation [12.979401244603661]
We present a novel model for simultaneous stable co-saliency detection (CoSOD) and object co-segmentation (CoSEG).
We first propose a multi-path stable recurrent unit (MSRU), containing dummy orders mechanisms (DOM) and a recurrent unit (RU).
Our proposed MSRU not only helps the CoSOD (CoSEG) model capture robust inter-image relations, but also reduces order-sensitivity, resulting in a more stable inference and training process.
arXiv Detail & Related papers (2022-09-25T03:58:49Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes up to 2 orders of magnitude faster with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared
Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible modality and the infrared modality.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- RSI-Net: Two-Stream Deep Neural Network Integrating GCN and Atrous CNN
for Semantic Segmentation of High-resolution Remote Sensing Images [3.468780866037609]
A two-stream deep neural network for semantic segmentation of remote sensing images (RSI-Net) is proposed in this paper.
Experiments are implemented on the Vaihingen, Potsdam and Gaofen RSI datasets.
Results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient when compared with six state-of-the-art RSI semantic segmentation methods.
arXiv Detail & Related papers (2021-09-19T15:57:20Z)
- CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared
Person Re-Identification [102.89434996930387]
VI-ReID aims to match cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environments.
Existing works manually design various two-stream architectures to separately learn modality-specific and modality-sharable representations.
We propose a novel method, named Cross-Modality Neural Architecture Search (CM-NAS).
arXiv Detail & Related papers (2021-01-21T07:07:00Z)
- Accurate and Lightweight Image Super-Resolution with Model-Guided Deep
Unfolding Network [63.69237156340457]
We present and advocate an explainable approach toward SISR named model-guided deep unfolding network (MoG-DUN).
MoG-DUN is accurate (producing fewer aliasing artifacts), computationally efficient (with reduced model parameters), and versatile (capable of handling multiple degradations).
The superiority of the proposed MoG-DUN method to existing state-of-the-art image methods including RCAN, SRDNF, and SRFBN is substantiated by extensive experiments on several popular datasets and various degradation scenarios.
arXiv Detail & Related papers (2020-09-14T08:23:37Z)
- A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing
Images [30.39131853354783]
This paper presents a novel deep neural network structure for pixel-wise sea-land segmentation, a Residual Dense U-Net (RDU-Net).
RDU-Net is a combination of both down-sampling and up-sampling paths to achieve satisfactory results.
arXiv Detail & Related papers (2020-03-17T16:00:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.