Joint Attention-Guided Feature Fusion Network for Saliency Detection of
Surface Defects
- URL: http://arxiv.org/abs/2402.02797v1
- Date: Mon, 5 Feb 2024 08:10:16 GMT
- Title: Joint Attention-Guided Feature Fusion Network for Saliency Detection of
Surface Defects
- Authors: Xiaoheng Jiang, Feng Yan, Yang Lu, Ke Wang, Shuai Guo, Tianzhu Zhang,
Yanwei Pang, Jianwei Niu, and Mingliang Xu
- Abstract summary: We propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network.
JAFFNet mainly incorporates a joint attention-guided feature fusion (JAFF) module into decoding stages to adaptively fuse low-level and high-level features.
Experiments conducted on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance in comparison with other state-of-the-art methods.
- Score: 69.39099029406248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Surface defect inspection plays an important role in industrial
manufacturing and production. Although Convolutional Neural Network
(CNN)-based defect inspection methods have made great strides, they still
face challenges such as defect scale variation, complex backgrounds, and low
contrast. To address these issues, we propose a joint
attention-guided feature fusion network (JAFFNet) for saliency detection of
surface defects based on the encoder-decoder network. JAFFNet mainly
incorporates a joint attention-guided feature fusion (JAFF) module into
decoding stages to adaptively fuse low-level and high-level features. The JAFF
module learns to emphasize defect features and suppress background noise during
feature fusion, which is beneficial for detecting low-contrast defects. In
addition, JAFFNet introduces a dense receptive field (DRF) module following the
encoder to capture features with rich context information, which helps detect
defects of different scales. The JAFF module guides feature fusion with a
learned joint channel-spatial attention map derived from high-level semantic
features, directing the model's attention toward defect
features. The DRF module employs a sequence of multi-receptive-field (MRF)
units, each of which takes as input all preceding MRF feature maps together
with the original input. The resulting DRF features capture rich context
information over a large range of receptive fields. Extensive experiments on
SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves
promising performance in comparison with other state-of-the-art methods.
Meanwhile, our method reaches a real-time defect detection speed of 66 FPS.
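The joint attention-guided fusion described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the paper's implementation: the attention sub-networks are reduced to a single hypothetical learned matrix and a scalar gain, and the feature maps are simplified to a single (C, H, W) array per level.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def jaff_fuse(low, high, w_channel, w_spatial):
    """Toy sketch of joint attention-guided fusion: a channel attention
    vector and a spatial attention map, both derived from the high-level
    features, gate the low-level features before they are added to the
    high-level ones. low/high: (C, H, W); w_channel: (C, C) hypothetical
    learned weights; w_spatial: hypothetical scalar gain."""
    # Channel attention from global average pooling of high-level features.
    gap = high.mean(axis=(1, 2))                     # (C,)
    channel_attn = sigmoid(w_channel @ gap)          # (C,)
    # Spatial attention from the channel-wise mean of high-level features.
    spatial_attn = sigmoid(w_spatial * high.mean(axis=0))  # (H, W)
    # Joint channel-spatial attention map gates the low-level features,
    # emphasizing defect regions and suppressing background noise.
    joint = channel_attn[:, None, None] * spatial_attn[None, :, :]
    return joint * low + high
```

Because the joint attention values lie in (0, 1), background activations in the low-level features are attenuated rather than passed through unchanged, which matches the abstract's motivation for detecting low-contrast defects.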
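Likewise, the DRF module's dense wiring, in which each MRF unit receives the original input together with the outputs of all preceding units, can be illustrated with a toy 1-D sketch. The dilated averaging filter below is an assumed stand-in for the paper's dilated convolutions; only the dense-connectivity pattern is taken from the abstract.

```python
import numpy as np

def mrf_unit(x, dilation):
    """Stand-in for a multi-receptive-field unit: a 3-tap dilated
    averaging filter, so each unit covers a different receptive field.
    x: (C, L) feature map; output has the same shape."""
    pad = dilation
    xp = np.pad(x, ((0, 0), (pad, pad)), mode="edge")
    L = x.shape[1]
    return (xp[:, :L] + xp[:, pad:pad + L] + xp[:, 2 * pad:]) / 3.0

def dense_receptive_field(x, dilations=(1, 2, 4)):
    """Sketch of the DRF module's dense wiring: each MRF unit takes as
    input the original features concatenated (along the channel axis)
    with the outputs of all preceding MRF units, and the final DRF
    output concatenates everything."""
    feats = [x]
    for d in dilations:
        feats.append(mrf_unit(np.concatenate(feats, axis=0), d))
    return np.concatenate(feats, axis=0)
```

With three units, the channel count grows as C, C, 2C, 4C, so a (2, 10) input yields a (16, 10) output whose later channels aggregate progressively larger receptive fields.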
Related papers
- Global Context Aggregation Network for Lightweight Saliency Detection of
Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
Experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - CINFormer: Transformer network with multi-stage CNN feature injection
for surface defect segmentation [73.02218479926469]
We propose a transformer network with multi-stage CNN feature injection for surface defect segmentation.
CINFormer presents a simple yet effective feature integration mechanism that injects the multi-level CNN features of the input image into different stages of the transformer network in the encoder.
In addition, CINFormer presents a Top-K self-attention module to focus on tokens with more important information about the defects.
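As a rough illustration of the Top-K idea (not CINFormer's actual implementation), the following sketch keeps only the top-k attention scores per query and masks the rest before the softmax, so attention concentrates on the most informative tokens; the shapes and scaling are generic assumptions.

```python
import numpy as np

def topk_self_attention(q, k, v, topk=2):
    """Toy Top-K self-attention step: per query, only the topk largest
    scores survive; the rest are set to -inf before the softmax.
    q, k, v: (N, D) token matrices."""
    scores = q @ k.T / np.sqrt(q.shape[1])            # (N, N)
    # Per-row threshold: the topk-th largest score.
    thresh = np.sort(scores, axis=1)[:, -topk][:, None]
    masked = np.where(scores >= thresh, scores, -np.inf)
    # Numerically stable softmax over the surviving scores.
    weights = np.exp(masked - masked.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights
```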
arXiv Detail & Related papers (2023-09-22T06:12:02Z) - StofNet: Super-resolution Time of Flight Network [8.395656453902685]
Time of Flight (ToF) is a prevalent depth sensing technology in the fields of robotics, medical imaging, and non-destructive testing.
This paper highlights the potential of modern super-resolution techniques to learn varying surroundings for a reliable and accurate ToF detection.
arXiv Detail & Related papers (2023-08-23T09:02:01Z) - Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.
Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.
Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z) - MSFA-Frequency-Aware Transformer for Hyperspectral Images Demosaicing [15.847332787718852]
This paper proposes a novel demosaicing framework, the MSFA-frequency-aware Transformer network (FDM-Net).
The advantage of Maformer is that it can leverage the MSFA information and non-local dependencies present in the data.
Our experimental results demonstrate that FDM-Net outperforms state-of-the-art methods by 6 dB in PSNR and successfully reconstructs high-fidelity details.
arXiv Detail & Related papers (2023-03-23T16:27:30Z) - LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray
Image [7.970559381165446]
We propose a weld defect detection method based on a convolutional neural network (CNN), namely Lighter and Faster YOLO (LF-YOLO).
To improve the performance of detection network, we propose an efficient feature extraction (EFE) module.
Experimental results show that our weld defect network achieves satisfactory balance between performance and consumption, and reaches 92.9 mAP50 with 61.5 FPS.
arXiv Detail & Related papers (2021-10-28T12:19:32Z) - Learning Selective Mutual Attention and Contrast for RGB-D Saliency
Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models adopt feature fusion strategies but are limited by low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities.
arXiv Detail & Related papers (2020-10-12T08:50:10Z) - Coupled Convolutional Neural Network with Adaptive Response Function
Learning for Unsupervised Hyperspectral Super-Resolution [28.798775822331045]
Hyperspectral super-resolution refers to fusing a hyperspectral image (HSI) and a multispectral image (MSI) to generate an image with both high spatial and high spectral resolution.
In this work, we propose HyCoNet, an unsupervised deep learning-based fusion method that performs HSI-MSI fusion without prior point spread function (PSF) and spectral response function (SRF) information.
arXiv Detail & Related papers (2020-07-28T06:17:02Z) - iffDetector: Inference-aware Feature Filtering for Object Detection [70.8678270164057]
We introduce a generic Inference-aware Feature Filtering (IFF) module that can easily be combined with modern detectors.
IFF performs closed-loop optimization by leveraging high-level semantics to enhance the convolutional features.
IFF can be fused with CNN-based object detectors in a plug-and-play manner with negligible computational cost overhead.
arXiv Detail & Related papers (2020-06-23T02:57:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.