CINFormer: Transformer network with multi-stage CNN feature injection
for surface defect segmentation
- URL: http://arxiv.org/abs/2309.12639v1
- Date: Fri, 22 Sep 2023 06:12:02 GMT
- Title: CINFormer: Transformer network with multi-stage CNN feature injection
for surface defect segmentation
- Authors: Xiaoheng Jiang, Kaiyi Guo, Yang Lu, Feng Yan, Hao Liu, Jiale Cao,
Mingliang Xu, and Dacheng Tao
- Abstract summary: We propose a transformer network with multi-stage CNN feature injection for surface defect segmentation.
CINFormer presents a simple yet effective feature integration mechanism that injects the multi-level CNN features of the input image into different stages of the transformer network in the encoder.
In addition, CINFormer presents a Top-K self-attention module to focus on tokens with more important information about the defects.
- Score: 73.02218479926469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Surface defect inspection is of great importance for industrial
manufacturing and production. Though defect inspection methods based on deep
learning have made significant progress, they still face challenges such as
weak defects that are hard to distinguish and defect-like interference in the
background. To address these issues, we propose a transformer network with
multi-stage CNN (Convolutional Neural Network) feature injection for surface
defect segmentation, a UNet-like structure named CINFormer. CINFormer presents
a simple yet effective feature integration mechanism that injects the
multi-level CNN features of the input image into different stages of the
transformer encoder. This preserves the merit of CNNs in capturing detailed
features and that of transformers in suppressing background noise, which
facilitates accurate defect detection. In addition, CINFormer presents a Top-K
self-attention module that focuses on the tokens carrying the most
defect-relevant information, further reducing the impact of the redundant
background. Extensive experiments conducted on the surface defect datasets
DAGM 2007, Magnetic tile, and NEU show that the proposed CINFormer achieves
state-of-the-art performance in defect detection.
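To make the two mechanisms above concrete, here is a minimal, hedged PyTorch
sketch. It is not the authors' code: the module names, the choice of adding
projected CNN features to the tokens, and the exact Top-K masking are
illustrative assumptions. The sketch shows one encoder stage that (a) injects
a CNN feature map into the transformer tokens of the matching stage and (b)
applies self-attention in which each query keeps only its K largest attention
scores, masking the rest before the softmax.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKSelfAttention(nn.Module):
    """Self-attention keeping only the top-K attention scores per query.

    Hypothetical sketch: logits outside each row's top-K are set to -inf
    before the softmax, so every token attends to at most K others and
    redundant background tokens are suppressed.
    """

    def __init__(self, dim: int, num_heads: int = 8, k: int = 16):
        super().__init__()
        self.num_heads, self.k = num_heads, k
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        H = self.num_heads
        q, k, v = self.qkv(x).reshape(B, N, 3, H, C // H).permute(2, 0, 3, 1, 4)
        attn = (q @ k.transpose(-2, -1)) / (C // H) ** 0.5  # (B, H, N, N)
        # Per-row threshold: the K-th largest logit of each query row.
        kth = attn.topk(min(self.k, N), dim=-1).values[..., -1:]
        attn = attn.masked_fill(attn < kth, float("-inf"))
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


class InjectedEncoderStage(nn.Module):
    """One encoder stage with CNN feature injection (illustrative layout).

    A feature map from the matching CNN backbone stage is projected to
    the token width with a 1x1 convolution, resized to the token grid,
    and added to the transformer tokens before Top-K self-attention.
    """

    def __init__(self, dim: int, cnn_channels: int, k: int = 16):
        super().__init__()
        self.inject = nn.Conv2d(cnn_channels, dim, kernel_size=1)
        self.norm = nn.LayerNorm(dim)
        self.attn = TopKSelfAttention(dim, k=k)

    def forward(self, tokens: torch.Tensor, cnn_feat: torch.Tensor) -> torch.Tensor:
        B, N, C = tokens.shape
        side = int(N ** 0.5)  # assumes a square patch grid
        feat = F.interpolate(self.inject(cnn_feat), size=(side, side))
        tokens = tokens + feat.flatten(2).transpose(1, 2)  # (B, N, C)
        return tokens + self.attn(self.norm(tokens))
```

With tokens of shape (B, N, C) from a square patch grid, the stage first adds
the projected CNN features and then attends; a small K (e.g. 16) forces each
token to ignore most of the background.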
Related papers
- Change-Aware Siamese Network for Surface Defects Segmentation under Complex Background [0.6407952035735353]
We propose a change-aware Siamese network that casts defect segmentation as a change detection problem.
A novel multi-class balanced contrastive loss is introduced to guide the Transformer-based encoder.
The difference, represented by a distance map, is then skip-connected to the change-aware decoder to assist in locating both inter-class and out-of-class defects at the pixel level.
arXiv Detail & Related papers (2024-09-01T02:48:11Z)
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel smooth regularization for fine-tuning that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects [69.39099029406248]
We propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects, built on an encoder-decoder network.
JAFFNet mainly incorporates a joint attention-guided feature fusion (JAFF) module into the decoding stages to adaptively fuse low-level and high-level features.
Experiments conducted on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance in comparison with other state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T08:10:16Z)
- Characterization of Magnetic Labyrinthine Structures Through Junctions and Terminals Detection Using Template Matching and CNN [0.0]
Defects in magnetic labyrinthine patterns, called junctions and terminals, serve as points of interest.
This study introduces a new technique called TM-CNN designed to detect a multitude of small objects in images.
TM-CNN achieved an impressive F1 score of 0.991, far outperforming traditional template matching and CNN-based object detection algorithms.
arXiv Detail & Related papers (2024-01-30T02:23:07Z)
- Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits; a hedged sketch of one possible form appears after this list.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
- DANet: Enhancing Small Object Detection through an Efficient Deformable Attention Network [0.0]
We propose a comprehensive strategy that synergizes Faster R-CNN with several cutting-edge methods.
By combining Faster R-CNN with a Feature Pyramid Network, we enable the model to handle the multi-scale features intrinsic to manufacturing environments.
A Deformable Net is used that adapts to the geometric variations of defects, bringing precision to the detection of even minuscule and complex features.
arXiv Detail & Related papers (2023-10-09T14:54:37Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment [62.074473976962835]
We show that with proper injection of local distortion features, a larger pretrained and fixed foundation model performs better in IQA tasks.
Specifically, because the vision transformer (ViT) lacks local distortion structure and inductive bias, we use another pretrained convolutional neural network (CNN).
We propose a local distortion extractor to obtain local distortion features from the pretrained CNN and a local distortion injector to inject these features into the ViT.
arXiv Detail & Related papers (2023-08-23T08:41:21Z)
- A New Knowledge Distillation Network for Incremental Few-Shot Surface Defect Detection [20.712532953953808]
This paper proposes a new knowledge distillation network, called the Dual Knowledge Align Network (DKAN).
The proposed DKAN method follows a pretraining-finetuning transfer learning paradigm, and a knowledge distillation framework is designed for the fine-tuning stage.
Experiments conducted on the incremental few-shot NEU-DET dataset show that DKAN outperforms other methods in various few-shot scenarios.
arXiv Detail & Related papers (2022-09-01T15:08:44Z)
- Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection [2.0999222360659604]
We propose an efficient hybrid transformer architecture, termed Defect Transformer (DefT), for surface defect detection.
DefT incorporates CNN and transformer into a unified model to capture local and non-local relationships collaboratively.
Experiments on three datasets demonstrate the superiority and efficiency of our method compared with other CNN- and transformer-based networks.
arXiv Detail & Related papers (2022-07-17T23:37:48Z)
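As referenced in the Cal-DETR entry above, here is a minimal sketch of one
plausible form of uncertainty-guided logit modulation. It assumes uncertainty
is estimated as the variance of each query's class logits across decoder
layers; the function name and this choice are illustrative assumptions, not
the paper's exact formulation.

```python
import torch


def uncertainty_guided_logit_modulation(layer_logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of uncertainty-guided logit modulation.

    layer_logits: (L, B, Q, C) class logits from L decoder layers, for
    B images, Q object queries, and C classes.
    Returns modulated logits of shape (B, Q, C) for the final layer.
    """
    # Estimate per-class uncertainty as the variance of the logits
    # across decoder layers (an assumption made for this sketch).
    uncertainty = layer_logits.var(dim=0)  # (B, Q, C)
    # Map uncertainty to a weight in (0, 1]: stable predictions keep
    # their confidence, unstable ones are pulled toward lower logits.
    weight = torch.exp(-uncertainty)
    return layer_logits[-1] * weight
```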