Investigating Shift Equivalence of Convolutional Neural Networks in
Industrial Defect Segmentation
- URL: http://arxiv.org/abs/2309.16902v1
- Date: Fri, 29 Sep 2023 00:04:47 GMT
- Title: Investigating Shift Equivalence of Convolutional Neural Networks in
Industrial Defect Segmentation
- Authors: Zhen Qu, Xian Tao, Fei Shen, Zhengtao Zhang, Tao Li
- Abstract summary: In industrial defect segmentation tasks, output consistency (also referred to as equivalence) of the model is often overlooked.
A novel pair of down/upsampling layers called component attention polyphase sampling (CAPS) is proposed as a replacement for the conventional sampling layers in CNNs.
The experimental results on the micro surface defect (MSD) dataset and four real-world industrial defect datasets demonstrate that the proposed method exhibits higher equivalence and segmentation performance.
- Score: 3.843350895842836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In industrial defect segmentation tasks, while pixel accuracy and
Intersection over Union (IoU) are commonly employed metrics to assess
segmentation performance, the output consistency (also referred to as equivalence)
of the model is often overlooked. Even a small shift in the input image can
yield significant fluctuations in the segmentation results. Existing
methodologies primarily focus on data augmentation or anti-aliasing to enhance
the network's robustness against translational transformations, but their shift
equivalence performs poorly on the test set or is susceptible to nonlinear
activation functions. Additionally, the variations in boundaries resulting from
the translation of input images are consistently disregarded, thus imposing
further limitations on the shift equivalence. In response to this particular
challenge, a novel pair of down/upsampling layers called component attention
polyphase sampling (CAPS) is proposed as a replacement for the conventional
sampling layers in CNNs. To mitigate the effect of image boundary variations on
the equivalence, an adaptive windowing module is designed in CAPS to adaptively
filter out the border pixels of the image. Furthermore, a component attention
module is proposed to fuse all downsampled features to improve the segmentation
performance. The experimental results on the micro surface defect (MSD) dataset
and four real-world industrial defect datasets demonstrate that the proposed
method exhibits higher equivalence and segmentation performance compared to
other state-of-the-art methods. Our code will be available at
https://github.com/xiaozhen228/CAPS.
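The shift-equivalence problem described in the abstract can be illustrated with a minimal 1D sketch. This is not the CAPS layer: it swaps in a much simpler rule (keep the polyphase component with the largest L2 norm) purely to show why fixed-phase striding is not shift equivalent while a content-dependent polyphase choice can be. All function names are hypothetical, and circular shifts are assumed, which sidesteps the boundary variations that the adaptive windowing module is meant to address.

```python
import numpy as np


def strided_downsample(x, stride=2):
    """Conventional downsampling: always keep phase 0."""
    return x[::stride]


def polyphase_components(x, stride=2):
    """Split x into its `stride` polyphase components."""
    return [x[k::stride] for k in range(stride)]


def max_norm_downsample(x, stride=2):
    """Illustrative rule (an assumption, not CAPS): keep the
    polyphase component with the largest L2 norm."""
    comps = polyphase_components(x, stride)
    k = int(np.argmax([np.linalg.norm(c) for c in comps]))
    return comps[k]


def equal_up_to_roll(a, b):
    """True if b is some circular shift of a."""
    return any(np.allclose(np.roll(a, s), b) for s in range(len(a)))


rng = np.random.default_rng(0)
x = rng.standard_normal(16)
x_shifted = np.roll(x, 1)  # circular shift of the input by one sample

# Compare the downsampled outputs of the original and shifted inputs.
naive_consistent = equal_up_to_roll(
    strided_downsample(x), strided_downsample(x_shifted)
)
adaptive_consistent = equal_up_to_roll(
    max_norm_downsample(x), max_norm_downsample(x_shifted)
)
print("fixed-phase striding consistent:", naive_consistent)
print("max-norm polyphase consistent:", adaptive_consistent)
```

A one-sample shift moves the signal from one polyphase component to the other, so fixed-phase striding returns different samples, whereas the norm-based choice follows the content. With real images and non-circular translation, border pixels enter and leave the frame, so this toy guarantee no longer holds exactly.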
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z)
- Change-Aware Siamese Network for Surface Defects Segmentation under Complex Background [0.6407952035735353]
We propose a change-aware Siamese network that solves the defect segmentation in a change detection framework.
A novel multi-class balanced contrastive loss is introduced to guide the Transformer-based encoder.
The difference presented by a distance map is then skip-connected to the change-aware decoder to assist in the location of both inter-class and out-of-class pixel-wise defects.
arXiv Detail & Related papers (2024-09-01T02:48:11Z)
- Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling [14.731788603429774]
Downsampling operators break the shift invariance of convolutional neural networks (CNNs).
We propose a learnable pooling operator called Translation Invariant Polyphase Sampling (TIPS).
TIPS results in consistent performance gains in terms of accuracy, shift consistency, and shift fidelity.
arXiv Detail & Related papers (2024-04-11T00:49:38Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- Adapting the Hypersphere Loss Function from Anomaly Detection to Anomaly Segmentation [1.9458156037869137]
We propose an incremental improvement to Fully Convolutional Data Description (FCDD).
FCDD is an adaptation of the one-class classification approach from anomaly detection to image anomaly segmentation (a.k.a. anomaly localization).
We analyze its original loss function and propose a substitute that better resembles its predecessor, the Hypersphere (HSC)
arXiv Detail & Related papers (2023-01-23T18:06:35Z)
- Treatment Learning Causal Transformer for Noisy Image Classification [62.639851972495094]
In this work, we incorporate this binary information of "existence of noise" as treatment into image classification tasks to improve prediction accuracy.
Motivated by causal variational inference, we propose a transformer-based architecture that uses a latent generative model to estimate robust feature representations for noisy image classification.
We also create new noisy image datasets incorporating a wide range of noise factors for performance benchmarking.
arXiv Detail & Related papers (2022-03-29T13:07:53Z)
- Dispensed Transformer Network for Unsupervised Domain Adaptation [21.256375606219073]
A novel unsupervised domain adaptation (UDA) method named dispensed Transformer network (DTNet) is introduced in this paper.
Our proposed network achieves the best performance in comparison with several state-of-the-art techniques.
arXiv Detail & Related papers (2021-10-28T08:27:44Z)
- TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping RF for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
arXiv Detail & Related papers (2021-04-02T01:42:01Z)
- Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation [93.83369981759996]
We propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap.
Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation.
We propose consistency regularization on predicted CAMs from various transformed images to provide self-supervision for network learning.
arXiv Detail & Related papers (2020-04-09T14:57:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.