MLMA-Net: multi-level multi-attentional learning for multi-label object
detection in textile defect images
- URL: http://arxiv.org/abs/2102.00376v1
- Date: Sun, 31 Jan 2021 04:50:40 GMT
- Title: MLMA-Net: multi-level multi-attentional learning for multi-label object
detection in textile defect images
- Authors: Bing Wei (Student Member, IEEE), Kuangrong Hao (Member, IEEE), Lei Gao
(Member, IEEE)
- Abstract summary: This paper proposes a multi-level, multi-attentional deep learning network (MLMA-Net) to detect multi-label defects in textile images.
The results show that the network extracts more distinctive features and has better performance than the state-of-the-art approaches on the real-world industrial dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For the sake of recognizing and classifying textile defects, deep
learning-based methods have been proposed and achieved remarkable success in
single-label textile images. However, detecting multi-label defects in a
textile image remains challenging due to the coexistence of multiple defects
and small-size defects. To address these challenges, a multi-level,
multi-attentional deep learning network (MLMA-Net) is proposed and built to 1)
increase the feature representation ability to detect small-size defects; 2)
generate a discriminative representation that maximizes the capability of
attending the defect status, which leverages higher-resolution feature maps for
multiple defects. Moreover, a multi-label object detection dataset (DHU-ML1000)
in textile defect images is built to verify the performance of the proposed
model. The results demonstrate that the network extracts more distinctive
features and has better performance than the state-of-the-art approaches on the
real-world industrial dataset.
Related papers
- Change-Aware Siamese Network for Surface Defects Segmentation under Complex Background [0.6407952035735353]
We propose a change-aware Siamese network that solves the defect segmentation in a change detection framework.
A novel multi-class balanced contrastive loss is introduced to guide the Transformer-based encoder.
The difference presented by a distance map is then skip-connected to the change-aware decoder to assist in the location of both inter-class and out-of-class pixel-wise defects.
arXiv Detail & Related papers (2024-09-01T02:48:11Z)
- Multi-label Sewer Pipe Defect Recognition with Mask Attention Feature Enhancement and Label Correlation Learning [5.9184143707401775]
Multi-label pipe defect recognition is proposed based on mask attention guided feature enhancement and label correlation learning.
The proposed method can achieve current approximate state-of-the-art classification performance using just 1/16 of the Sewer-ML training dataset.
arXiv Detail & Related papers (2024-08-01T11:51:50Z)
- Looking for Tiny Defects via Forward-Backward Feature Transfer [12.442574943138794]
We introduce a novel benchmark that evaluates methods on the original, high-resolution image and ground-truth masks.
Our benchmark includes a metric that captures robustness with respect to defect size.
Our proposal features the highest robustness to defect size, runs at the fastest speed and yields state-of-the-art segmentation performance.
arXiv Detail & Related papers (2024-07-04T17:59:26Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Improving Vision Anomaly Detection with the Guidance of Language
Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view.
We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue.
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z)
- Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for
Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels.
This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z)
- Few-Shot Defect Image Generation via Defect-Aware Feature Manipulation [19.018561017953957]
We propose the first defect image generation method in the challenging few-shot cases.
Our method consists of two training stages. First, we train a data-efficient StyleGAN2 on defect-free images as the backbone.
Second, we attach defect-aware residual blocks to the backbone, which learn to produce reasonable defect masks.
arXiv Detail & Related papers (2023-03-04T11:43:08Z)
- High-resolution Iterative Feedback Network for Camouflaged Object
Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- Computer Vision and Normalizing Flow Based Defect Detection [0.0]
We present a two-stage defect detection network based on the object detection model YOLO, and the normalizing flow-based defect detection model DifferNet.
Our model has high robustness and performance on defect detection using real-world video clips taken from a production line monitoring system.
Our proposed model can learn on a small number of defect-free samples of single or multiple product types.
arXiv Detail & Related papers (2020-12-12T05:38:21Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)