ISP-AD: A Large-Scale Real-World Dataset for Advancing Industrial Anomaly Detection with Synthetic and Real Defects
- URL: http://arxiv.org/abs/2503.04997v3
- Date: Fri, 19 Sep 2025 16:07:12 GMT
- Title: ISP-AD: A Large-Scale Real-World Dataset for Advancing Industrial Anomaly Detection with Synthetic and Real Defects
- Authors: Paul J. Krassnig, Dieter P. Gruber
- Abstract summary: ISP-AD is the largest publicly available industrial dataset to date, including both synthetic and real defects collected directly from the factory floor. Experiments show that even a small amount of injected, weakly labeled real defects improves generalization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic visual inspection using machine learning plays a key role in achieving zero-defect policies in industry. Research on anomaly detection is constrained by the availability of datasets that capture complex defect appearances and imperfect imaging conditions, which are typical of production processes. Recent benchmarks indicate that most publicly available datasets are biased towards optimal imaging conditions, leading to an overestimation of their applicability in real-world industrial scenarios. To address this gap, we introduce the Industrial Screen Printing Anomaly Detection Dataset (ISP-AD). It presents challenging small and weakly contrasted surface defects embedded within structured patterns exhibiting high permitted design variability. To the best of our knowledge, it is the largest publicly available industrial dataset to date, including both synthetic and real defects collected directly from the factory floor. Beyond benchmarking recent unsupervised anomaly detection methods, experiments on a mixed supervised training strategy, incorporating both synthesized and real defects, were conducted. Experiments show that even a small amount of injected, weakly labeled real defects improves generalization. Furthermore, starting from training on purely synthetic defects, emerging real defective samples can be efficiently integrated into subsequent scalable training. Overall, our findings indicate that model-free synthetic defects can provide a cold-start baseline, whereas a small number of injected real defects refine the decision boundary for previously unseen defect characteristics. The presented unsupervised and supervised dataset splits are designed to emphasize research on unsupervised, self-supervised, and supervised approaches, enhancing their applicability to industrial settings.
Related papers
- Evaluating Anomaly Detectors for Simulated Highly Imbalanced Industrial Classification Problems [1.376408511310322]
This paper presents a comprehensive evaluation of anomaly detection algorithms using a problem-agnostic simulated dataset.
We benchmark 14 detectors across training datasets with anomaly rates between 0.05% and 20% and training sizes between 1 000 and 10 000.
Our findings reveal that the best detector is highly dependent on the total number of faulty examples in the training dataset.
arXiv Detail & Related papers (2025-12-07T03:49:54Z) - Zero-Shot Multi-Criteria Visual Quality Inspection for Semi-Controlled Industrial Environments via Real-Time 3D Digital Twin Simulation [5.0268543063681195]
We propose a pose-agnostic, zero-shot quality inspection framework that compares real scenes against real-time Digital Twins (DT) in the RGB-D space.
Our approach enables efficient real-time DT rendering by semantically describing industrial scenes through object detection and pose estimation.
Based on an automotive use case featuring the quality inspection of an axial flux motor, we demonstrate the effectiveness of our framework.
arXiv Detail & Related papers (2025-11-28T14:19:31Z) - Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time [60.341117019125214]
We propose a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in graph anomaly detection (GAD).
To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level.
Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
arXiv Detail & Related papers (2025-11-10T12:10:05Z) - Unsupervised Learning for Industrial Defect Detection: A Case Study on Shearographic Data [0.0]
This study explores unsupervised learning methods for automated anomaly detection in shearographic images.
Three architectures are evaluated: a fully connected autoencoder, a convolutional autoencoder, and a student-teacher model.
Results show that the student-teacher approach achieves superior classification and enables precise localization.
arXiv Detail & Related papers (2025-11-04T12:48:02Z) - Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods.
We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework.
GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z) - Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map [50.21082069320818]
We propose a novel diffusion-based pipeline for generating high-fidelity industrial datasets with minimal supervision.
Our approach conditions the diffusion model on enriched bounding box representations to produce precise segmentation masks.
Results demonstrate that diffusion-based synthesis can bridge the gap between artificial and real-world industrial data.
arXiv Detail & Related papers (2025-05-06T15:21:36Z) - Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift [51.24522135151649]
Anomaly detection plays a crucial role in quality control for industrial applications.
Existing methods attempt to address domain shifts by training generalizable models.
Our proposed method demonstrates superior results compared with state-of-the-art anomaly detection and domain adaptation methods.
arXiv Detail & Related papers (2025-03-19T05:25:52Z) - EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models [23.898938659720503]
Industrial Anomaly Detection (IAD) is critical to ensure product quality during manufacturing.
We propose a novel approach that introduces a dedicated multi-modal defect localization module to decouple the dialog functionality from the core feature extraction.
We also contribute the first multi-modal industrial anomaly detection training dataset, named Defect Detection Question Answering (DDQA).
arXiv Detail & Related papers (2025-03-18T11:33:29Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch token they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - A PRISMA Driven Systematic Review of Publicly Available Datasets for Benchmark and Model Developments for Industrial Defect Detection [0.0]
A critical barrier to progress is the scarcity of comprehensive datasets featuring annotated defects.
This systematic review, spanning from 2015 to 2023, identifies 15 publicly available datasets.
The goal of this systematic review is to consolidate these datasets in a single location, providing researchers with a comprehensive reference.
arXiv Detail & Related papers (2024-06-11T20:14:59Z) - Condition Monitoring with Incomplete Data: An Integrated Variational Autoencoder and Distance Metric Framework [2.7898966850590625]
This paper introduces a new method for fault detection and condition monitoring for unseen data.
We use a variational autoencoder to capture the probabilistic distribution of previously seen and new unseen conditions.
Faults are detected by establishing a threshold for the health indexes, allowing the model to identify severe, unseen faults with high accuracy, even amidst noise.
arXiv Detail & Related papers (2024-04-08T22:20:23Z) - Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z) - Defect Spectrum: A Granular Look of Large-Scale Defect Datasets with Rich Semantics [27.03052142039447]
We introduce the Defect Spectrum, a comprehensive benchmark that offers precise, semantic-abundant, and large-scale annotations for a wide range of industrial defects.
Building on four key industrial benchmarks, our dataset refines existing annotations and introduces rich semantic details, distinguishing multiple defect types within a single image.
We also introduce Defect-Gen, a two-stage diffusion-based generator designed to create high-quality and diverse defective images.
arXiv Detail & Related papers (2023-10-26T11:23:24Z) - Anomaly Detection in Automated Fibre Placement: Learning with Data Limitations [3.103778949672542]
We present a comprehensive framework for defect detection and localization in Automated Fibre Placement.
Our approach combines unsupervised deep learning and classical computer vision algorithms.
It efficiently detects various surface issues while requiring fewer images of composite parts for training.
arXiv Detail & Related papers (2023-07-15T22:13:36Z) - Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z) - An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z) - Cognitive Visual Inspection Service for LCD Manufacturing Industry [80.63336968475889]
This paper discloses a novel visual inspection system for liquid crystal displays (LCDs), currently a dominant type in the FPD industry.
The system is based on two cornerstones: a robust, high-performance defect recognition model and a cognitive visual inspection service architecture.
arXiv Detail & Related papers (2021-01-11T08:14:35Z) - Real-World Anomaly Detection by using Digital Twin Systems and Weakly-Supervised Learning [3.0100975935933567]
We present novel weakly-supervised approaches to anomaly detection for industrial settings.
The approaches make use of a Digital Twin to generate a training dataset which simulates the normal operation of the machinery.
The performance of the proposed methods is compared against various state-of-the-art anomaly detection algorithms on an application to a real-world dataset.
arXiv Detail & Related papers (2020-11-12T10:15:56Z) - SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)