TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks
- URL: http://arxiv.org/abs/2312.09480v1
- Date: Fri, 15 Dec 2023 01:37:29 GMT
- Title: TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks
- Authors: Ho-Weng Lee, Shang-Hong Lai
- Abstract summary: We propose a novel framework to adeptly train a backbone model tailored to the manufacturing domain.
Our approach concurrently considers visual and text-aligned embedding spaces for normal and abnormal conditions.
The resulting pre-trained backbone markedly enhances performance in industrial downstream tasks.
- Score: 12.660226544498023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, the focus on anomaly detection and localization in
industrial inspection tasks has intensified. While existing studies have
demonstrated impressive outcomes, they often rely heavily on extensive training
datasets or robust features extracted from pre-trained models trained on
diverse datasets like ImageNet. In this work, we propose a novel framework
leveraging the visual-linguistic CLIP model to adeptly train a backbone model
tailored to the manufacturing domain. Our approach concurrently considers
visual and text-aligned embedding spaces for normal and abnormal conditions.
The resulting pre-trained backbone markedly enhances performance in industrial
downstream tasks, particularly in anomaly detection and localization. Notably,
this improvement is substantiated through experiments conducted on multiple
datasets such as MVTecAD, BTAD, and KSDD2. Furthermore, using our pre-trained
backbone weights allows previous works to achieve superior performance in
few-shot scenarios with less training data. The proposed anomaly backbone
provides a foundation model for more precise anomaly detection and
localization.
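As a rough illustration of what scoring an image in a text-aligned embedding space can look like, the sketch below compares an image embedding against one "normal" and one "abnormal" text embedding and turns the two cosine similarities into an anomaly probability. This is a generic CLIP-style scoring pattern, not the training procedure proposed in the paper; the function names, the toy vectors, and the temperature value 0.07 are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def anomaly_probability(image_emb, normal_text_emb, abnormal_text_emb,
                        temperature=0.07):
    """Softmax over the similarities to the two text embeddings.

    Returns the probability mass assigned to the "abnormal" text.
    The temperature is an assumed value, not taken from the paper.
    """
    s_normal = cosine(image_emb, normal_text_emb) / temperature
    s_abnormal = cosine(image_emb, abnormal_text_emb) / temperature
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(s_normal, s_abnormal)
    e_normal = math.exp(s_normal - m)
    e_abnormal = math.exp(s_abnormal - m)
    return e_abnormal / (e_normal + e_abnormal)


# Toy 2-D embeddings (hypothetical; real embeddings would come from an
# image encoder and a text encoder that share one embedding space).
normal_text = [1.0, 0.0]
abnormal_text = [0.0, 1.0]
p_clean = anomaly_probability([0.9, 0.1], normal_text, abnormal_text)
p_defect = anomaly_probability([0.1, 0.9], normal_text, abnormal_text)
```

Here `p_clean` falls below 0.5 and `p_defect` above it, because each toy image embedding lies closer to one of the two text embeddings.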
Related papers
- Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition [4.192370959537781]
We present a semi-supervised fine-tuning approach designed to improve the performance of pre-trained foundation models on downstream tasks with limited labeled data.
We evaluate our approach on multiple datasets, including MNIST, its augmented variations, CIFAR-10, SVHN, and GalaxyMNIST.
arXiv Detail & Related papers (2024-10-02T22:36:12Z)
- Fractals as Pre-training Datasets for Anomaly Detection and Localization [0.0]
Anomaly detection is crucial in large-scale industrial manufacturing as it helps detect and localise defective parts.
Pre-training feature extractors on large-scale datasets is a popular approach for this task.
We evaluate the performance of eight state-of-the-art methods pre-trained using dynamically generated fractal images.
arXiv Detail & Related papers (2024-05-11T10:35:42Z)
- Few-shot Online Anomaly Detection and Segmentation [29.693357653538474]
This paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task.
Under the FOADS framework, models are trained on a few-shot normal dataset, followed by inspection and improvement of their capabilities by leveraging unlabeled streaming data containing both normal and abnormal samples simultaneously.
In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation.
arXiv Detail & Related papers (2024-03-27T02:24:00Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation [47.95611203419802]
Foundation models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach.
We compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset.
We further developed a new Bayesian uncertainty estimation for frozen models and used them as an indicator to characterize the model's performance on out-of-distribution data.
arXiv Detail & Related papers (2023-11-18T14:52:10Z)
- Log-based Anomaly Detection of Enterprise Software: An Empirical Study [0.0]
We evaluate several state-of-the-art anomaly detection models on an industrial dataset from our research partner.
Results show that while all models are capable of detecting anomalies, certain models are better suited for less-structured datasets.
arXiv Detail & Related papers (2023-10-31T14:32:08Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Cross-Modal Fine-Tuning: Align then Refine [83.37294254884446]
ORCA is a cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities.
We show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities.
arXiv Detail & Related papers (2023-02-11T16:32:28Z)
- An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.