Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation
- URL: http://arxiv.org/abs/2403.06247v2
- Date: Tue, 26 Mar 2024 14:42:21 GMT
- Title: Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation
- Authors: Mingyu Lee, Jongwon Choi,
- Abstract summary: We propose a text-guided variational image generation method to address the challenge of getting clean data for anomaly detection in industrial manufacturing.
Our method utilizes text information about the target object, learned from extensive text library documents, to generate non-defective data images resembling the input image.
- Score: 6.861600385661363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a text-guided variational image generation method to address the challenge of getting clean data for anomaly detection in industrial manufacturing. Our method utilizes text information about the target object, learned from extensive text library documents, to generate non-defective data images resembling the input image. The proposed framework ensures that the generated non-defective images align with anticipated distributions derived from textual and image-based knowledge, ensuring stability and generality. Experimental results demonstrate the effectiveness of our approach, surpassing previous methods even with limited non-defective data. Our approach is validated through generalization tests across four baseline models and three distinct datasets. We present an additional analysis to enhance the effectiveness of anomaly detection models by utilizing the generated images.
Related papers
- EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z) - Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics [12.315852697312195]
This paper introduces a new splicing image detection algorithm based on the statistical characteristics of natural images.
By analyzing the limitations of traditional methods, we have developed a detection framework that integrates advanced statistical analysis techniques and machine learning methods.
The algorithm has been validated using multiple public datasets, showing high accuracy in detecting spliced edges and locating tampered areas.
arXiv Detail & Related papers (2024-04-25T02:28:16Z) - Detecting Generated Images by Real Images Only [64.12501227493765]
Existing generated image detection methods detect visual artifacts in generated images or learn discriminative features from both real and generated images by massive training.
This paper approaches the generated image detection problem from a new perspective: Start from real images.
By finding the commonality of real images and mapping them to a dense subspace in feature space, the goal is that generated images, regardless of their generative model, are then projected outside the subspace.
arXiv Detail & Related papers (2023-11-02T03:09:37Z) - Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
arXiv Detail & Related papers (2023-08-21T08:12:28Z) - Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images
with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z) - UMat: Uncertainty-Aware Single Image High Resolution Material Capture [2.416160525187799]
We propose a learning-based method to recover normals, specularity, and roughness from a single diffuse image of a material.
Our method is the first one to deal with the problem of modeling uncertainty in material digitization.
arXiv Detail & Related papers (2023-05-25T17:59:04Z) - Training on Thin Air: Improve Image Classification with Generated Data [28.96941414724037]
Diffusion Inversion is a simple yet effective method to generate diverse, high-quality training data for image classification.
Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion.
We identify three key components that allow our generated images to successfully supplant the original dataset.
arXiv Detail & Related papers (2023-05-24T16:33:02Z) - Taming Encoder for Zero Fine-tuning Image Customization with
Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z) - Benchmarking performance of object detection under image distortions in
an uncontrolled environment [0.483420384410068]
robustness of object detection algorithms plays a prominent role in real-world applications.
It has been proven that the performance of object detection methods suffers from in-capture distortions.
We present a performance evaluation framework for the state-of-the-art object detection methods.
arXiv Detail & Related papers (2022-10-28T09:06:52Z) - Markup-to-Image Diffusion Models with Scheduled Sampling [111.30188533324954]
Building on recent advances in image generation, we present a data-driven approach to rendering markup into images.
The approach is based on diffusion models, which parameterize the distribution of data using a sequence of denoising operations.
We conduct experiments on four markup datasets: mathematical formulas (La), table layouts (HTML), sheet music (LilyPond), and molecular images (SMILES)
arXiv Detail & Related papers (2022-10-11T04:56:12Z) - Transformation Consistency Regularization- A Semi-Supervised Paradigm
for Image-to-Image Translation [18.870983535180457]
We propose Transformation Consistency Regularization, which delves into a more challenging setting of image-to-image translation.
We evaluate the efficacy of our algorithm on three different applications: image colorization, denoising and super-resolution.
Our method is significantly data efficient, requiring only around 10 - 20% of labeled samples to achieve similar image reconstructions to its fully-supervised counterpart.
arXiv Detail & Related papers (2020-07-15T17:41:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.