A Single Image Is All You Need: Zero-Shot Anomaly Localization Without Training Data
- URL: http://arxiv.org/abs/2509.18354v1
- Date: Mon, 22 Sep 2025 19:29:20 GMT
- Title: A Single Image Is All You Need: Zero-Shot Anomaly Localization Without Training Data
- Authors: Mehrdad Moradi, Shengzhe Chen, Hao Yan, Kamran Paynabar
- Abstract summary: Anomaly detection in images is typically addressed by learning from collections of training data or relying on reference samples. We propose a single-image anomaly localization method that leverages the inductive bias of convolutional neural networks. Our method is named Single Shot Decomposition Network (SSDnet).
- Score: 4.861045498353029
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomaly detection in images is typically addressed by learning from collections of training data or relying on reference samples. In many real-world scenarios, however, such training data may be unavailable, and only the test image itself is provided. We address this zero-shot setting by proposing a single-image anomaly localization method that leverages the inductive bias of convolutional neural networks, inspired by Deep Image Prior (DIP). Our method is named Single Shot Decomposition Network (SSDnet). Our key assumption is that natural images often exhibit unified textures and patterns, and that anomalies manifest as localized deviations from these repetitive or stochastic patterns. To learn the deep image prior, we design a patch-based training framework where the input image is fed directly into the network for self-reconstruction, rather than mapping random noise to the image as done in DIP. To avoid the model simply learning an identity mapping, we apply masking, patch shuffling, and small Gaussian noise. In addition, we use a perceptual loss based on inner-product similarity to capture structure beyond pixel fidelity. Our approach needs no external training data, labels, or references, and remains robust in the presence of noise or missing pixels. SSDnet achieves 0.99 AUROC and 0.60 AUPRC on MVTec-AD and 0.98 AUROC and 0.67 AUPRC on the fabric dataset, outperforming state-of-the-art methods. The implementation code will be released at https://github.com/mehrdadmoradi124/SSDnet
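The training recipe in the abstract, feeding the image to the network for self-reconstruction while corrupting the input with masking, patch shuffling, and small Gaussian noise, and scoring reconstructions with an inner-product similarity loss, can be sketched roughly as follows. This is a minimal NumPy illustration of the corruption step and the loss only (the CNN itself is omitted); the function names, patch size, and corruption fractions are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def corrupt_patches(img, patch=8, mask_frac=0.25, shuffle_frac=0.25,
                    noise_std=0.02, rng=None):
    """Corrupt the input so a self-reconstruction network cannot learn a
    trivial identity mapping: zero-mask a random subset of patches, swap
    another subset pairwise, and add small Gaussian noise everywhere.
    Patch size and fractions are assumed defaults, not the paper's."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = img.shape
    out = img.astype(float).copy()
    cells = [(i, j) for i in range(h // patch) for j in range(w // patch)]
    rng.shuffle(cells)
    n_mask = int(mask_frac * len(cells))
    n_swap = (int(shuffle_frac * len(cells)) // 2) * 2

    def block(c):
        i, j = c
        return slice(i * patch, (i + 1) * patch), slice(j * patch, (j + 1) * patch)

    for c in cells[:n_mask]:                      # masking
        out[block(c)] = 0.0
    swaps = cells[n_mask:n_mask + n_swap]
    for a, b in zip(swaps[::2], swaps[1::2]):     # patch shuffling
        pa = out[block(a)].copy()
        out[block(a)] = out[block(b)]
        out[block(b)] = pa
    out += rng.normal(0.0, noise_std, out.shape)  # small Gaussian noise
    return out

def inner_product_loss(x, y, eps=1e-8):
    """Perceptual-style loss: 1 minus the normalized inner product
    (cosine similarity) of the two images, rewarding structural
    agreement rather than exact pixel-wise fidelity."""
    xf, yf = x.ravel(), y.ravel()
    sim = xf @ yf / (np.linalg.norm(xf) * np.linalg.norm(yf) + eps)
    return 1.0 - sim
```

In a full pipeline along these lines, the network would be trained to map `corrupt_patches(img)` back to `img` under this loss, and the per-pixel reconstruction error on the clean input would serve as the anomaly map.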
Related papers
- Training Free Zero-Shot Visual Anomaly Localization via Diffusion Inversion [15.486565360380203]
Zero-Shot image Anomaly Detection (ZSAD) aims to detect and localise anomalies without access to any normal training samples of the target data. Recent approaches leverage additional modalities such as language to generate fine-grained prompts for localisation. We introduce a training-free vision-only ZSAD framework that circumvents the need for fine-grained prompts.
arXiv Detail & Related papers (2026-01-12T21:55:31Z) - Training-free Detection of AI-generated images via Cropping Robustness [33.85512004342153]
WaRPAD is a training-free AI-generated image detection algorithm based on self-supervised models. We show that WaRPAD consistently achieves competitive performance and demonstrates strong robustness to test-time corruptions.
arXiv Detail & Related papers (2025-11-18T01:21:47Z) - Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model [92.61216319417208]
We propose a novel diffusion model (DM)-based framework, dubbed ours, for image deblurring. It applies the DM to generate prior knowledge that aids in recovering the textures of blurry images. To fully exploit the generated texture priors, we present the Texture Transfer Transformer layer (TTformer).
arXiv Detail & Related papers (2025-07-18T01:50:31Z) - Deepfake Detection of Face Images based on a Convolutional Neural Network [0.0]
Fake news and especially deepfakes (generated, non-real image or video content) have become a serious topic in recent years. We build a model based on a Convolutional Neural Network to detect such generated and fake images showing human portraits.
arXiv Detail & Related papers (2025-03-14T13:33:22Z) - ZeroStereo: Zero-shot Stereo Matching from Single Images [17.560148513475387]
We propose ZeroStereo, a novel stereo image generation pipeline for zero-shot stereo matching. Our approach synthesizes high-quality right images by leveraging pseudo disparities generated by a monocular depth estimation model. Our pipeline achieves state-of-the-art zero-shot generalization across multiple datasets with only a dataset volume comparable to Scene Flow.
arXiv Detail & Related papers (2025-01-15T08:43:48Z) - Data Attribution for Text-to-Image Models by Unlearning Synthesized Images [71.23012718682634]
The goal of data attribution for text-to-image models is to identify the training images that most influence the generation of a new image. We propose an efficient data attribution method by simulating unlearning the synthesized image. We then identify training images with significant loss deviations after the unlearning process and label these as influential.
arXiv Detail & Related papers (2024-06-13T17:59:44Z) - Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection [20.761672725633936]
Training deep learning-based change detection models heavily depends on labeled data.
A recent trend is using remote sensing (RS) data to obtain in-domain representations via supervised or self-supervised learning (SSL).
We propose dense semantic-aware pre-training for RS image CD via sampling multiple class-balanced points.
arXiv Detail & Related papers (2022-05-27T06:08:33Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack-of-data problem, i.e., that no paired data exist, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z) - CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z) - Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset.
We develop an algorithm for shape-texture debiased learning.
Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z) - Cross-Scale Internal Graph Neural Network for Image Super-Resolution [147.77050877373674]
Non-local self-similarity in natural images has been well studied as an effective prior in image restoration.
For single image super-resolution (SISR), most existing deep non-local methods only exploit similar patches within the same scale of the low-resolution (LR) input image.
Exploiting patch recurrence across different scales is achieved using a novel cross-scale internal graph neural network (IGNN).
arXiv Detail & Related papers (2020-06-30T10:48:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all information) and is not responsible for any consequences.