Wavelet-Enhanced PaDiM for Industrial Anomaly Detection
- URL: http://arxiv.org/abs/2508.16034v1
- Date: Fri, 22 Aug 2025 01:37:15 GMT
- Title: Wavelet-Enhanced PaDiM for Industrial Anomaly Detection
- Authors: Cory Gardner, Byungseok Min, Tae-Hyuk Ahn
- Abstract summary: We propose Wavelet-Enhanced PaDiM, which integrates Discrete Wavelet Transform analysis with multi-layer CNN features in a structured manner. We evaluate WE-PaDiM on the challenging MVTec AD dataset with multiple backbones.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Anomaly detection and localization in industrial images are essential for automated quality inspection. PaDiM, a prominent method, models the distribution of normal image features extracted by pre-trained Convolutional Neural Networks (CNNs) but reduces dimensionality through random channel selection, potentially discarding structured information. We propose Wavelet-Enhanced PaDiM (WE-PaDiM), which integrates Discrete Wavelet Transform (DWT) analysis with multi-layer CNN features in a structured manner. WE-PaDiM applies 2D DWT to feature maps from multiple backbone layers, selects specific frequency subbands (e.g., LL, LH, HL), spatially aligns them, and concatenates them channel-wise before modeling with PaDiM's multivariate Gaussian framework. This DWT-before-concatenation strategy provides a principled method for feature selection based on frequency content relevant to anomalies, leveraging multi-scale wavelet information as an alternative to random selection. We evaluate WE-PaDiM on the challenging MVTec AD dataset with multiple backbones (ResNet-18 and EfficientNet B0-B6). The method achieves strong performance in anomaly detection and localization, yielding average results of 99.32% Image-AUC and 92.10% Pixel-AUC across 15 categories with per-class optimized configurations. Our analysis shows that wavelet choices affect performance trade-offs: simpler wavelets (e.g., Haar) with detail subbands (HL or LH/HL/HH) often enhance localization, while approximation bands (LL) improve image-level detection. WE-PaDiM thus offers a competitive and interpretable alternative to random feature selection in PaDiM, achieving robust results suitable for industrial inspection with comparable efficiency.
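The DWT-before-concatenation pipeline described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: random arrays stand in for CNN feature maps, a hand-written single-level Haar transform stands in for a wavelet library, nearest-neighbor resizing is assumed for the spatial alignment step, and the function names, subband labeling, and the regularization value `eps` are all hypothetical choices made for the example.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT over the last two axes (even H, W)."""
    a = x[..., 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    # Subband naming differs between libraries; this labeling is an assumption.
    return {
        "LL": (a + b + c + d) / 2.0,  # approximation band
        "LH": (a + b - c - d) / 2.0,  # top row minus bottom row
        "HL": (a - b + c - d) / 2.0,  # left column minus right column
        "HH": (a - b - c + d) / 2.0,  # diagonal detail
    }


def resize_nearest(x, size):
    """Nearest-neighbor resize of the last two axes to size = (H, W)."""
    h, w = x.shape[-2:]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return x[..., rows[:, None], cols[None, :]]


def wavelet_embedding(layer_feats, subbands=("LL", "LH", "HL")):
    """DWT each layer, keep the chosen subbands, align them, concatenate channels."""
    bands_per_layer = [haar_dwt2(f) for f in layer_feats]
    target = bands_per_layer[0]["LL"].shape[-2:]  # align to the first layer's grid
    parts = [resize_nearest(bands[s], target)
             for bands in bands_per_layer for s in subbands]
    return np.concatenate(parts, axis=-3)  # channel axis


def fit_gaussians(emb, eps=0.01):
    """Fit one Gaussian per spatial position from embeddings of shape (N, D, H, W)."""
    n, d, h, w = emb.shape
    x = emb.reshape(n, d, h * w)
    mean = x.mean(axis=0)                                # (D, P)
    xc = x - mean
    cov = np.einsum("ndp,nep->pde", xc, xc) / max(n - 1, 1)
    cov += eps * np.eye(d)                               # assumed regularization
    return mean, np.linalg.inv(cov)                      # (D, P), (P, D, D)


def anomaly_map(emb, mean, cov_inv):
    """Per-position Mahalanobis distance for one embedding of shape (D, H, W)."""
    d, h, w = emb.shape
    diff = emb.reshape(d, h * w) - mean
    m2 = np.einsum("dp,pde,ep->p", diff, cov_inv, diff)
    return np.sqrt(np.maximum(m2, 0.0)).reshape(h, w)


# Stand-in "normal" features from two backbone layers (hypothetical sizes).
rng = np.random.default_rng(0)
layers = [rng.normal(size=(32, 3, 16, 16)), rng.normal(size=(32, 4, 8, 8))]
emb = wavelet_embedding(layers, subbands=("LL", "HL"))  # shape (32, 14, 8, 8)
mean, cov_inv = fit_gaussians(emb)
scores = anomaly_map(emb[0], mean, cov_inv)             # shape (8, 8)
image_score = scores.max()                              # image-level score
```

In this sketch the `subbands` argument plays the role that random channel selection plays in the original PaDiM: it decides which part of the feature content enters the Gaussian model. Swapping `("LL", "HL")` for detail-only bands is one way to explore the detection versus localization trade-off the abstract mentions.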
Related papers
- MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings [7.9595266728435545]
We propose a novel end-to-end model, "Multiresolution EEGWaveNet (MR-EEGWaveNet)," which efficiently distinguishes seizure events from background electroencephalogram (EEG) artifacts/noise. The model has three modules: convolution, feature extraction, and predictor. The proposed MR-EEGWaveNet significantly outperformed the conventional non-multiresolution approach.
arXiv Detail & Related papers (2025-05-23T14:40:50Z) - 3D Wavelet Convolutions with Extended Receptive Fields for Hyperspectral Image Classification [12.168520751389622]
Deep neural networks face numerous challenges in hyperspectral image classification. This paper proposes WCNet, an improved 3D-DenseNet model integrated with wavelet transforms. Experimental results demonstrate superior performance on the IN, UP, and KSC datasets.
arXiv Detail & Related papers (2025-04-15T01:39:42Z) - FE-UNet: Frequency Domain Enhanced U-Net for Low-Frequency Information-Rich Image Segmentation [48.034848981295525]
We address the differences in frequency band sensitivity between CNNs and the human visual system. We propose a wavelet adaptive spectrum fusion (WASF) method inspired by biological vision mechanisms to balance cross-frequency image features. We develop the FE-UNet model, which employs a SAM2 backbone network and incorporates fine-tuned Hiera-Large modules to ensure segmentation accuracy.
arXiv Detail & Related papers (2025-02-06T07:24:34Z) - Sampling From Autoencoders' Latent Space via Quantization And Probability Mass Function Concepts [1.534667887016089]
We introduce a novel post-training sampling algorithm rooted in the concept of probability mass functions, coupled with a quantization process.
Our proposed algorithm establishes a vicinity around each latent vector from the input data and then proceeds to draw samples from these defined neighborhoods.
This strategic approach ensures that the sampled latent vectors predominantly inhabit high-probability regions, which, in turn, can be effectively transformed into authentic real-world images.
arXiv Detail & Related papers (2023-08-21T13:18:12Z) - One-Dimensional Deep Image Prior for Curve Fitting of S-Parameters from Electromagnetic Solvers [57.441926088870325]
Deep Image Prior (DIP) is a technique that optimizes the weights of a randomly-initialized convolutional neural network to fit a signal from noisy or under-determined measurements.
Relative to publicly available implementations of Vector Fitting (VF), our method shows superior performance on nearly all test examples.
arXiv Detail & Related papers (2023-06-06T20:28:37Z) - Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z) - FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance, outperforming the state-of-the-art methods by a margin of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z) - PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
Experiments on a recent transformer-based image recognition model also show consistent efficiency gains.
arXiv Detail & Related papers (2021-09-15T01:10:30Z) - Hyperspectral Band Selection for Multispectral Image Classification with Convolutional Networks [0.0]
We propose a novel band selection method to select a reduced set of wavelengths from hyperspectral images.
We show that our method produces more suitable results for a multispectral sensor design.
arXiv Detail & Related papers (2021-06-01T17:24:35Z) - PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization [64.39761523935613]
We present a new framework for Patch Distribution Modeling, PaDiM, to concurrently detect and localize anomalies in images.
PaDiM makes use of a pretrained convolutional neural network (CNN) for patch embedding.
It also exploits correlations between the different semantic levels of CNN to better localize anomalies.
arXiv Detail & Related papers (2020-11-17T17:29:18Z) - Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
The main purpose of RGB-D salient object detection (SOD) is to better integrate and utilize cross-modal fusion information.
In this paper, we explore these issues from a new perspective.
We implement a more flexible and efficient form of multi-scale cross-modal feature processing.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.