Fast Distance-based Anomaly Detection in Images Using an Inception-like
Autoencoder
- URL: http://arxiv.org/abs/2003.08731v1
- Date: Thu, 12 Mar 2020 16:10:53 GMT
- Title: Fast Distance-based Anomaly Detection in Images Using an Inception-like
Autoencoder
- Authors: Natasa Sarafijanovic-Djukic and Jesse Davis
- Abstract summary: A convolutional autoencoder (CAE) is trained to extract a low-dimensional representation of the images.
We employ a distance-based anomaly detector in the low-dimensional space of the learned representation for the images.
We find that our approach resulted in improved predictive performance.
- Score: 16.157879279661362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of anomaly detection is to identify examples that deviate from
normal or expected behavior. We tackle this problem for images. We consider a
two-phase approach. First, using normal examples, a convolutional autoencoder
(CAE) is trained to extract a low-dimensional representation of the images.
Here, we propose a novel architectural choice when designing the CAE, an
Inception-like CAE. It combines convolutional filters of different kernel sizes
and it uses a Global Average Pooling (GAP) operation to extract the
representations from the CAE's bottleneck layer. Second, we employ a
distance-based anomaly detector in the low-dimensional space of the learned
representation for the images. However, instead of computing the exact
distance, we compute an approximate distance using product quantization. This
alleviates the high memory and prediction time costs of distance-based anomaly
detectors. We compare our proposed approach to a number of baselines and
state-of-the-art methods on four image datasets, and we find that our approach
resulted in improved predictive performance.
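The second phase described in the abstract, scoring an image by an approximate distance computed with product quantization, can be illustrated with a minimal sketch. It is not the authors' implementation. The nearest-neighbor distance used as the anomaly score, the function names, the settings m=2 and k=8, and the random vectors standing in for CAE embeddings are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, m):
    """Cut a vector into m contiguous sub-vectors of equal length."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def pq_train(data, m, k):
    """Learn one k-entry codebook per sub-space (hypothetical helper)."""
    return [kmeans([split(v, m)[s] for v in data], k, seed=s) for s in range(m)]


def pq_encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(subs, codebooks)]


def anomaly_score(query, codes, codebooks):
    """Approximate distance from the query to its nearest stored normal example."""
    subs = split(query, len(codebooks))
    # Lookup tables: squared distance from each query sub-vector to every centroid.
    tables = [[sq_dist(sub, c) for c in cb] for sub, cb in zip(subs, codebooks)]
    # The query itself is not quantized; only the stored examples are.
    return min(sum(t[i] for t, i in zip(tables, code)) for code in codes) ** 0.5


# Assumption: random 4-D vectors stand in for embeddings of normal images.
rng = random.Random(0)
normal_data = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
codebooks = pq_train(normal_data, m=2, k=8)
codes = [pq_encode(v, codebooks) for v in normal_data]

normal_score = anomaly_score([0.0, 0.0, 0.0, 0.0], codes, codebooks)
anomalous_score = anomaly_score([5.0, 5.0, 5.0, 5.0], codes, codebooks)
print(normal_score, anomalous_score)
```

In this sketch each stored example is kept as m small integer codes instead of its full vector, and a query needs only m table lookups per stored example. This is one way to obtain the reduced memory and prediction-time costs that the abstract mentions.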
Related papers
- Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering [84.37776381343662]
Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information.
We propose mip voxel grids (Mip-VoG), an explicit multiscale representation for real-time anti-aliasing rendering.
Our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously.
arXiv Detail & Related papers (2023-04-20T04:05:22Z)
- FRE: A Fast Method For Anomaly Detection And Segmentation [5.0468312081378475]
This paper presents a principled approach for solving the visual anomaly detection and segmentation problem.
We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained DNN on the training data.
We show that the feature reconstruction error (FRE), which is the $\ell$-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is extremely effective for anomaly detection.
arXiv Detail & Related papers (2022-11-23T01:03:20Z)
- MLF-SC: Incorporating multi-layer features to sparse coding for anomaly
detection [2.2276675054266395]
Anomalies in images occur in various scales from a small hole on a carpet to a large stain.
One of the widely used anomaly detection methods, sparse coding, has an issue in dealing with anomalies that are out of the patch size employed to sparsely represent images.
We propose to incorporate multi-scale features to sparse coding and improve the performance of anomaly detection.
arXiv Detail & Related papers (2021-04-09T10:20:34Z)
- CutPaste: Self-Supervised Learning for Anomaly Detection and
Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected-version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Anomaly detection through latent space restoration using
vector-quantized variational autoencoders [0.8122270502556374]
We propose an out-of-distribution detection method using density and restoration-based approaches.
The VQ-VAE model learns to encode images in a categorical latent space.
The prior distribution of latent codes is then modelled using an Auto-Regressive (AR) model.
arXiv Detail & Related papers (2020-12-12T09:19:59Z)
- Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy.
But their inference time is typically slow, on the order of seconds for a pair of 540p images.
We propose a displacement-invariant cost module to compute the matching costs without needing a 4D feature volume.
arXiv Detail & Related papers (2020-12-01T23:58:16Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Towards Dense People Detection with Deep Learning and Depth images [9.376814409561726]
This paper proposes a DNN-based system that detects multiple people from a single depth image.
Our neural network processes a depth image and outputs a likelihood map in image coordinates.
We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training.
arXiv Detail & Related papers (2020-07-14T16:43:02Z)
- Optimized Feature Space Learning for Generating Efficient Binary Codes
for Image Retrieval [9.470008343329892]
We propose an approach for learning low dimensional optimized feature space with minimum intra-class variance and maximum inter-class variance.
We binarize our generated feature vectors with the popular Iterative Quantization (ITQ) approach and also propose an ensemble network to generate binary codes of desired bit length for image retrieval.
arXiv Detail & Related papers (2020-01-30T15:30:31Z)