Related papers: CRD: Collaborative Representation Distance for Practical Anomaly Detection

CRD: Collaborative Representation Distance for Practical Anomaly Detection

URL: http://arxiv.org/abs/2401.09443v1
Date: Wed, 20 Dec 2023 13:47:08 GMT
Title: CRD: Collaborative Representation Distance for Practical Anomaly Detection
Authors: Chao Han and Yudong Yan
Abstract summary: Patch based methods consider visual images as a collection of image patches according to positions. The nearest neighbor search for the query image and the stored patches will occupy $O(n)$ complexity in terms of time and space requirements. We propose an alternative approach to the distance calculation of image patches via collaborative representation models.
Score: 2.3931689873603603
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Visual defect detection plays an important role in intelligent industry. Patch based methods consider visual images as a collection of image patches according to positions, which have stronger discriminative ability for small defects in products, e.g. scratches on pills. However, the nearest neighbor search for the query image and the stored patches will occupy $O(n)$ complexity in terms of time and space requirements, posing strict challenges for deployment in edge environments. In this paper, we propose an alternative approach to the distance calculation of image patches via collaborative representation models. Starting from the nearest neighbor distance with $L_0$ constraint, we relax the constraint to $L_2$ constraint and solve the distance quickly in close-formed without actually accessing the original stored collection of image patches. Furthermore, we point out that the main computational burden of this close-formed solution can be pre-computed by high-performance server before deployment. Consequently, the distance calculation on edge devices only requires a simple matrix multiplication, which is extremely lightweight and GPU-friendly. Performance on real industrial scenarios demonstrates that compared to the existing state-of-the-art methods, this distance achieves several hundred times improvement in computational efficiency with slight performance drop, while greatly reducing memory overhead.

Related papers

SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer [62.11796778482088]
We present a novel model-agnostic sparse vision transformer, dubbed SparseFormer, to bridge the gap of object detection between close-up and HRW shots. The proposed SparseFormer selectively uses attentive tokens to scrutinize the sparsely distributed windows that may contain objects. experiments on two HRW benchmarks, PANDA and DOTA-v1.0, demonstrate that the proposed SparseFormer significantly improves detection accuracy (up to 5.8%) and speed (up to 3x) over the state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-11T03:21:25Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
We propose an algorithm that enables fast and high-quality generation under arbitrary constraints. During inference, we can interchange between gradient updates computed on the noisy image and updates computed on the final, clean image. Our approach produces results that rival or surpass the state-of-the-art training-free inference approaches.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP. VOP proceeds co-visible image sections by obtaining patch-level embeddings using a Vision Transformer backbone. Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z)
Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation [12.62718910894575]
Point cloud segmentation (PCS) plays an essential role in robot perception and navigation tasks. To efficiently understand large-scale outdoor point clouds, their range image representation is commonly adopted. However, undesirable missing values in the range images damage the shapes and patterns of objects. This problem creates difficulty for the models in learning coherent and complete geometric information from the objects.
arXiv Detail & Related papers (2024-05-16T15:13:42Z)
CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying [52.91778151771145]
In this paper, we try to break the limitations for the first time thanks to the recent development of continuous implicit representation. Experiments show that the proposed method achieves real-time performance on the 2048$times$2048 images using a single GTX 2080 Ti GPU.
arXiv Detail & Related papers (2023-03-15T11:13:51Z)
PATS: Patch Area Transportation with Subdivision for Local Feature Matching [78.67559513308787]
Local feature matching aims at establishing sparse correspondences between a pair of images. We propose Patch Area Transportation with Subdivision (PATS) to tackle this issue. PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks.
arXiv Detail & Related papers (2023-03-14T08:28:36Z)
Generating natural images with direct Patch Distributions Matching [7.99536002595393]
We develop an algorithm that explicitly and efficiently minimizes the distance between patch distributions in two images. Our results are often superior to single-image-GANs, require no training, and can generate high quality images in a few seconds.
arXiv Detail & Related papers (2022-03-22T16:38:52Z)
Patch-Based Stochastic Attention for Image Editing [4.8201607588546]
We propose an efficient attention layer based on the algorithm PatchMatch, which is used for determining approximate nearest neighbors. We demonstrate the usefulness of PSAL on several image editing tasks, such as image inpainting, guided image colorization, and single-image super-resolution.
arXiv Detail & Related papers (2022-02-07T13:42:00Z)
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture. It reduces computations by separating objects from background using a very lite first-stage. Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
Efficient Scene Compression for Visual-based Localization [5.575448433529451]
Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications. This work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP) Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.
arXiv Detail & Related papers (2020-11-27T18:36:06Z)
The Power of Triply Complementary Priors for Image Compressive Sensing [89.14144796591685]
We propose a joint low-rank deep (LRD) image model, which contains a pair of complementaryly trip priors. We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS. To make the optimization tractable, a simple yet effective algorithm is proposed to solve the proposed H-based image CS problem.
arXiv Detail & Related papers (2020-05-16T08:17:44Z)
A Survey on Patch-based Synthesis: GPU Implementation and Optimization [0.0]
This thesis surveys the research in patch-based synthesis and algorithms for finding correspondences between small local regions of images. One of the algorithms we have studied is PatchMatch, can find similar regions or "patches" of an image one to two orders of magnitude faster than previous techniques. In computer graphics, we have explored removing unwanted objects from images, seamlessly moving objects in images, changing image aspect ratios, and video summarization.
arXiv Detail & Related papers (2020-05-11T19:25:28Z)
Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields. To better train this efficient generator, except for frequently-used VGG feature matching loss, we design a novel self-guided regression loss. We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.