PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge
- URL: http://arxiv.org/abs/2410.11591v1
- Date: Tue, 15 Oct 2024 13:25:43 GMT
- Title: PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge
- Authors: Manuel Barusco, Francesco Borsatti, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto
- Abstract summary: Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly.
Despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices.
This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices.
- Score: 6.643376250301589
- Abstract: Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly. A key advantage of VAD is its unsupervised nature, which eliminates the need for costly and time-consuming labeled data collection. However, despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices. This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices. We benchmark the major VAD algorithms within this framework and demonstrate the feasibility of edge-based VAD using the well-known MVTec dataset. Furthermore, we introduce a novel algorithm, Partially Shared Teacher-student (PaSTe), designed to address the high resource demands of the existing Student Teacher Feature Pyramid Matching (STFPM) approach. Our results show that PaSTe decreases the inference time by 25%, while reducing the training time by 33% and peak RAM usage during training by 76%. These improvements make the VAD process significantly more efficient, laying a solid foundation for real-world deployment on edge devices.
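The abstract gives the shape of the method but no code, so the sketch below is a minimal, hypothetical PyTorch rendering of the idea. The ResNet-18 backbone, the split point after the second stage, and the STFPM-style normalized feature-matching loss are all illustrative assumptions, not the authors' implementation; the sketch only shows how sharing the early backbone stages lets teacher and student reuse a single forward pass.

```python
# Hypothetical sketch of the Partially Shared Teacher-student (PaSTe) idea:
# plain STFPM runs a frozen teacher and a trainable student over the full
# backbone; here the early stages are assumed shared (frozen, computed once)
# and only the deeper stages are duplicated and trained to match the teacher.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

def build_stages(backbone: nn.Module) -> nn.ModuleList:
    """Split a torchvision ResNet into pyramid stages (an assumed split)."""
    stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
    return nn.ModuleList([stem, backbone.layer1, backbone.layer2, backbone.layer3])

class PaSTeSketch(nn.Module):
    def __init__(self, shared_depth: int = 2):  # shared_depth is illustrative
        super().__init__()
        teacher = build_stages(resnet18(weights="IMAGENET1K_V1"))
        teacher.eval()  # freeze BN stats; re-apply .eval() after any .train() call
        for p in teacher.parameters():
            p.requires_grad_(False)
        self.shared = teacher[:shared_depth]        # computed once, frozen
        self.teacher_tail = teacher[shared_depth:]  # frozen teacher stages
        # The student duplicates only the non-shared tail, randomly re-initialised.
        self.student_tail = build_stages(resnet18(weights=None))[shared_depth:]

    def forward(self, x):
        with torch.no_grad():                       # one shared forward pass
            for stage in self.shared:
                x = stage(x)
        t_feats, s_feats, t, s = [], [], x, x
        for t_stage, s_stage in zip(self.teacher_tail, self.student_tail):
            with torch.no_grad():
                t = t_stage(t)
            s = s_stage(s)
            t_feats.append(t)
            s_feats.append(s)
        return t_feats, s_feats

def feature_matching_loss(t_feats, s_feats):
    """STFPM-style loss: distance between L2-normalised feature maps per level."""
    return sum(F.mse_loss(F.normalize(s, dim=1), F.normalize(t, dim=1))
               for t, s in zip(t_feats, s_feats))
```

Because the shared stages exist once and carry no trainable copy, both the student's parameter count and the joint forward cost shrink, which is consistent with the reported reductions in inference time and peak training RAM.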
Related papers
- Structured Pruning for Efficient Visual Place Recognition [24.433604332415204]
Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices.
Our work introduces a novel structured pruning method to streamline common VPR architectures.
This dual focus on the model and the map significantly enhances the efficiency of the system, reducing both map and model memory requirements and decreasing feature extraction and retrieval latencies.
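The summary does not spell out the pruning criterion, so the snippet below is only a generic illustration of structured (channel-level) pruning, assuming an L1-norm filter-importance score; the function name and keep ratio are hypothetical.

```python
# Generic sketch of structured (channel-level) pruning; the paper's actual
# criterion and pipeline are not given above, so L1-norm channel selection
# here is an illustrative assumption.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a thinner Conv2d keeping the output channels with largest L1 norm."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    keep = scores.topk(n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned
```

A real pipeline must also narrow the input channels of the following layer (and any BatchNorm), which is precisely what makes the pruning structured rather than element-wise.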
arXiv Detail & Related papers (2024-09-12T08:32:25Z)
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z)
- TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation [6.856317526681759]
Visual place recognition plays a pivotal role in autonomous exploration and navigation of mobile robots.
Existing methods achieve strong recognition by exploiting powerful yet large networks.
We propose a high-performance teacher and lightweight student distillation framework called TSCM.
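The cross-metric objective itself is not described in this summary, so the following is only a generic teacher-student embedding-distillation sketch; the cosine-distance loss stands in for whatever TSCM actually optimizes.

```python
# Generic teacher-student distillation for embedding models, as a sketch;
# TSCM's actual "cross-metric" objective is not described above, so a plain
# embedding-matching loss is used for illustration.
import torch
import torch.nn.functional as F

def distill_embeddings(student_emb: torch.Tensor,
                       teacher_emb: torch.Tensor) -> torch.Tensor:
    """Pull L2-normalised student descriptors toward the (frozen) teacher's."""
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb.detach(), dim=-1)  # no gradient to the teacher
    return (1.0 - (s * t).sum(dim=-1)).mean()      # mean cosine distance
```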
arXiv Detail & Related papers (2024-04-02T02:29:41Z)
- Design Space Exploration of Low-Bit Quantized Neural Networks for Visual Place Recognition [26.213493552442102]
Visual Place Recognition (VPR) is a critical task for performing global re-localization in visual perception systems.
Recent works have focused on the recall@1 metric as the performance measure, with limited attention to resource utilization.
This has resulted in methods that use deep learning models too large to deploy on low powered edge devices.
We study the impact of compact convolutional network architecture design in combination with full-precision and mixed-precision post-training quantization on VPR performance.
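As a rough illustration of post-training quantization mechanics (not the authors' tooling or models), the sketch below applies PyTorch's built-in dynamic quantization to an assumed MobileNetV2 backbone; the mixed-precision and static convolution quantization studied in the paper require calibration data and a fuller workflow.

```python
# Post-training quantization sketch using PyTorch's dynamic quantization.
# The backbone choice is an assumption for illustration only.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

model = mobilenet_v2(weights=None).eval()
# Dynamic PTQ: weights are stored in int8, activations quantized on the fly.
# Only nn.Linear layers are converted here; quantizing convolutions needs
# static PTQ with a calibration pass.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
torch.save(quantized.state_dict(), "mobilenet_v2_int8.pt")
```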
arXiv Detail & Related papers (2023-12-14T15:24:42Z)
- Efficient Representation of the Activation Space in Deep Neural Networks [5.224743522146324]
We propose a model-agnostic framework for creating representations of activations in deep neural networks.
The framework reduces memory usage by 30% and computes p-values up to 4 times faster.
As we do not persist raw data at inference time, we could potentially reduce susceptibility to attacks and privacy issues.
arXiv Detail & Related papers (2023-12-13T13:46:14Z)
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding tasks such as temporal action detection (TAD) often suffer from heavy demands on computing resources.
We build a unified framework for efficient end-to-end temporal action detection (ETAD).
ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
- Cross-modal Knowledge Distillation for Vision-to-Sensor Action Recognition [12.682984063354748]
This study introduces an end-to-end Vision-to-Sensor Knowledge Distillation (VSKD) framework.
In this VSKD framework, only time-series data, i.e., accelerometer data, is needed from wearable devices during the testing phase.
This framework not only reduces the computational demands on edge devices, but also produces a learning model that closely matches the performance of the computationally expensive multi-modal approach.
arXiv Detail & Related papers (2021-10-08T15:06:38Z)
- Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds [53.07042574352251]
We design novel models for pedestrian attribute recognition with re-ID in an MEC-enabled camera monitoring system.
We propose a novel inference framework with a set of distributed modules, jointly considering attribute recognition and person re-ID.
We then devise a learning-based algorithm for distributing the modules of the proposed inference framework.
arXiv Detail & Related papers (2020-08-12T12:03:27Z)
- Rethinking Performance Estimation in Neural Architecture Search [191.08960589460173]
We provide a novel yet systematic rethinking of performance estimation (PE) in a resource-constrained regime.
By combining budgeted PE (BPE) with various search algorithms, including reinforcement learning, evolutionary algorithms, random search, and differentiable architecture search, we achieve a 1,000x NAS speed-up with a negligible performance drop.
arXiv Detail & Related papers (2020-05-20T09:01:44Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To keep training on the enlarged dataset tractable, we apply a dataset distillation strategy to compress it into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.