Related papers: Vision Foundation Model Embedding-Based Semantic Anomaly Detection

URL: http://arxiv.org/abs/2505.07998v1
Date: Mon, 12 May 2025 19:00:29 GMT
Title: Vision Foundation Model Embedding-Based Semantic Anomaly Detection
Authors: Max Peter Ronecker, Matthew Foutter, Amine Elhafsi, Daniele Gammelli, Ihor Barakaiev, Marco Pavone, Daniel Watzenig
Abstract summary: This work explores semantic anomaly detection by leveraging the semantic priors of state-of-the-art vision foundation models. We propose a framework that compares local vision embeddings from runtime images to a database of nominal scenarios in which the autonomous system is deemed safe and performant.
Score: 12.940376547110509
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Semantic anomalies are contextually invalid or unusual combinations of familiar visual elements that can cause undefined behavior and failures in system-level reasoning for autonomous systems. This work explores semantic anomaly detection by leveraging the semantic priors of state-of-the-art vision foundation models, operating directly on the image. We propose a framework that compares local vision embeddings from runtime images to a database of nominal scenarios in which the autonomous system is deemed safe and performant. In this work, we consider two variants of the proposed framework: one using raw grid-based embeddings, and another leveraging instance segmentation for object-centric representations. To further improve robustness, we introduce a simple filtering mechanism to suppress false positives. Our evaluations on CARLA-simulated anomalies show that the instance-based method with filtering achieves performance comparable to GPT-4o, while providing precise anomaly localization. These results highlight the potential utility of vision embeddings from foundation models for real-time anomaly detection in autonomous systems.
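The comparison step described in the abstract (local embeddings from a runtime image matched against a database of nominal embeddings) can be illustrated with a minimal nearest-neighbor sketch. This is not the authors' implementation: the abstract does not state the similarity measure, the scoring rule, or the threshold, so the use of cosine similarity, the score `1 - max similarity`, the threshold of 0.5, and the toy three-dimensional vectors are all assumptions chosen for illustration. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def anomaly_scores(runtime_embeddings, nominal_database):
    """Score each local embedding as 1 minus its best similarity to any nominal embedding.

    Assumption: a patch or instance that has no close match in the nominal
    database is treated as more anomalous. The database must be non-empty.
    """
    scores = []
    for emb in runtime_embeddings:
        best = max(cosine_similarity(emb, ref) for ref in nominal_database)
        scores.append(1.0 - best)
    return scores


def flag_anomalies(scores, threshold=0.5):
    """Return indices of local embeddings whose score exceeds the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data standing in for foundation-model embeddings (hypothetical values).
nominal_database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
runtime_embeddings = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

scores = anomaly_scores(runtime_embeddings, nominal_database)
flagged = flag_anomalies(scores)
```

Because each score is tied to one local embedding, the flagged indices point back to a specific grid cell or segmented instance, which is one plausible way the localization mentioned in the abstract could be obtained.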

Related papers

Zero-Shot Image Anomaly Detection Using Generative Foundation Models [2.241618130319058]
This research explores the use of score-based generative models as foundational tools for semantic anomaly detection. By analyzing Stein score errors, we introduce a novel method for identifying anomalous samples without requiring re-training on each target dataset. Our approach improves over state-of-the-art and relies on training a single model on one dataset -- CelebA -- which we find to be an effective base distribution.
arXiv Detail & Related papers (2025-07-30T13:56:36Z)
Verification of Visual Controllers via Compositional Geometric Transformations [49.81690518952909]
We introduce a novel verification framework for perception-based controllers that can generate outer-approximations of reachable sets. We provide theoretical guarantees on the soundness of our method and demonstrate its effectiveness across benchmark control environments.
arXiv Detail & Related papers (2025-07-06T20:22:58Z)
Behavioral Anomaly Detection in Distributed Systems via Federated Contrastive Learning [0.8906214436849201]
The goal is to overcome the limitations of traditional centralized approaches in terms of data privacy, node heterogeneity, and anomaly pattern recognition. The proposed method combines the distributed collaborative modeling capabilities of federated learning with the feature discrimination enhancement of contrastive learning. It builds embedding representations on local nodes and constructs positive and negative sample pairs to guide the model in learning a more discriminative feature space.
arXiv Detail & Related papers (2025-06-24T02:04:44Z)
From Controlled Scenarios to Real-World: Cross-Domain Degradation Pattern Matching for All-in-One Image Restoration [2.997052569698842]
All-in-One Image Restoration (AiOIR) aims to restore images affected by multiple degradation patterns using a single model with unified parameters. The UDAIR framework is proposed to achieve AiOIR by transferring knowledge learned in the source domain to the target domain. Experimental results on 10 open-source datasets demonstrate that UDAIR achieves new state-of-the-art performance for the AiOIR task.
arXiv Detail & Related papers (2025-05-28T12:22:00Z)
MeLIAD: Interpretable Few-Shot Anomaly Detection with Metric Learning and Entropy-based Scoring [2.394081903745099]
We propose MeLIAD, a novel methodology for interpretable anomaly detection. MeLIAD is based on metric learning and achieves interpretability by design without relying on any prior distribution assumptions of true anomalies. Experiments on five public benchmark datasets, including quantitative and qualitative evaluation of interpretability, demonstrate that MeLIAD achieves improved anomaly detection and localization performance.
arXiv Detail & Related papers (2024-09-20T16:01:43Z)
GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings. We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features. We extensively evaluated our approach on ten datasets, achieving state-of-the-art results in six and on-par performance in the remaining.
arXiv Detail & Related papers (2024-07-17T09:27:41Z)
PUAD: Frustratingly Simple Method for Robust Anomaly Detection [0.0]
We argue that logical anomalies, such as the wrong number of objects, cannot be well represented by the spatial feature maps. We propose a method that incorporates a simple out-of-distribution detection method on the feature space against state-of-the-art reconstruction-based approaches. Our method achieves state-of-the-art performance on the MVTec LOCO AD dataset.
arXiv Detail & Related papers (2024-02-23T06:57:31Z)
Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely-used scene representations for visual perception in Autonomous Vehicles (AVs). Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV. Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images. We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
Self-Calibrating Anomaly and Change Detection for Autonomous Inspection Robots [0.07366405857677225]
A visual anomaly or change detection algorithm identifies regions of an image that differ from a reference image or dataset. We propose a comprehensive deep learning framework for detecting anomalies and changes in a priori unknown environments.
arXiv Detail & Related papers (2022-08-26T09:52:12Z)
Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold. We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples. We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
Object-centric and memory-guided normality reconstruction for video anomaly detection [56.64792194894702]
This paper addresses the anomaly detection problem for video surveillance. Due to the inherent rarity and heterogeneity of abnormal events, the problem is viewed as a normality modeling strategy. Our model learns object-centric normal patterns without seeing anomalous samples during training.
arXiv Detail & Related papers (2022-03-07T19:28:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.