Related papers: From Benchmarks to Reality: Advancing Visual Anomaly Detection by the VAND 3.0 Challenge

URL: http://arxiv.org/abs/2509.17615v1
Date: Mon, 22 Sep 2025 11:27:49 GMT
Title: From Benchmarks to Reality: Advancing Visual Anomaly Detection by the VAND 3.0 Challenge
Authors: Lars Heckler-Kram, Ashwin Vaidya, Jan-Hendrik Neudeck, Ulla Scheler, Dick Ameln, Samet Akcay, Paula Ramos,
Abstract summary: We present the VAND 3.0 Challenge to showcase current progress in anomaly detection. The challenge hosted two tracks, fostering the development of anomaly detection methods robust against real-world distribution shifts. The participants' solutions reached significant improvements over previous baselines by combining or adapting existing approaches and fusing them with novel pipelines.
Score: 4.03804045800094
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Visual anomaly detection is a strongly application-driven field of research. Consequently, the connection between academia and industry is of paramount importance. In this regard, we present the VAND 3.0 Challenge to showcase current progress in anomaly detection across different practical settings whilst addressing critical issues in the field. The challenge hosted two tracks, fostering the development of anomaly detection methods robust against real-world distribution shifts (Category 1) and exploring the capabilities of Vision Language Models within the few-shot regime (Category 2), respectively. The participants' solutions reached significant improvements over previous baselines by combining or adapting existing approaches and fusing them with novel pipelines. While for both tracks the progress in large pre-trained vision (language) backbones played a pivotal role for the performance increase, scaling up anomaly detection methods more efficiently needs to be addressed by future research to meet real-time and computational constraints on-site.

Related papers

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning [62.09195763860549]
Reinforcement learning with verifiable rewards (RLVR) improves reasoning in large language models (LLMs) but struggles with exploration. We introduce VOGUE (Visual Uncertainty Guided Exploration), a novel method that shifts exploration from the output (text) to the input (visual) space. Our work shows that grounding exploration in the inherent uncertainty of visual inputs is an effective strategy for improving multimodal reasoning.
arXiv Detail & Related papers (2025-10-01T20:32:08Z)
DINO-CoDT: Multi-class Collaborative Detection and Tracking with Vision Foundation Models [11.34839442803445]
We propose a multi-class collaborative detection and tracking framework tailored for diverse road users. We first present a detector with a global spatial attention fusion (GSAF) module, enhancing multi-scale feature learning for objects of varying sizes. Next, we introduce a tracklet RE-IDentification (REID) module that leverages visual semantics with a vision foundation model to effectively reduce ID SWitch (IDSW) errors.
arXiv Detail & Related papers (2025-06-09T02:49:10Z)
Enhancing Abnormality Identification: Robust Out-of-Distribution Strategies for Deepfake Detection [2.4851820343103035]
We propose two novel Out-Of-Distribution (OOD) detection approaches. The first approach is trained to reconstruct the input image, while the second incorporates an attention mechanism for detecting OODs. Our method achieves promising results in deepfake detection and ranks among the top-performing configurations on the benchmark.
arXiv Detail & Related papers (2025-06-03T13:24:33Z)
Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift [51.24522135151649]
Anomaly detection plays a crucial role in quality control for industrial applications. Existing methods attempt to address domain shifts by training generalizable models. Our proposed method demonstrates superior results compared with state-of-the-art anomaly detection and domain adaptation methods.
arXiv Detail & Related papers (2025-03-19T05:25:52Z)
Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions [0.017476232824732776]
Time-series anomaly detection plays an important role in engineering processes. This survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made. It presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis.
arXiv Detail & Related papers (2024-08-07T13:01:10Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [89.92916473403108]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation. Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
Transformer-based Multimodal Change Detection with Multitask Consistency Constraints [10.906283981247796]
Current change detection methods struggle with the multitask conflicts between semantic and height change detection tasks. We propose an efficient Transformer-based network that learns shared representation between cross-dimensional inputs through cross-attention. Compared to five state-of-the-art change detection methods, our model demonstrates consistent multitask superiority in terms of semantic and height change detection.
arXiv Detail & Related papers (2023-10-13T17:38:45Z)
Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency [90.71745178767203]
Deep learning-based 3D object detection has achieved unprecedented success with the advent of large-scale autonomous driving datasets. Existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world. We study a more realistic setting, unsupervised 3D domain adaptive detection, which only utilizes source domain annotations.
arXiv Detail & Related papers (2021-07-23T17:19:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.