Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection
- URL: http://arxiv.org/abs/2508.06318v1
- Date: Fri, 08 Aug 2025 13:48:48 GMT
- Title: Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection
- Authors: Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, François Bremond, Egor Bondarev
- Abstract summary: Video Anomaly Detection (VAD) is a challenging task due to the variability of anomalous events and the limited availability of labeled data. We propose a novel framework that employs a set of expert models, each specialized in capturing specific anomaly types. Our approach achieves state-of-the-art performance, with a 91.58% AUC on the UCF-Crime dataset, and demonstrates superior results on XD-Violence and MSAD datasets.
- Score: 7.435598538875321
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video Anomaly Detection (VAD) is a challenging task due to the variability of anomalous events and the limited availability of labeled data. Under the Weakly-Supervised VAD (WSVAD) paradigm, only video-level labels are provided during training, while predictions are made at the frame level. Although state-of-the-art models perform well on simple anomalies (e.g., explosions), they struggle with complex real-world events (e.g., shoplifting). This difficulty stems from two key issues: (1) the inability of current models to address the diversity of anomaly types, as they process all categories with a shared model, overlooking category-specific features; and (2) the weak supervision signal, which lacks precise temporal information, limiting the ability to capture nuanced anomalous patterns blended with normal events. To address these challenges, we propose Gaussian Splatting-guided Mixture of Experts (GS-MoE), a novel framework that employs a set of expert models, each specialized in capturing specific anomaly types. These experts are guided by a temporal Gaussian splatting loss, enabling the model to leverage temporal consistency and enhance weak supervision. The Gaussian splatting approach encourages a more precise and comprehensive representation of anomalies by focusing on temporal segments most likely to contain abnormal events. The predictions from these specialized experts are integrated through a mixture-of-experts mechanism to model complex relationships across diverse anomaly patterns. Our approach achieves state-of-the-art performance, with a 91.58% AUC on the UCF-Crime dataset, and demonstrates superior results on XD-Violence and MSAD datasets. By leveraging category-specific expertise and temporal guidance, GS-MoE sets a new benchmark for VAD under weak supervision.
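The abstract does not give the exact formulation of the temporal Gaussian splatting loss or of the expert mixing. As a rough illustration only, and not the authors' implementation, the sketch below shows one plausible reading: Gaussian kernels are placed at peaks of per-snippet anomaly scores to form a smooth temporal target, predictions are regressed toward that target, and expert scores are combined with softmax gate weights. The function names, the peak rule, `sigma`, `threshold`, and the squared-error loss are all assumptions made for this example.

```python
import numpy as np


def gaussian_pseudo_labels(scores, sigma=2.0, threshold=0.5):
    """Build a smooth temporal target from per-snippet anomaly scores.

    Assumption: a Gaussian kernel is centred on every local peak whose
    score is at least `threshold`; the target is the maximum over kernels.
    """
    scores = np.asarray(scores, dtype=float)
    t = np.arange(len(scores))
    target = np.zeros_like(scores)
    for i in range(len(scores)):
        left = scores[i - 1] if i > 0 else -np.inf
        right = scores[i + 1] if i < len(scores) - 1 else -np.inf
        if scores[i] >= threshold and scores[i] >= left and scores[i] >= right:
            kernel = np.exp(-0.5 * ((t - i) / sigma) ** 2)
            target = np.maximum(target, kernel)
    return target


def splatting_loss(pred, target):
    """Mean squared error between predicted scores and the Gaussian target
    (a stand-in loss chosen for this sketch)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def mix_experts(expert_scores, gate_logits):
    """Combine expert scores of shape (E, T) using per-time-step softmax
    gate weights computed from `gate_logits` of the same shape."""
    expert_scores = np.asarray(expert_scores, dtype=float)
    gate_logits = np.asarray(gate_logits, dtype=float)
    z = gate_logits - gate_logits.max(axis=0, keepdims=True)  # numerical stability
    w = np.exp(z) / np.exp(z).sum(axis=0, keepdims=True)
    return (w * expert_scores).sum(axis=0)


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.9, 0.3, 0.1]
    target = gaussian_pseudo_labels(scores)
    print("target:", np.round(target, 3))
    print("loss:", splatting_loss(scores, target))
    print("mixed:", mix_experts([[0.0, 1.0], [1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]))
```

In this toy setup the smooth target spreads supervision to the snippets around a confident peak, which is one way a video-level label could be turned into a denser temporal signal; the paper itself should be consulted for the actual loss and gating design.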
Related papers
- Labels Matter More Than Models: Quantifying the Benefit of Supervised Time Series Anomaly Detection [56.302586730134806]
Time series anomaly detection (TSAD) is a critical data mining task often constrained by label scarcity. Current research predominantly focuses on Unsupervised Time-series Anomaly Detection. This paper challenges the premise that architectural complexity is the optimal path for TSAD.
arXiv Detail & Related papers (2025-11-20T08:32:49Z)
- Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time [60.341117019125214]
We propose TUNE, a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns in graph anomaly detection (GAD). To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
arXiv Detail & Related papers (2025-11-10T12:10:05Z)
- THEMIS: Unlocking Pretrained Knowledge with Foundation Model Embeddings for Anomaly Detection in Time Series [0.0]
THEMIS is a new framework for time series anomaly detection that exploits pretrained knowledge from foundation models. Our experiments show that this modular method achieves SOTA results on the MSL dataset and performs quite competitively on the SMAP and SWAT$*$ datasets.
arXiv Detail & Related papers (2025-10-04T19:20:35Z)
- Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
- CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
- Strengthening Anomaly Awareness [0.0]
We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection. Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy.
arXiv Detail & Related papers (2025-04-15T16:52:22Z)
- AMAD: AutoMasked Attention for Unsupervised Multivariate Time Series Anomaly Detection [0.7371521417300614]
AMAD integrates AutoMasked Attention for UMTSAD scenarios. AMAD provides a robust and adaptable solution to UMTSAD challenges.
arXiv Detail & Related papers (2025-04-09T07:32:59Z)
- Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection [2.749898166276854]
Weakly supervised video anomaly detection (WS-VAD) has emerged as a contemporary research direction. We propose a multi-modal WS-VAD framework to accurately detect anomalies such as violence and nudity. We show that the proposed model achieves state-of-the-art results on benchmark datasets of violence and nudity detection.
arXiv Detail & Related papers (2024-12-29T12:46:57Z)
- Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection [59.34318192698142]
We introduce a prior-less anomaly generation paradigm and develop an innovative unsupervised anomaly detection framework named GRAD.
PatchDiff effectively exposes various types of anomaly patterns.
Experiments on both MVTec AD and MVTec LOCO datasets also support the aforementioned observation.
arXiv Detail & Related papers (2023-12-26T07:08:06Z)
- Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
- SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)