Related papers: Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning

URL: http://arxiv.org/abs/2603.00550v1
Date: Sat, 28 Feb 2026 08:57:33 GMT
Title: Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning
Authors: Yu Wang, Shengjie Zhao
Abstract summary: We propose a novel framework named LAS-VAD, short for Learning Anomaly Semantics for WS-VAD. Our framework integrates an anomaly-connected component mechanism and an intention awareness mechanism. It outperforms current state-of-the-art methods with remarkable gains.
Score: 23.043341269626016
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Weakly supervised video anomaly detection (WS-VAD) involves identifying the temporal intervals that contain anomalous events in untrimmed videos, where only video-level annotations are provided as supervisory signals. However, a key limitation persists in WS-VAD, as dense frame-level annotations are absent, which often leaves existing methods struggling to learn anomaly semantics effectively. To address this issue, we propose a novel framework named LAS-VAD, short for Learning Anomaly Semantics for WS-VAD, which integrates an anomaly-connected component mechanism and an intention awareness mechanism. The former is designed to assign video frames into distinct semantic groups within a video, and frame segments within the same group are deemed to share identical semantic information. The latter leverages an intention-aware strategy to distinguish between similar normal and abnormal behaviors (e.g., taking items and stealing). To further model the semantic information of anomalies, as anomaly occurrence is accompanied by distinct characteristic attributes (e.g., explosions are characterized by flames and thick smoke), we additionally incorporate anomaly attribute information to guide accurate detection. Extensive experiments on two benchmark datasets, XD-Violence and UCF-Crime, demonstrate that our LAS-VAD outperforms current state-of-the-art methods with remarkable gains.

Related papers

Steering and Rectifying Latent Representation Manifolds in Frozen Multi-modal LLMs for Video Anomaly Detection [52.5174167737992]
Video anomaly detection (VAD) aims to identify abnormal events in videos. We propose SteerVAD, which advances MLLM-based VAD by shifting from passively reading to actively steering and rectifying internal representations. Our method achieves state-of-the-art performance among tuning-free approaches requiring only 1% of training data.
arXiv Detail & Related papers (2026-02-27T13:48:50Z)
Learning Event Completeness for Weakly Supervised Video Anomaly Detection [5.140169437190526]
We present a novel Learning Event Completeness for Weakly Supervised Video Anomaly Detection (LEC-VAD). LEC-VAD encodes both category-aware and category-agnostic semantics between vision and language. We develop a novel memory bank-based prototype learning mechanism to enrich concise text descriptions associated with anomaly-event categories.
arXiv Detail & Related papers (2025-06-16T04:56:58Z)
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs). Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z)
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos. Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models. We evaluate nine existing Video-LMMs, both open and closed sources, on this benchmarking task and find that most of the models encounter difficulties in effectively identifying the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z)
Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection [16.77262005540559]
A novel framework is proposed to guide the learning of suspected anomalies from event prompts. It enables a new multi-prompt learning process to constrain the visual-semantic features across all videos. Our proposed model outperforms most state-of-the-art methods in terms of AP or AUC.
arXiv Detail & Related papers (2024-03-02T10:42:47Z)
Dynamic Erasing Network Based on Multi-Scale Temporal Features for Weakly Supervised Video Anomaly Detection [103.92970668001277]
We propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection. We first propose a multi-scale temporal modeling module, capable of extracting features from segments of varying lengths. Then, we design a dynamic erasing strategy, which dynamically assesses the completeness of the detected anomalies.
arXiv Detail & Related papers (2023-12-04T09:40:11Z)
Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal. Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos. This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model [70.97446870672069]
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Video Anomaly Retrieval (VAR) aims to pragmatically retrieve relevant anomalous videos by cross-modalities. We present two benchmarks, UCFCrime-AR and XD-Violence, constructed on top of prevalent anomaly datasets.
arXiv Detail & Related papers (2023-07-24T06:22:37Z)
Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features [8.407188666535506]
Most existing methods use an autoencoder to learn to reconstruct normal videos. We propose an implicit two-path AE (ITAE), a structure in which two encoders implicitly model appearance and motion features. For the complex distribution of normal scenes, we suggest normal density estimation of ITAE features. Normalizing flow (NF) models intensify ITAE performance by learning normality through implicitly learned features.
arXiv Detail & Related papers (2020-10-15T05:02:02Z)
Localizing Anomalies from Weakly-Labeled Videos [45.58643708315132]
We propose a Weakly Supervised Anomaly Localization (WSAL) method focusing on temporally localizing anomalous segments within anomalous videos. Inspired by the appearance difference in anomalous videos, the evolution of adjacent temporal segments is evaluated for the localization of anomalous segments. Our proposed method achieves new state-of-the-art performance on the UCF-Crime and TAD datasets.
arXiv Detail & Related papers (2020-08-20T12:58:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.