Future Video Prediction from a Single Frame for Video Anomaly Detection
- URL: http://arxiv.org/abs/2308.07783v1
- Date: Tue, 15 Aug 2023 14:04:50 GMT
- Title: Future Video Prediction from a Single Frame for Video Anomaly Detection
- Authors: Mohammad Baradaran, Robert Bergevin
- Abstract summary: Video anomaly detection (VAD) is an important but challenging task in computer vision.
We introduce the task of future video prediction from a single frame as a novel proxy-task for video anomaly detection.
This proxy-task alleviates the challenges of previous methods in learning longer motion patterns.
- Score: 0.38073142980732994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video anomaly detection (VAD) is an important but challenging task in
computer vision. The main challenge arises from the rarity of training samples
to model all anomaly cases. Hence, semi-supervised anomaly detection methods
have received more attention, since they focus on modeling normals and they
detect anomalies by measuring the deviations from normal patterns. Despite
impressive advances of these methods in modeling normal motion and appearance,
long-term motion modeling has not been effectively explored so far. Inspired by
the abilities of the future frame prediction proxy-task, we introduce the task
of future video prediction from a single frame, as a novel proxy-task for video
anomaly detection. This proxy-task alleviates the challenges of previous
methods in learning longer motion patterns. Moreover, we replace the initial
and future raw frames with their corresponding semantic segmentation maps, which
not only makes the method aware of object class but also makes the prediction
task less complex for the model. Extensive experiments on the benchmark
datasets (ShanghaiTech, UCSD-Ped1, and UCSD-Ped2) show the effectiveness of the
method and the superiority of its performance compared to SOTA prediction-based
VAD methods.
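As a rough illustration of the idea in the abstract (anomalies are detected by measuring deviations from predicted normal patterns, here on semantic segmentation maps), the sketch below scores each frame by how much a predicted label map disagrees with the observed one. It is a minimal sketch under stated assumptions, not the authors' scoring rule: the function names `frame_anomaly_scores` and `normalize_scores`, the pixel-mismatch measure, and the min-max normalization are all hypothetical choices made for this example, and the predictor that would produce the future maps is not shown.

```python
import numpy as np


def frame_anomaly_scores(predicted, observed):
    """Per-frame disagreement between predicted and observed label maps.

    predicted, observed: integer arrays of shape (T, H, W) holding one
    semantic class id per pixel (an assumed input format).
    Returns an array of shape (T,) with the fraction of pixels whose
    class differs in each frame.
    """
    predicted = np.asarray(predicted)
    observed = np.asarray(observed)
    if predicted.shape != observed.shape:
        raise ValueError("predicted and observed must have the same shape")
    mismatch = predicted != observed
    # Flatten each frame and average the boolean mismatch mask.
    return mismatch.reshape(mismatch.shape[0], -1).mean(axis=1)


def normalize_scores(scores):
    """Min-max normalize scores to [0, 1]; constant input maps to zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


# Toy example: three 2x2 frames, all background (class 0) in the observation.
observed = np.zeros((3, 2, 2), dtype=int)
predicted = observed.copy()
predicted[1, 0, 0] = 1   # one wrong pixel in frame 1
predicted[2] = 1         # every pixel wrong in frame 2

scores = normalize_scores(frame_anomaly_scores(predicted, observed))
```

In this toy setup, frames whose observed segmentation departs more from the prediction receive higher scores; a real pipeline would threshold or rank such scores per video.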
Related papers
- Vision-Language Models Assisted Unsupervised Video Anomaly Detection [3.1095294567873606]
Anomaly samples present significant challenges for unsupervised learning methods.
Our method employs a cross-modal pre-trained model that leverages the inferential capabilities of large language models.
By mapping high-dimensional visual features to low-dimensional semantic ones, our method significantly enhances the interpretability of unsupervised anomaly detection.
arXiv Detail & Related papers (2024-09-21T11:48:54Z) - Predicting Long-horizon Futures by Conditioning on Geometry and Time [49.86180975196375]
We explore the task of generating future sensor observations conditioned on the past.
We leverage the large-scale pretraining of image diffusion models which can handle multi-modality.
We create a benchmark for video prediction on a diverse set of videos spanning indoor and outdoor scenes.
arXiv Detail & Related papers (2024-04-17T16:56:31Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Layout Sequence Prediction From Noisy Mobile Modality [53.49649231056857]
Trajectory prediction plays a vital role in understanding pedestrian movement for applications such as autonomous driving and robotics.
Current trajectory prediction models depend on long, complete, and accurately observed sequences from visual modalities.
We propose LTrajDiff, a novel approach that treats objects obstructed or out of sight as equally important as those with fully visible trajectories.
arXiv Detail & Related papers (2023-10-09T20:32:49Z) - A Lightweight Video Anomaly Detection Model with Weak Supervision and Adaptive Instance Selection [14.089888316857426]
This paper focuses on weakly supervised video anomaly detection.
We develop a lightweight video anomaly detection model.
We show that our model can achieve an AUC score comparable or even superior to that of the state-of-the-art methods.
arXiv Detail & Related papers (2023-10-09T01:23:08Z) - Exploring Diffusion Models for Unsupervised Video Anomaly Detection [17.816344808780965]
This paper investigates the performance of diffusion models for video anomaly detection (VAD).
Experiments performed on two large-scale anomaly detection datasets demonstrate the consistent improvement of the proposed method over the state-of-the-art generative models.
This is the first study using a diffusion model to present guidance for examining VAD in surveillance scenarios.
arXiv Detail & Related papers (2023-04-12T13:16:07Z) - Spatio-temporal predictive tasks for abnormal event detection in videos [60.02503434201552]
We propose new constrained pretext tasks to learn object level normality patterns.
Our approach consists in learning a mapping between down-scaled visual queries and their corresponding normal appearance and motion characteristics.
Experiments on several benchmark datasets demonstrate the effectiveness of our approach to localize and track anomalies.
arXiv Detail & Related papers (2022-10-27T19:45:12Z) - Multi-Contextual Predictions with Vision Transformer for Video Anomaly Detection [22.098399083491937]
Understanding the temporal context of a video plays a vital role in anomaly detection.
We design a transformer model with three different contextual prediction streams: masked, whole and partial.
By learning to predict the missing frames of consecutive normal frames, our model can effectively learn various normality patterns in the video.
arXiv Detail & Related papers (2022-06-17T05:54:31Z) - Object Class Aware Video Anomaly Detection through Image Translation [1.2944868613449219]
This paper proposes a novel two-stream object-aware VAD method that learns the normal appearance and motion patterns through image translation tasks.
The results show that, as a significant improvement over previous methods, detections by our method are completely explainable and anomalies are localized accurately in the frames.
arXiv Detail & Related papers (2022-05-03T18:04:27Z) - Object-centric and memory-guided normality reconstruction for video anomaly detection [56.64792194894702]
This paper addresses the anomaly detection problem for video surveillance.
Due to the inherent rarity and heterogeneity of abnormal events, the problem is viewed as a normality modeling strategy.
Our model learns object-centric normal patterns without seeing anomalous samples during training.
arXiv Detail & Related papers (2022-03-07T19:28:39Z) - Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)