Deep Learning for Video Anomaly Detection: A Review
- URL: http://arxiv.org/abs/2409.05383v1
- Date: Mon, 9 Sep 2024 07:31:16 GMT
- Title: Deep Learning for Video Anomaly Detection: A Review
- Authors: Peng Wu, Chengyu Pan, Yuting Yan, Guansong Pang, Peng Wang, Yanning Zhang,
- Abstract summary: Video anomaly detection (VAD) aims to discover behaviors or events deviating from the normality in videos.
In the era of deep learning, a great variety of deep learning based methods are constantly emerging for the VAD task.
This review covers the spectrum of five different categories, namely, semi-supervised, weakly supervised, fully supervised, unsupervised and open-set supervised VAD.
- Score: 52.74513211976795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video anomaly detection (VAD) aims to discover behaviors or events deviating from the normality in videos. As a long-standing task in the field of computer vision, VAD has witnessed much good progress. In the era of deep learning, with the explosion of architectures of continuously growing capability and capacity, a great variety of deep learning based methods are constantly emerging for the VAD task, greatly improving the generalization ability of detection algorithms and broadening the application scenarios. Therefore, such a multitude of methods and a large body of literature make a comprehensive survey a pressing necessity. In this paper, we present an extensive and comprehensive research review, covering the spectrum of five different categories, namely, semi-supervised, weakly supervised, fully supervised, unsupervised and open-set supervised VAD, and we also delve into the latest VAD works based on pre-trained large models, remedying the limitations of past reviews in terms of only focusing on semi-supervised VAD and small model based methods. For the VAD task with different levels of supervision, we construct a well-organized taxonomy, profoundly discuss the characteristics of different types of methods, and show their performance comparisons. In addition, this review involves the public datasets, open-source codes, and evaluation metrics covering all the aforementioned VAD tasks. Finally, we provide several important research directions for the VAD community.
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Video Anomaly Detection in 10 Years: A Survey and Outlook [10.143205531474907]
Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring.
This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches.
arXiv Detail & Related papers (2024-05-29T17:56:31Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Diffusion-based Visual Counterfactual Explanations -- Towards Systematic
Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z) - Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold
Learning [9.171175292808144]
We propose a novel manifold learning algorithm named Rank Flow Embedding (RFE) for unsupervised and semi-supervised scenarios.
RFE computes context-sensitive embeddings, which are refined following a rank-based processing flow.
The generated embeddings can be exploited for more effective unsupervised retrieval or semi-supervised classification.
arXiv Detail & Related papers (2023-04-24T21:02:12Z) - Generalized Video Anomaly Event Detection: Systematic Taxonomy and
Comparison of Deep Models [33.43062232461652]
Video Anomaly Detection (VAD) serves as a pivotal technology in the intelligent surveillance systems.
This survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED)
arXiv Detail & Related papers (2023-02-10T07:11:37Z) - An Overview of Violence Detection Techniques: Current Challenges and
Future Directions [8.978422921103617]
Violence Detection (VD) is used to analyze Big Video data for anomalous actions incurred due to humans.
This paper focuses on overview of deep sequence learning approaches along with localization strategies of the detected violence.
arXiv Detail & Related papers (2022-09-21T12:27:20Z) - An Empirical Study of End-to-End Temporal Action Detection [82.64373812690127]
Temporal action detection (TAD) is an important yet challenging task in video understanding.
Rather than end-to-end learning, most existing methods adopt a head-only learning paradigm.
We validate the advantage of end-to-end learning over head-only learning and observe up to 11% performance improvement.
arXiv Detail & Related papers (2022-04-06T16:46:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.